
Here's the problem:

I have a table in PostgreSQL with addresses stored as plain text and as tsvectors, and I'm trying to find an address record with a query like this:

SELECT * FROM address_catalog
WHERE address_catalog.search_vector @@ to_tsquery('123456:* & Klingon:* & Empire:* & Kronos:* & city:* & Matrok:* & street:* & 789:*')

The problem is that I know nothing about the structure of the address in the incoming string. I can't tell where the country, city or street is, I don't know what order the words come in, and I don't know whether the string contains extra words.

The catalog only lets me match countries and cities, so if the incoming string contains a street, a postal index or anything else, the search returns nothing, because all the tokens in the tsquery are joined with AND. At the same time, I can't simply delete parts of the string or use OR for them, because I never know where in the string the extra words are.

So, is there any way to construct a tsquery that returns the best matches for the incoming string? Or maybe partial matches? When I tried using OR instead of AND everywhere in the tsquery, it returned nearly the whole database. What I need is something like an intersection of vectors, in PostgreSQL.


1 Answer


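I suggest using the smlar (PDF) extension for this. It was written by the same people who wrote text search. It lets you use the TF-IDF similarity measure, which allows for "extraneous" query words.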
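To show why a TF-IDF measure copes with extraneous words, here is a minimal Python sketch of the idea. It is not smlar's implementation or API: the function names, the toy catalog, the whitespace tokenizer, the `log(n/df) + 1` weighting and the choice to ignore query tokens that never occur in the catalog are all assumptions made for illustration.

```python
import math
from collections import Counter


def idf_weights(documents):
    """Inverse document frequency (plus one) for every token in the catalog."""
    n = len(documents)
    df = Counter(tok for doc in documents for tok in set(doc))
    return {tok: math.log(n / count) + 1.0 for tok, count in df.items()}


def tfidf_vector(tokens, idf):
    """Weight each token by term count * IDF; tokens unknown to the catalog are dropped."""
    counts = Counter(tokens)
    return {tok: c * idf[tok] for tok, c in counts.items() if tok in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[tok] for tok, w in a.items() if tok in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_matches(query, catalog, limit=3):
    """Return up to `limit` (score, address) pairs with a non-zero score, best first."""
    docs = [address.lower().split() for address in catalog]
    idf = idf_weights(docs)
    q = tfidf_vector(query.lower().split(), idf)
    scored = [(cosine(q, tfidf_vector(doc, idf)), address)
              for doc, address in zip(docs, catalog)]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:limit]


# Hypothetical catalog that, like the one in the question, only knows countries and cities.
catalog = [
    "Klingon Empire Kronos",
    "Klingon Empire Boreth",
    "Federation Earth Paris",
]
results = best_matches("123456 Klingon Empire Kronos city Matrok street 789", catalog)
for score, address in results:
    print(f"{score:.3f}  {address}")
```

An all-AND tsquery built from the same string would match nothing here, because tokens such as `123456` or `Matrok` occur in no catalog row. With a similarity score, the row sharing the most (and the rarest) tokens comes first, and rows sharing nothing are dropped, so you don't get the whole table back.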

Here is how to compile it (I haven't figured out how to compile it on Windows):

http://blog.databasepatterns.com/2014/07/postgresql-install-smlar-extension.html

Here is how to use it:

http://blog.databasepatterns.com/2014/08/tf-idf-text-search-in-postgres.html

Answered 2016-04-29T18:47:16.570