
I have a table with 4 entries.

    CREATE TABLE tab (
        name TEXT
    );

    INSERT INTO "tab" VALUES('Intertek');
    INSERT INTO "tab" VALUES('Pntertek');
    INSERT INTO "tab" VALUES('Ontertek');
    INSERT INTO "tab" VALUES('ZTPay');

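Pntertek and Ontertek are fuzzy duplicates of the correctly spelled Intertek. I want to create a list containing the fuzzy duplicates together with the correctly spelled name.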

Since I have 4 names, I have 4 search terms:

    SELECT name FROM tab WHERE name LIKE '%ntertek' 
    AND (SELECT COUNT(*) FROM tab WHERE name LIKE '%ntertek') >1;
    SELECT name FROM tab WHERE name LIKE '%ntertek' 
    AND (SELECT COUNT(*) FROM tab WHERE name LIKE '%ntertek') >1;
    SELECT name FROM tab WHERE name LIKE '%ntertek' 
    AND (SELECT COUNT(*) FROM tab WHERE name LIKE '%ntertek') >1;
    SELECT name FROM tab WHERE name LIKE '%TPay' 
    AND (SELECT COUNT(*) FROM tab WHERE name LIKE '%TPay') >1;

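This produces 3 lists containing the same information. If the first statement returns a result, I would like to skip the second and third identical SELECT statements. Is this possible in SQLite? How would I do it?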

I'm a beginner with SQLite and with programming in general, so any help would be much appreciated.

Thanks in advance.


1 Answer


What do you want the query to return? Just the potential duplicates? If so, you could do the above in a single query by including a HAVING clause. However, the method you are using at the moment only allows for differences at the start of the name. I would suggest looking into an edit-distance algorithm (often called Levenshtein distance), which counts the number of single-character changes needed to make one value the same as another.
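As a rough illustration of the single-query idea, here is a minimal sketch using Python's built-in `sqlite3` module with the table from the question. It assumes, as the original LIKE patterns do, that variants differ only in their first character, so `substr(name, 2)` is used as the grouping key. That key is my assumption for the example, not something SQLite requires.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (name TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?)",
    [("Intertek",), ("Pntertek",), ("Ontertek",), ("ZTPay",)],
)

# Assumption: names that are duplicates share everything after the
# first character. Group on that remainder and keep only the groups
# with more than one member, then list every name in those groups.
query = """
    SELECT name
    FROM tab
    WHERE substr(name, 2) IN (
        SELECT substr(name, 2)
        FROM tab
        GROUP BY substr(name, 2)
        HAVING COUNT(*) > 1
    )
    ORDER BY name
"""
duplicates = [row[0] for row in conn.execute(query)]
print(duplicates)
```

Because the grouping happens once over the whole table, each set of duplicates is reported a single time, and there is no need to run one SELECT per name.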

There are details of a possible SQLite implementation of edit distance at the following link: http://www.sqlite.org/spellfix1.html
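If loading an extension is not an option, one alternative is to register your own edit-distance function from the host language. The sketch below does this with Python's `sqlite3` module. The function name `levenshtein` and the threshold of 1 edit are assumptions chosen for this example, and the code does not use the spellfix1 extension itself.

```python
import sqlite3


def levenshtein(a, b):
    """Dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under a hypothetical name.
conn.create_function("levenshtein", 2, levenshtein)
conn.execute("CREATE TABLE tab (name TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?)",
    [("Intertek",), ("Pntertek",), ("Ontertek",), ("ZTPay",)],
)

# Self-join; a.name < b.name reports each pair once and skips self-pairs.
# Assumed threshold: names at most 1 edit apart count as fuzzy duplicates.
pairs = conn.execute("""
    SELECT a.name, b.name
    FROM tab AS a
    JOIN tab AS b ON a.name < b.name
    WHERE levenshtein(a.name, b.name) <= 1
    ORDER BY a.name, b.name
""").fetchall()
print(pairs)
```

Unlike the LIKE approach, this also catches differences in the middle or at the end of a name. Note that the self-join compares every pair of rows, so it may get slow on large tables.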

Answered 2013-07-09T13:11:16.233