
I'll try to explain the title better with an example.

Table 1 example

Id  text
1   lorem ipsum doe
2   foo bar lorem ipsum jhon
3   bla bla ipsum tommy

Table 2 example

Id  fullname  name  surname  keyword
1   jhon doe  jhon  doe      jhon
2   tom asd   tom   asd      tom
3   sam frf   sam   frr      sam

Expected result table, using LIKE or REGEXP?

fullname  count(*)
jhon doe  2
tom asd   1
sam frf   0

Thanks a lot!


2 Answers


The simplest approach is to use REGEXP.

SELECT fullname, 
       Count(t1.id) 
FROM   table1 t1 
       RIGHT JOIN table2 t2 
               ON t1.text REGEXP t2.keyword 
GROUP  BY fullname 


I used a RIGHT JOIN so that you get a zero for sam (otherwise that row would be dropped from the result).
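To make the effect of the outer join concrete, here is a minimal sketch that runs the same kind of query against the sample rows from the question. It is not the MySQL setup from the thread: it assumes Python's built-in sqlite3 module, swaps REGEXP for a LIKE substring match with `||` concatenation, and uses a LEFT JOIN starting from table2, which should be equivalent to the RIGHT JOIN above with the table order swapped. The helper name `count_matches` is made up for illustration.

```python
import sqlite3

def count_matches(join_kind):
    """Count table1 rows whose text contains each table2 keyword.

    join_kind is "INNER" or "LEFT" (hypothetical helper, SQLite stand-in
    for the MySQL query in this answer).
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER, text TEXT);
        CREATE TABLE table2 (id INTEGER, fullname TEXT, keyword TEXT);
        INSERT INTO table1 VALUES
            (1, 'lorem ipsum doe'),
            (2, 'foo bar lorem ipsum jhon'),
            (3, 'bla bla ipsum tommy');
        INSERT INTO table2 VALUES
            (1, 'jhon doe', 'jhon'),
            (2, 'tom asd', 'tom'),
            (3, 'sam frf', 'sam');
    """)
    # table2 is the preserved side, so unmatched names survive a LEFT JOIN.
    query = f"""
        SELECT t2.fullname, COUNT(t1.id)
        FROM table2 t2
        {join_kind} JOIN table1 t1
            ON t1.text LIKE '%' || t2.keyword || '%'
        GROUP BY t2.fullname
    """
    rows = dict(conn.execute(query).fetchall())
    conn.close()
    return rows

inner = count_matches("INNER")  # sam has no matching text, so the row vanishes
outer = count_matches("LEFT")   # sam is kept, and COUNT(t1.id) over NULLs is 0
print(inner)
print(outer)
```

COUNT(t1.id) rather than COUNT(*) matters here: the unmatched sam row still exists in the outer join, so COUNT(*) would report 1 instead of 0.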

answered 2013-05-10T21:14:41.537

Some performance tests with my real data:

t1 => 100,000 rows and growing

t2 => 207 rows

Test 1

SELECT 
    t2.fullname,
    count(t1.id) AS total
FROM
    table_1 AS t1
        RIGHT JOIN
    table_2 AS t2 ON t1.text REGEXP t2.keyword
GROUP BY t2.fullname
ORDER BY total DESC

212 seconds

Test 2

SELECT 
    t2.fullname,
    count(t1.id) AS total
FROM
    table_1 AS t1
        RIGHT JOIN
    table_2 AS t2 ON t1.text LIKE CONCAT('%', t2.keyword, '%')
GROUP BY t2.fullname
ORDER BY total DESC

30 seconds

Test 3

SELECT 
    t2.fullname,
    count(t1.id) AS total
FROM
    table_1 AS t1
        RIGHT JOIN
    table_2 AS t2 ON t1.text LIKE lower(CONCAT('%', t2.name, '%')) AND t1.text LIKE lower(CONCAT('%', t2.surname, '%'))
GROUP BY t2.fullname
ORDER BY total DESC

32 seconds

Test 4

SELECT 
    t2.fullname,
    count(t1.id) AS total
FROM
    table_1 AS t1
        RIGHT JOIN
    table_2 AS t2 ON t1.text LIKE lower(CONCAT('%', t2.name, '%')) OR t1.text LIKE lower(CONCAT('%', t2.surname, '%'))
GROUP BY t2.fullname
ORDER BY total DESC

40 seconds

Test 5

SELECT 
    t2.fullname,
    count(t1.id) as total
FROM
    table_1 as t1
        RIGHT JOIN
    table_2 as t2 ON t1.text LIKE CONCAT('%', t2.keyword, '%') OR (t1.text LIKE lower(CONCAT('%', t2.name, '%')) AND t1.text LIKE lower(CONCAT('%', t2.surname, '%')))
GROUP BY t2.fullname
ORDER BY total DESC

41 seconds

I would go with test 5: the best compromise between results and performance.
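The timings only cover speed, so as a rough cross-check of the results side, the sketch below applies the join conditions from tests 2 to 5 to the three sample rows from the question. It is plain Python rather than MySQL, and the helper `count` and the variable names are made up for illustration. Python's `in` is case-sensitive, which makes no difference on this all-lowercase sample but may differ from LIKE on real data, depending on the column collation.

```python
# Sample rows copied from the question.
texts = [
    "lorem ipsum doe",
    "foo bar lorem ipsum jhon",
    "bla bla ipsum tommy",
]
people = [
    # (fullname, name, surname, keyword)
    ("jhon doe", "jhon", "doe", "jhon"),
    ("tom asd", "tom", "asd", "tom"),
    ("sam frf", "sam", "frr", "sam"),
]

def count(predicate):
    """Per fullname, count the texts for which predicate(text, person) holds."""
    return {p[0]: sum(1 for t in texts if predicate(t, p)) for p in people}

# Substring checks standing in for LIKE CONCAT('%', x, '%').
keyword_only = count(lambda t, p: p[3] in t)                    # test 2
name_and_surname = count(lambda t, p: p[1] in t and p[2] in t)  # test 3
name_or_surname = count(lambda t, p: p[1] in t or p[2] in t)    # test 4
combined = count(
    lambda t, p: p[3] in t or (p[1] in t and p[2] in t)         # test 5
)

for label, result in [
    ("keyword only", keyword_only),
    ("name AND surname", name_and_surname),
    ("name OR surname", name_or_surname),
    ("keyword OR (name AND surname)", combined),
]:
    print(label, result)
```

On this tiny sample, only the name OR surname condition (test 4) gives jhon doe the count of 2 shown in the question's expected table; the test 5 condition gives 1. Three rows say little about 100,000, so it is worth spot-checking a few known names in the real data before settling on a condition.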

Any further suggestions?

Thanks again for your help!

answered 2013-05-10T23:06:43.690