I have a table with the following structure:
+-----+-------------------+
| ID  | Name              |
+-----+-------------------+
| 1   | abc               |
+-----+-------------------+
| 2   | abc (duplicate)   |
+-----+-------------------+
| 3   | bcd               |
+-----+-------------------+
| 4   | bcd (duplicate)   |
+-----+-------------------+
| 5   | bcd (duplicate)   |
+-----+-------------------+
| 6   | efg               |
+-----+-------------------+
| 7   | hij               |
+-----+-------------------+
I need to count the occurrences of each Name, with the "(duplicate)" rows included, i.e.:
+-------------------+--------+
| Name              | Count  |
+-------------------+--------+
| abc               | 2      |
+-------------------+--------+
| bcd               | 3      |
+-------------------+--------+
| efg               | 1      |
+-------------------+--------+
| hij               | 1      |
+-------------------+--------+
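I should mention that the Name column is actually of type TINYTEXT. There will be a lot of rows: there are already 5396 in test mode. I tried a self-join on the table, grouping by TRIM(REPLACE(Name, '(duplicate)', '')):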
SELECT
DISTINCT TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) as `name`,
COUNT(`s`.`ID`) as `count`
FROM
`Table` as `t` INNER JOIN `Table` as `s` ON
TRIM(REPLACE(`t`.`Name`, '(duplicate)', '')) LIKE TRIM(REPLACE(`s`.`Name`, '(duplicate)', ''))
GROUP BY 1;
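And... well, on my development machine this returns 4846 rows and takes 122.62 seconds (?!).
Q1: Is this a correct approach?
Q2: Is there any way to make it faster?
Any help would be appreciated.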
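
For comparison, here is a minimal sketch of a single-pass alternative, assuming MySQL and the seven sample rows from the first table. The alias base_name and the sample CREATE TABLE are illustrative assumptions, not part of the original schema. One thing worth noting about the self-join: it pairs every row with every row that shares its normalized name, so a group of n rows produces n * n joined rows, and COUNT(s.ID) would report 4 for abc rather than 2. Grouping directly on the normalized expression avoids the join entirely:

```sql
-- Hypothetical sample data mirroring the table in the question.
CREATE TABLE `Table` (
  `ID`   INT PRIMARY KEY,
  `Name` TINYTEXT
);

INSERT INTO `Table` (`ID`, `Name`) VALUES
  (1, 'abc'),
  (2, 'abc (duplicate)'),
  (3, 'bcd'),
  (4, 'bcd (duplicate)'),
  (5, 'bcd (duplicate)'),
  (6, 'efg'),
  (7, 'hij');

-- One pass over the table: strip the '(duplicate)' marker, trim the
-- leftover space, and group on the normalized value. No self-join.
SELECT
  TRIM(REPLACE(`Name`, '(duplicate)', '')) AS `base_name`,
  COUNT(*) AS `count`
FROM `Table`
GROUP BY TRIM(REPLACE(`Name`, '(duplicate)', ''));
```

On the sample rows this should give abc 2, bcd 3, efg 1, hij 1, matching the desired output. How it performs on the full table has not been measured here; it reads each row once, whereas the self-join has to evaluate a LIKE comparison on computed values for every pair of rows.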