mysql - Optimizing a large SQL OR query

Question

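I'm trying to run a query that finds any matches between multiple phone number columns on two tables. It takes far too long (> 5 minutes), and that is with the data already filtered as much as possible. I have copied the columns I actually need to search from both tables into their own tables, just to cut down the total number of rows.

This is from a legacy application that I inherited.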

Query

select count(b.bid) 
from customers_with_phone c,buyers_orders_with_phone b 
where 
   (b.hphone=c.pprim or b.hphone=c.phome or b.hphone=c.pwork or b.hphone=c.pother) 
or (b.wphone=c.pprim or b.wphone=c.phome or b.wphone=c.pwork or b.wphone=c.pother) 
or (b.cphone=c.pprim or b.cphone=c.phome or b.cphone=c.pwork or b.cphone=c.pother) 
group by b.bid;

Tables

mysql> show columns from customers_with_phone;
+--------+---------+------+-----+---------+-------+
| Field  | Type    | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+-------+
| pnum   | int(11) | YES  |     | NULL    |       |
| pprim  | text    | YES  |     | NULL    |       |
| phome  | text    | YES  |     | NULL    |       |
| pwork  | text    | YES  |     | NULL    |       |
| pother | text    | YES  |     | NULL    |       |
+--------+---------+------+-----+---------+-------+

mysql> show columns from buyers_orders_with_phone;
+--------+------+------+-----+---------+-------+
| Field  | Type | Null | Key | Default | Extra |
+--------+------+------+-----+---------+-------+
| bid    | text | YES  |     | NULL    |       |
| hphone | text | YES  |     | NULL    |       |
| wphone | text | YES  |     | NULL    |       |
| cphone | text | YES  |     | NULL    |       |
+--------+------+------+-----+---------+-------+

Explain

+----+-------------+-------+------+---------------+------+---------+------+-------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra                                        |
+----+-------------+-------+------+---------------+------+---------+------+-------+----------+----------------------------------------------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |  8673 |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL | 75931 |   100.00 | Using where; Using join buffer               |
+----+-------------+-------+------+---------------+------+---------+------+-------+----------+----------------------------------------------+

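I realize that neither table has a primary key. That is because they contain only the columns I need to search, which I extracted from the original tables. Using the original tables takes even longer, since there is much more data to filter through.

I have other queries similar to this one that work with far more data, so if I can get this query to run in a reasonable time, I can make the others work the same way.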

score 0 · Accepted Answer

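The legacy query is awful, sorry. It is a full Cartesian product: going by the row estimates in your EXPLAIN output, that is about 8,673 × 75,931 ≈ 658 million row pairs, each checked against up to 12 comparisons.

The data structure cannot handle this kind of query efficiently. You have 3 phone fields in one table and 4 in the other, and you are trying to count whether any pair of them matches.

A primary key plus an index on each phone column might improve this query, though I'm not sure it will, and the extra indexes would make delete/insert/update performance worse.

By the way, you wrote that it is impossible to index a nullable column. That is not correct.

I can only see radical solutions working here: changing the data structure, or adding some kind of caching mechanism maintained by triggers. But that is hard.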
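One possible shape for such a data structure change, as a rough sketch only: unpivot every phone column into a narrow table with one row per (owner, phone number), so the 12-way OR becomes a single indexed equality join. The table names `customer_phones` and `buyer_phones`, the `VARCHAR` lengths, and the index names are assumptions for illustration and do not come from the question.

```sql
-- Hypothetical lookup tables: one row per (owner, phone number).
-- VARCHAR(32) assumes no stored phone number exceeds 32 characters;
-- VARCHAR(64) for bid is likewise an assumption.
CREATE TABLE customer_phones (
    pnum  INT,
    phone VARCHAR(32) NOT NULL,
    INDEX idx_customer_phone (phone)
);

CREATE TABLE buyer_phones (
    bid   VARCHAR(64),
    phone VARCHAR(32) NOT NULL,
    INDEX idx_buyer_phone (phone)
);

-- UNION (not UNION ALL) drops duplicates, e.g. the same number
-- stored in both pprim and phome for one customer.
INSERT INTO customer_phones (pnum, phone)
    SELECT pnum, pprim  FROM customers_with_phone WHERE pprim  IS NOT NULL
    UNION
    SELECT pnum, phome  FROM customers_with_phone WHERE phome  IS NOT NULL
    UNION
    SELECT pnum, pwork  FROM customers_with_phone WHERE pwork  IS NOT NULL
    UNION
    SELECT pnum, pother FROM customers_with_phone WHERE pother IS NOT NULL;

INSERT INTO buyer_phones (bid, phone)
    SELECT bid, hphone FROM buyers_orders_with_phone WHERE hphone IS NOT NULL
    UNION
    SELECT bid, wphone FROM buyers_orders_with_phone WHERE wphone IS NOT NULL
    UNION
    SELECT bid, cphone FROM buyers_orders_with_phone WHERE cphone IS NOT NULL;

-- For each buyer order, count the distinct customers that share
-- at least one phone number with it.
SELECT bp.bid, COUNT(DISTINCT cp.pnum) AS matching_customers
FROM buyer_phones bp
JOIN customer_phones cp ON cp.phone = bp.phone
GROUP BY bp.bid;
```

The counts should agree with the original query only if `bid` and `pnum` each identify a single row in their source tables, so it is worth comparing results on a sample first. Keeping the two lookup tables current as the source data changes is the part that would need triggers or a periodic rebuild.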

score 0 · Accepted Answer

A primary key is not an optimization. What you need are non-clustered indexes on your telephone text fields (one index per column). With these, you won't need to extract your data into separate tables.
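A minimal sketch of one index per column on the tables from the question. Because the phone columns are `text`, MySQL requires an explicit prefix length on each index. The prefix length of 20 and the index names below are assumptions, not something stated in the question.

```sql
-- TEXT columns can only be indexed with an explicit prefix length.
-- The prefix length (20) and all index names are illustrative assumptions.
ALTER TABLE customers_with_phone
    ADD INDEX idx_c_pprim  (pprim(20)),
    ADD INDEX idx_c_phome  (phome(20)),
    ADD INDEX idx_c_pwork  (pwork(20)),
    ADD INDEX idx_c_pother (pother(20));

ALTER TABLE buyers_orders_with_phone
    ADD INDEX idx_b_hphone (hphone(20)),
    ADD INDEX idx_b_wphone (wphone(20)),
    ADD INDEX idx_b_cphone (cphone(20));
```

Re-running `EXPLAIN` on the original query afterwards would show whether the optimizer actually uses these indexes. With an `OR` spanning several columns of a join it may still fall back to a full scan, so the result is worth checking rather than assuming.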
