I have a couple of tables (products and suppliers) and want to find out which items are no longer listed in the supplier table.

Table uc_products holds the products. Table uc_supplier_csv holds the supplier stock. uc_products.model joins to uc_supplier_csv.sku.

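When trying to identify the stock in the products table that is not referenced in the supplier table, I am seeing very long-running queries. I only want to extract the nid of the entries where sid IS NULL, so that I can identify which items have no supplier.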

For the first query below, the database server (4GB RAM / 2x 2.4GHz Intel) takes an hour to return a result (507 rows). I did not wait for the second query to finish.

How can I make this query more efficient? Is it slow because the character sets do not match?

I was thinking that the following would be the most efficient SQL to use:

         SELECT nid, sid 
           FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
             ON p.model = c.sku
         WHERE sid IS NULL ;

For this query, I get the following EXPLAIN result:

mysql> EXPLAIN SELECT nid, sid FROM uc_products p LEFT OUTER JOIN uc_supplier_csv c ON p.model = c.sku WHERE sid IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra                   |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
|  1 | SIMPLE      | p     | ALL  | NULL          | NULL | NULL    | NULL |   6526 |                         | 
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL | 126639 | Using where; Not exists | 
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
2 rows in set (0.00 sec)

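I would have thought that the keys idx_sku and idx_model would be usable here, but they are not used. Is that because the tables' default character sets do not match? One is UTF-8, the other is latin1.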

I have also considered this form:

SELECT nid 
  FROM uc_products 
 WHERE model 
NOT IN ( 
         SELECT DISTINCT sku FROM uc_supplier_csv 
       ) ;

EXPLAIN shows the following result for that query:

mysql> explain select nid from uc_products where model not in ( select sku from uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table           | type  | possible_keys         | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | uc_products     | ALL   | NULL                  | NULL    | NULL    | NULL |   6520 | Using where              | 
|  2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

Just so that I do not leave anything out, here are some more exciting details: the table sizes and statistics, and the table structures :)

mysql> show table status where Name in ( 'uc_supplier_csv', 'uc_products' ) ;
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name            | Engine | Version | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| uc_products     | MyISAM |      10 | Dynamic    |   6520 |             89 |      585796 | 281474976710655 |       232448 |       912 |           NULL | 2009-04-24 11:03:15 | 2009-10-12 14:23:43 | 2009-04-24 11:03:16 | utf8_general_ci   |     NULL |                |         | 
| uc_supplier_csv | MyISAM |      10 | Dynamic    | 126639 |             26 |     3399704 | 281474976710655 |      5864448 |         0 |           NULL | 2009-10-12 14:28:25 | 2009-10-12 14:28:25 | 2009-10-12 14:28:27 | latin1_swedish_ci |     NULL |                |         | 
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+

CREATE TABLE `uc_products` (
  `vid` mediumint(9) NOT NULL default '0',
  `nid` mediumint(9) NOT NULL default '0',
  `model` varchar(255) NOT NULL default '',
  `list_price` decimal(10,2) NOT NULL default '0.00',
  `cost` decimal(10,2) NOT NULL default '0.00',
  `sell_price` decimal(10,2) NOT NULL default '0.00',
  `weight` float NOT NULL default '0',
  `weight_units` varchar(255) NOT NULL default 'lb',
  `length` float unsigned NOT NULL default '0',
  `width` float unsigned NOT NULL default '0',
  `height` float unsigned NOT NULL default '0',
  `length_units` varchar(255) NOT NULL default 'in',
  `pkg_qty` smallint(5) unsigned NOT NULL default '1',
  `default_qty` smallint(5) unsigned NOT NULL default '1',
  `unique_hash` varchar(32) NOT NULL,
  `ordering` tinyint(2) NOT NULL default '0',
  `shippable` tinyint(2) NOT NULL default '1',
  PRIMARY KEY  (`vid`),
  KEY `idx_model` (`model`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 

CREATE TABLE `uc_supplier_csv` (
  `sid` int(10) unsigned NOT NULL default '0',
  `sku` varchar(255) default NULL,
  `stock` int(10) unsigned NOT NULL default '0',
  `list_price` decimal(8,2) default '0.00',
  KEY `idx_sku` (`sku`),
  KEY `idx_stock` (`stock`),
  KEY `idx_sku_stock` (`sku`,`stock`),
  KEY `idx_sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 

Edit: adding query plans for a couple of the queries suggested by Martin below:

mysql> explain SELECT nid FROM uc_products p WHERE NOT EXISTS ( SELECT 1 FROM uc_supplier_csv c WHERE p.model = c.sku ) ;
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | p     | ALL   | NULL          | NULL    | NULL    | NULL |   6526 | Using where              | 
|  2 | DEPENDENT SUBQUERY | c     | index | NULL          | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

mysql> explain SELECT nid FROM uc_products WHERE model NOT IN ( SELECT sku  FROM uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type        | table           | type  | possible_keys         | key     | key_len | ref  | rows   | Extra                    |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
|  1 | PRIMARY            | uc_products     | ALL   | NULL                  | NULL    | NULL    | NULL |   6526 | Using where              | 
|  2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258     | NULL | 126639 | Using where; Using index | 
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)

1 Answer

Perhaps try using NOT EXISTS rather than counting? For example:

SELECT nid 
  FROM uc_products p
 WHERE NOT EXISTS ( 
       SELECT 1 
         FROM uc_supplier_csv c
        WHERE p.model = c.sku
       )

SO user Quassnoi has a brief article outlining some tests which suggest that this may also be worth a try:

SELECT nid 
  FROM uc_products
 WHERE model NOT IN ( 
       SELECT sku 
       FROM uc_supplier_csv
       )

This is basically your original query, without the DISTINCT.
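
One caveat with the NOT IN form: sku is declared `default NULL` in your uc_supplier_csv definition, and if the subquery returns even one NULL, NOT IN never evaluates to true, so the outer query returns no rows at all. A minimal sketch of a guard against that, assuming supplier rows with a NULL sku can simply be ignored:

```sql
-- Sketch: the same NOT IN query, with NULL skus filtered out of the subquery.
-- Assumption: rows in uc_supplier_csv whose sku is NULL can be ignored.
SELECT nid
  FROM uc_products
 WHERE model NOT IN (
       SELECT sku
         FROM uc_supplier_csv
        WHERE sku IS NOT NULL
       );
```

The NOT EXISTS form is not affected by this, because a NULL sku never satisfies p.model = c.sku.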

Another one for you, Chris, this time giving the cross-encoding join a helping hand:

SELECT nid
  FROM uc_products p
 WHERE NOT EXISTS (
       SELECT 1
       FROM uc_supplier_csv c
       WHERE CONVERT( p.model USING latin1 )  = c.sku
       )
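
If changing the schema is an option, the conversion could instead be done once, on the column, rather than in every query. This is only a rough sketch: it assumes the supplier data is safe to store as utf8, and that the character set mismatch really is what stops idx_sku from being used for lookups, which is worth confirming with EXPLAIN afterwards.

```sql
-- Assumption: converting sku to utf8 is acceptable for this data.
-- Re-declare the column with the same character set and collation
-- as uc_products.model (utf8 / utf8_general_ci).
ALTER TABLE uc_supplier_csv
  MODIFY sku VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL;

-- Re-check the plan. Ideally table c now shows a key-based lookup on
-- idx_sku (type "ref") instead of a full scan (type "ALL").
EXPLAIN
SELECT p.nid
  FROM uc_products p
  LEFT OUTER JOIN uc_supplier_csv c
    ON p.model = c.sku
 WHERE c.sid IS NULL;
```

If the supplier table is reloaded from CSV regularly, the column definition would need to survive that reload for the change to stick.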
Answered 2009-10-12T07:26:47.477