
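I have a products table with a score column, which has a B-Tree index on it. I have a query that returns the products that have not yet been shown to the user in the current session. I can't just use plain LIMIT pagination, because the results should be ordered by the score column, and that can change between query calls.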

My current solution looks like this:

SELECT * 
FROM products p 
LEFT JOIN product_seen ps 
  ON (ps.session_id = ? AND p.product_id = ps.product_id )
WHERE ps.product_id is null
ORDER BY p.score DESC
LIMIT 30;

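This works fine for the first few pages, but the response time grows linearly with the number of products already shown in the session, and reaches the one-second mark by the time that number is around 300. Is there a way to fix this in MySQL? Or should I approach the problem in a completely different way?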


Edit: Here are the two tables:

CREATE TABLE `products` (
 `product_id` int(15) NOT NULL AUTO_INCREMENT,
 `shop` varchar(15) NOT NULL,
 `shop_id` varchar(25) NOT NULL,
 `shop_category_id` varchar(20) DEFAULT NULL,
 `shop_subcategory_id` varchar(20) DEFAULT NULL,
 `shop_designer_id` varchar(20) DEFAULT NULL,
 `shop_designer_name` varchar(40) NOT NULL,
 `created_at` timestamp NULL DEFAULT NULL,
 `product_url` varchar(255) NOT NULL,
 `name` varchar(255) NOT NULL,
 `description` mediumtext NOT NULL,
 `price_cents` int(10) NOT NULL,
 `list_image_url` varchar(255) NOT NULL,
 `list_image_height` int(4) NOT NULL,
 `ending` timestamp NULL DEFAULT NULL,
 `category_id` int(5) NOT NULL,
 `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `included_at` timestamp NULL DEFAULT NULL,
 `hearts` int(5) NOT NULL,
 `score` decimal(10,5) NOT NULL,
 `rand_field` decimal(16,15) NOT NULL,
 `last_score_update` timestamp NULL DEFAULT NULL,
 `active` tinyint(1) NOT NULL DEFAULT '0',
 PRIMARY KEY (`product_id`),
 UNIQUE KEY `unique_shop_id` (`shop`,`shop_id`),
 KEY `score_index` (`active`,`score`),
 KEY `included_at_index` (`included_at`),
 KEY `active_category_score` (`active`,`category_id`,`score`),
 KEY `active_category` (`active`,`category_id`,`product_id`),
 KEY `active_products` (`active`,`product_id`),
 KEY `active_rand` (`active`,`rand_field`),
 KEY `active_category_rand` (`active`,`category_id`,`rand_field`)
) ENGINE=InnoDB AUTO_INCREMENT=55985 DEFAULT CHARSET=utf8

CREATE TABLE `product_seen` (
 `seenby_id` int(20) NOT NULL AUTO_INCREMENT,
 `session_id` varchar(25) NOT NULL,
 `product_id` int(15) NOT NULL,
 `last_seen` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `sorting` varchar(10) NOT NULL,
 `in_category` int(3) DEFAULT NULL,
 PRIMARY KEY (`seenby_id`),
 KEY `last_seen_index` (`last_seen`),
 KEY `session_id` (`session_id`,`seenby_id`),
 KEY `session_id_2` (`session_id`,`sorting`,`seenby_id`)
) ENGINE=InnoDB AUTO_INCREMENT=17431 DEFAULT CHARSET=utf8


Edit 2:
The query above is a simplification; here is the real query, with EXPLAIN:

EXPLAIN SELECT 
    DISTINCT p.product_id AS id, 
    p.list_image_url AS image, 
    p.list_image_height AS list_height, 
    hearts, 
    active AS available, 
    (UNIX_TIMESTAMP( ) - ulp.last_action) AS last_loved
FROM `looksandgoods`.`products` p
LEFT JOIN `looksandgoods`.`user_likes_products` ulp 
ON ( p.product_id = ulp.product_id AND ulp.user_id =1 )
LEFT JOIN `looksandgoods`.`product_seen` sb 
ON (sb.session_id = 'y7lWunZKKABgMoDgzjwDjZw1' 
    AND sb.sorting = 'trend'
    AND p.product_id = sb.product_id )
WHERE p.active =1
AND sb.product_id IS NULL
ORDER BY p.score DESC
LIMIT 30 ;


The EXPLAIN output still shows a temporary table and a filesort, even though keys for the joins exist:

+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys                                                                                      | key              | key_len | ref                              | rows | Extra                                        |
+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | p     | range | score_index,active_category_score,active_category,active_products,active_rand,active_category_rand | score_index      | 1       | NULL                             | 2299 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ulp   | ref   | love_count_index,user_to_product_index,product_id                                                  | love_count_index | 9       | looksandgoods.p.product_id,const |    1 |                                              |
|  1 | SIMPLE      | sb    | ref   | session_id,session_id_2                                                                            | session_id       | 77      | const                            |  711 | Using where; Not exists; Distinct            |
+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+

1 Answer


New answer

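I think the problem with the real query is the DISTINCT clause. It implies that one or both of the product_seen and user_likes_products tables can join multiple rows for each product_id that could appear in the result set (which is a little disturbing, given the lack of UNIQUE KEYs on the product_seen table), and that this is why you included the DISTINCT clause. Unfortunately, it also means MySQL has to create a temporary table to process the query.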

Before I go any further, if it's possible to do...

ALTER TABLE product_seen ADD UNIQUE KEY (session_id, product_id, sorting);

...and...

ALTER TABLE user_likes_products ADD UNIQUE KEY (user_id, product_id);

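...then the DISTINCT clause is redundant, and removing it should solve the problem. Note that I'm not suggesting you necessarily need to add these keys, only that you confirm these column combinations are always unique.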
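One way to confirm uniqueness without touching the schema is to look for duplicate combinations directly. The following is only a sketch: it assumes user_likes_products has the user_id and product_id columns used in the question's query (its full definition isn't shown), and the plan you get after dropping DISTINCT will depend on your data and MySQL version.

```sql
-- Step 1: any row returned is a (session_id, product_id, sorting)
-- combination that occurs more than once in product_seen.
SELECT session_id, product_id, sorting, COUNT(*) AS n
FROM product_seen
GROUP BY session_id, product_id, sorting
HAVING COUNT(*) > 1;

-- Same check for user_likes_products (column names assumed from the query).
SELECT user_id, product_id, COUNT(*) AS n
FROM user_likes_products
GROUP BY user_id, product_id
HAVING COUNT(*) > 1;

-- Step 2: only if both checks return no rows, re-run the real query
-- with DISTINCT removed and compare the Extra column with the plan above.
EXPLAIN SELECT
    p.product_id AS id,
    p.list_image_url AS image,
    p.list_image_height AS list_height,
    hearts,
    active AS available,
    (UNIX_TIMESTAMP() - ulp.last_action) AS last_loved
FROM `looksandgoods`.`products` p
LEFT JOIN `looksandgoods`.`user_likes_products` ulp
    ON (p.product_id = ulp.product_id AND ulp.user_id = 1)
LEFT JOIN `looksandgoods`.`product_seen` sb
    ON (sb.session_id = 'y7lWunZKKABgMoDgzjwDjZw1'
        AND sb.sorting = 'trend'
        AND p.product_id = sb.product_id)
WHERE p.active = 1
  AND sb.product_id IS NULL
ORDER BY p.score DESC
LIMIT 30;
```

If either check does return rows, dropping DISTINCT would change the results, because the joins can then produce the same product more than once.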

If that's not possible, there may be another solution, but I'd need to know a lot more about the tables involved in the joins.

Old answer

An EXPLAIN of your query yields...

+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+
| id | select_type | table | type | possible_keys | key        | key_len | ref   | rows | Extra                   |
+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+
|  1 | SIMPLE      | p     | ALL  | NULL          | NULL       | NULL    | NULL  |   10 | Using filesort          |
|  1 | SIMPLE      | ps    | ref  | session_id    | session_id | 27      | const |    1 | Using where; Not exists |
+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+

...which shows that it isn't using an index on the products table, so it has to do a table scan and a filesort, which is why it's slow.

I noticed there's an index on (active, score), which you can make use of by changing the query to show only active products...

SELECT *
FROM products p
LEFT JOIN product_seen ps
  ON (ps.session_id = ? AND p.product_id = ps.product_id )
WHERE p.active=TRUE AND ps.product_id is null
ORDER BY p.score DESC
LIMIT 30;

...which changes the EXPLAIN to...

+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+
| id | select_type | table | type  | possible_keys               | key         | key_len | ref   | rows | Extra                   |
+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+
|  1 | SIMPLE      | p     | range | score_index,active_products | score_index | 1       | NULL  |   10 | Using where             |
|  1 | SIMPLE      | ps    | ref   | session_id                  | session_id  | 27      | const |    1 | Using where; Not exists |
+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+

...which is now doing a range scan with no filesort, and should be much faster.

Alternatively, if you want it to return inactive products as well, then you just need to add an index on score alone, using...

ALTER TABLE products ADD KEY (score);
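
After adding that index, it may be worth re-running EXPLAIN on the simplified query from the question to see whether the optimizer actually chooses it. This is just a sketch: the session id literal is borrowed from the question's real query in place of the `?` placeholder, and the resulting plan is not shown here because it depends on your data and MySQL version.

```sql
-- Check whether the new index on score is used for the ORDER BY.
-- In the row for table p, look at the key column and at whether
-- "Using filesort" still appears under Extra.
EXPLAIN SELECT *
FROM products p
LEFT JOIN product_seen ps
  ON (ps.session_id = 'y7lWunZKKABgMoDgzjwDjZw1'
      AND p.product_id = ps.product_id)
WHERE ps.product_id IS NULL
ORDER BY p.score DESC
LIMIT 30;
```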
answered 2013-04-17T11:25:39.403