
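I wrote the query below. It shows each price together with a kind of delta index computed against past prices in the same table. (The real query uses several subqueries over different date intervals, so I would prefer to avoid multiple JOINs.)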

SELECT
H1.`item_id`,
H1.`date`,
H1.`price`,
(SELECT AVG(H2.price)/H1.`price`
    FROM hive_item_price H2 FORCE INDEX (date_id)
    WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
    AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
(SELECT AVG(H2.price)/H1.`price`
    FROM hive_item_price H2 FORCE INDEX (date_id)
    WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
    AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
FROM hive_item_price H1
WHERE H1.id = 3915328

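It works well, but I had to force the INDEX because MySQL does not use it on its own, which makes the query very slow. The problem starts as soon as I select more than one row in the WHERE clause (i.e. "WHERE H1.id IN (3915328,3915044)" vs. "WHERE H1.id = 3915328").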

...
WHERE H1.id IN (3915328,3915044)

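This changes the query plan and the query becomes very, very slow (a ratio of roughly 1 to 10000!). The index seems to be used incorrectly. My goal is to run this over millions of prices :). I looked at the EXPLAIN output but could not work out how to get a similar query plan, or simply similar performance.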

Here is the plan for the fast query (a single row, using "WHERE H1.id = 3915328"):

| id | select_type        | table | type  | possible_keys | key     | key_len | ref    | rows | Extra
| 1  | PRIMARY            | H1    | const | PRIMARY       | PRIMARY | 8       | const  | 1    |
| 2  | DEPENDENT SUBQUERY | H2    | range | date_id       | date_id | 16      | {null} | 61   | Using where

Here is the new plan after changing "WHERE H1.id = 3915328" to "WHERE H1.id IN (3915328,3915044)":

| id | select_type        | table | type  | possible_keys | key     | key_len | ref                 | rows  | Extra
| 1  | PRIMARY            | H1    | range | PRIMARY       | PRIMARY | 8       | {null}              | 2     | Using where
| 2  | DEPENDENT SUBQUERY | H2    | ref   | date_id       | date_id | 8       | tvlr_old.H1.item_id | 19578 | Using where

The data looks like this:

id      item_id price date
3915328 4       94,00 21/06/2013 10:24:03
3915044 4       93,00 21/06/2013 10:12:03
3914761 4       92,00 21/06/2013 10:00:03
3914475 4       92,00 21/06/2013 09:48:03
3914189 4       91,00 21/06/2013 09:36:03
3913905 4       91,00 21/06/2013 09:24:03
3913620 4       91,00 21/06/2013 09:12:03
3913335 4       90,00 21/06/2013 09:00:03
3913050 4       90,00 21/06/2013 08:48:03
3912764 4       90,00 21/06/2013 08:36:03

Thanks for your help.


2 Answers


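The version of the query for a single row/id is more than 1000 times faster than the version for 2+ rows/ids, and I could not avoid MySQL's bad query plan in that case. The fastest solution I have found so far for multiple ids/rows is therefore a cursor that runs the 1-row query once for each id.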

DROP TABLE IF EXISTS tempPrices;
CREATE TEMPORARY TABLE tempPrices
(
  iId INT unsigned NOT NULL,
  dDateCollected datetime,
  fPrice FLOAT,
  fDelta12hrs FLOAT,
  fDelta48hrs FLOAT
)ENGINE=MEMORY;

DROP PROCEDURE IF EXISTS pricefcloop;

-- DELIMITER is needed when this is run from the mysql command-line client.
DELIMITER //

CREATE PROCEDURE pricefcloop()
BEGIN
  DECLARE done INT DEFAULT FALSE;
  DECLARE curr_id INT;
  DECLARE cur1 CURSOR FOR SELECT id FROM hive_item_price WHERE id IN (3915328, 3915044, ....);
  -- Without this handler, the FETCH past the last row raises a "no data" error.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cur1;

  read_loop: LOOP
    FETCH cur1 INTO curr_id;
    -- Leave the loop once the cursor is exhausted.
    IF done THEN
      LEAVE read_loop;
    END IF;
    -- Run the fast single-row query for the current id and store the result.
    INSERT INTO tempPrices (iId, dDateCollected, fPrice, fDelta12hrs, fDelta48hrs)
      SELECT
      H1.`item_id`,
      H1.`date`,
      H1.`price`,
      (SELECT AVG(H2.price)/H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
          WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
          AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
      (SELECT AVG(H2.price)/H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
          WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
          AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
      FROM hive_item_price H1
      WHERE H1.id = curr_id;
  END LOOP;

  CLOSE cur1;
END //

DELIMITER ;

CALL pricefcloop();

SELECT * FROM tempPrices;
answered 2013-11-15T14:09:56.270

Could you try this query? Just out of curiosity:

SELECT 
    H1.id,
    AVG(H2.price)/H1.`price` AS fDelta48,
    AVG(H3.price)/H1.`price` AS fDelta24
FROM 
    hive_item_price H1
        JOIN hive_item_price H2 ON 
                H2.item_id = H1.item_id 
            AND H2.bee_hive_id = H1.bee_hive_id
            AND H2.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +48 HOUR) AND H1.`date`
        JOIN hive_item_price H3 ON 
                H3.item_id = H1.item_id 
            AND H3.bee_hive_id = H1.bee_hive_id
            AND H3.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +24 HOUR) AND H1.`date`
WHERE 
    H1.id IN (3915328, 3915044)
GROUP BY
    H1.id;
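
A possible variation on the query above, offered as a sketch rather than a tested fix: joining H2 and H3 to the same H1 row produces every combination of their matching rows, which leaves the averages unchanged but multiplies the number of rows to aggregate. Since the 24-hour window lies inside the 48-hour window, a single join over the wider window with a conditional AVG should return the same two values. The table, column names and ids are those from the question. Adding H1.`price` to the GROUP BY is an assumption made so the statement also runs under ONLY_FULL_GROUP_BY. Nothing here is claimed about the resulting query plan, which would still have to be checked with EXPLAIN.

```sql
SELECT
    H1.id,
    -- Every joined H2 row lies inside the 48-hour window, so a plain AVG is enough.
    AVG(H2.price) / H1.`price` AS fDelta48,
    -- CASE yields NULL for rows older than 24 hours, and AVG ignores NULLs.
    AVG(CASE
            WHEN H2.`date` >= DATE_SUB(H1.`date`, INTERVAL 24 HOUR)
            THEN H2.price
        END) / H1.`price` AS fDelta24
FROM
    hive_item_price H1
        JOIN hive_item_price H2 ON
                H2.item_id = H1.item_id
            AND H2.bee_hive_id = H1.bee_hive_id
            AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 48 HOUR) AND H1.`date`
WHERE
    H1.id IN (3915328, 3915044)
GROUP BY
    H1.id, H1.`price`;
```

Each H1 row matches itself in the join, so both windows always contain at least one row and neither average is NULL.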
answered 2013-11-14T22:54:03.380