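I need to fill the gaps in a time series in a MySQL query result set. I am testing the option of an outer join against a helper table that contains every data point of the time series (as suggested in this thread: How to fill date gaps in MySQL?).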

The problem I am running into is that adding this join increases the query response time dramatically (from under 1 second to 90 seconds).

Here is the original query:

select date_format(fact_data7.date_collected,'%Y-%m') as date_col
   , date_format(fact_data7.date_collected,'%d-%H:%i:%s') as time_col
   , fact_data7.batch_id,fact_data7.value as fdvalue,entities.ticker as ticker
   , date_format(fact_data7.date_collected,'%Y-%m-%d') as date_col2
   , date_format(fact_data7.date_collected,'%Y') as year 
from fact_data7  
JOIN entities on fact_data7.entity_id=entities.id  
where (1=1)
  AND ((entities.id= 963
      AND fact_data7.metric_id=1
      ))
  AND date_format(fact_data7.date_collected,'%Y-%m') > '2008-01-01'
order by date_col asc

Here is the query with the outer join to the helper table (month_fill) added:

select date_format(month_fill.date,'%Y-%m') as date_col
    , date_format(fact_data7.date_collected,'%d-%H:%i:%s') as time_col
    , fact_data7.batch_id,fact_data7.value as fdvalue
    , entities.ticker as ticker
    , date_format(fact_data7.date_collected,'%Y-%m-%d') as date_col2
    , date_format(fact_data7.date_collected,'%Y') as year 
from fact_data7
JOIN entities
  on fact_data7.entity_id=entities.id  
RIGHT OUTER JOIN month_fill
   on date_format(fact_data7.date_collected,'%Y-%m') =  date_format(month_fill.date,'%Y-%m')  
where (1=1)
  AND (
      (entities.id= 963 AND fact_data7.metric_id=1)
      OR (entities.id is null and fact_data7.metric_id is null)
      )
  AND date_format(month_fill.date,'%Y-%m') > '2008-01-01'
order by date_col asc

Can I restructure the query to improve performance, or is there an alternative way to achieve what I am looking for?

Update 11/15:

Here is the EXPLAIN output for the first query:

id  select_type  table       type   possible_keys  key      key_len  ref    rows    Extra
1   SIMPLE       entities    const  PRIMARY        PRIMARY  4        const  1       Using filesort
1   SIMPLE       fact_data7  ALL    NULL           NULL     NULL     NULL   230636  Using where

Here is the EXPLAIN output for the second query:

id  select_type  table       type    possible_keys  key      key_len  ref                           rows    Extra
1   SIMPLE       month_fill  index   NULL           date     8        NULL                          204     Using where; Using index; Using temporary; Using filesort
1   SIMPLE       fact_data7  ALL     NULL           NULL     NULL     NULL                          230636  Using where
1   SIMPLE       entities    eq_ref  PRIMARY        PRIMARY  4        findata.fact_data7.entity_id  1       Using where

2 Answers

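Before even considering restructuring the query, I would first add indexes on the date columns fact_data7.date_collected and month_fill.date. The range comparison ">" you are running is what slows things down, and an index should in theory improve performance. You do need enough records for it to pay off, though; otherwise the overhead of maintaining the index will only slow things down.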

See this MySQL documentation: http://dev.mysql.com/doc/refman/5.0/en/optimization-indexes.html
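
As a rough sketch of the indexing suggestion: the EXPLAIN output in the question shows month_fill already using a key named date, so only fact_data7 appears to be missing an index. The index names below are made up for illustration, and the composite index is an assumption based on the entity_id and metric_id filters in the question, not something the original poster confirmed.

```sql
-- Hypothetical index name; the column comes from the queries in the question.
CREATE INDEX idx_fact_data7_date_collected
    ON fact_data7 (date_collected);

-- Assumption: the query filters on entity_id and metric_id by equality and on
-- date_collected by range, so a composite index in that order may serve better.
CREATE INDEX idx_fact_data7_entity_metric_date
    ON fact_data7 (entity_id, metric_id, date_collected);

-- Re-run EXPLAIN afterwards and check whether the "key" column for
-- fact_data7 is still NULL. The cutoff date is chosen for illustration only.
EXPLAIN
SELECT fact_data7.batch_id, fact_data7.value
FROM fact_data7
WHERE fact_data7.entity_id = 963
  AND fact_data7.metric_id = 1
  AND fact_data7.date_collected >= '2008-02-01';
```

Either index can only be picked for the date range if the predicate compares the bare date_collected column, as in the EXPLAIN statement here, rather than the result of a function applied to it.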

I am not sure what you are trying to achieve, but you could try MySQL's ifnull(value1,value2) function. Your query could look something like this:

select ifnull(date_format(fact_data7.date_collected,'%Y-%m'),date_format(month_fill.date,'%Y-%m')) as date_col, 
date_format(fact_data7.date_collected,'%d-%H:%i:%s') as time_col, 
fact_data7.batch_id,
fact_data7.value as fdvalue,
entities.ticker as ticker,
date_format(fact_data7.date_collected,'%Y-%m-%d') as date_col2 ,
date_format(fact_data7.date_collected,'%Y') as year 
from month_fill , fact_data7
JOIN entities on fact_data7.entity_id=entities.id  
where ((entities.id= 963 AND fact_data7.metric_id=1) OR (entities.id is null and fact_data7.metric_id is null))
and date_format(fact_data7.date_collected,'%Y-%m') =  date_format(month_fill.date,'%Y-%m') -- you will need a condition similar to this, depending on the data
AND date_format(fact_data7.date_collected,'%Y-%m')>'2008-01-01'
order by date_col asc
answered 2011-11-14T19:03:58.090

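I think it is worth trying to rewrite the where clause so that it does not use date_format(date_collected). You say you have an index on this field, but it is never used (the field is an argument to a function, and MySQL does not support function-based indexes).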
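
A minimal sketch of what such a rewrite might look like, under a few assumptions that the question does not confirm. First, the original string comparison '%Y-%m' > '2008-01-01' appears to exclude January 2008 (the string '2008-01' sorts before '2008-01-01'), so the equivalent plain date filter starts at 2008-02-01; if the intent was to include January, use '2008-01-01' instead. Second, the gap-filling variant assumes month_fill holds one row per month, dated the first day of that month. It also moves the entity and metric filters into the ON clause, which fills months that have no rows for entity 963 rather than only months with no rows at all, so the results may differ from the original query.

```sql
-- Original query with the date filter applied to the bare column,
-- so that an index on date_collected can be considered.
SELECT date_format(fact_data7.date_collected, '%Y-%m') AS date_col
     , date_format(fact_data7.date_collected, '%d-%H:%i:%s') AS time_col
     , fact_data7.batch_id
     , fact_data7.value AS fdvalue
     , entities.ticker AS ticker
     , date_format(fact_data7.date_collected, '%Y-%m-%d') AS date_col2
     , date_format(fact_data7.date_collected, '%Y') AS year
FROM fact_data7
JOIN entities ON fact_data7.entity_id = entities.id
WHERE entities.id = 963
  AND fact_data7.metric_id = 1
  AND fact_data7.date_collected >= '2008-02-01'
ORDER BY date_col ASC;

-- Gap-filling variant: start from month_fill and join the facts by range.
-- Assumes month_fill.date is the first day of each month (not confirmed).
SELECT date_format(month_fill.date, '%Y-%m') AS date_col
     , date_format(fact_data7.date_collected, '%d-%H:%i:%s') AS time_col
     , fact_data7.batch_id
     , fact_data7.value AS fdvalue
     , entities.ticker AS ticker   -- NULL for months that were filled in
     , date_format(fact_data7.date_collected, '%Y-%m-%d') AS date_col2
     , date_format(fact_data7.date_collected, '%Y') AS year
FROM month_fill
LEFT JOIN fact_data7
       ON fact_data7.date_collected >= month_fill.date
      AND fact_data7.date_collected <  month_fill.date + INTERVAL 1 MONTH
      AND fact_data7.entity_id = 963
      AND fact_data7.metric_id = 1
LEFT JOIN entities ON entities.id = fact_data7.entity_id
WHERE month_fill.date >= '2008-02-01'
ORDER BY date_col ASC;
```

Comparing the EXPLAIN output of these statements with the output shown in the question should indicate whether fact_data7 is still read with a full scan (type ALL).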

answered 2011-11-15T13:37:33.570