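I'm having trouble working out what I need to index to make my queries as efficient as possible. The table has billions of rows, so it is unusable without indexes.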
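I understand that when I search with WHERE ... AND, the columns involved should be indexed together, but I don't understand how indexing applies to more complicated things like COUNT and ORDER BY.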
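To illustrate the part I do understand, here is a minimal sketch of a composite index serving a two-column filter. The table, column, and index names (example_tbl, col_a, col_b, idx_a_b) are made up for illustration and are not my real schema:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE example_tbl (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  col_a INT UNSIGNED NOT NULL,
  col_b INT UNSIGNED NOT NULL
);

-- One composite index over both columns used in the WHERE clause.
CREATE INDEX idx_a_b ON example_tbl (col_a, col_b);

-- Both equality conditions can be resolved through idx_a_b.
SELECT id
FROM example_tbl
WHERE col_a = 1 AND col_b = 2;
```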
Could someone please tell me which indexes the following queries need?
Query 1:
SELECT word1,word2,COUNT(id) AS aaa
FROM mytable
WHERE (word1>0 AND word2=429907) OR (word1=429907 AND word2>0)
GROUP BY word1,word2
ORDER BY aaa DESC LIMIT 20;
Query 2:
CREATE TEMPORARY TABLE temptbl (
pibn INT UNSIGNED NOT NULL, page SMALLINT UNSIGNED NOT NULL)
ENGINE=MEMORY;
INSERT INTO temptbl (
SELECT DISTINCT pibn,page FROM mytable
WHERE word1=429907 AND word2=0);
ALTER TABLE temptbl ADD PRIMARY KEY (pibn,page);
SELECT word1,word2,COUNT(id) AS aaa
FROM mytable a
INNER JOIN temptbl b
ON a.pibn=b.pibn AND a.page=b.page
GROUP BY word1,word2 ORDER BY aaa DESC LIMIT 10;
DROP TABLE temptbl;
Query 3:
SELECT pibn,COUNT(*) AS aaa
FROM mytable
WHERE word1=429907 AND word2=12322
GROUP BY pibn ORDER BY aaa DESC LIMIT 25
The current indexes are:
id
pibn,page
word1,word2,origyear,cat
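Written as DDL, I believe these correspond to roughly the following. The index names are made up, and I'm assuming id is the primary key, so treat this as a sketch rather than the exact schema:

```sql
-- Assumption: id is the primary key; the index names are hypothetical.
ALTER TABLE mytable
  ADD PRIMARY KEY (id),
  ADD INDEX idx_pibn_page (pibn, page),
  ADD INDEX idx_words_year_cat (word1, word2, origyear, cat);
```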
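As it stands (with the current indexes), query 1 takes 13 seconds, query 2 takes 35 seconds, and query 3 takes 0.1 seconds (which sounds fast, but I don't think it is as optimized as it could be).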
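As far as I know, the way to check which index a query actually uses is to prefix it with EXPLAIN. A sketch for query 3, run against the same mytable:

```sql
-- EXPLAIN reports the chosen index (the "key" column) and whether
-- a temporary table or filesort is needed for GROUP BY / ORDER BY.
EXPLAIN
SELECT pibn, COUNT(*) AS aaa
FROM mytable
WHERE word1 = 429907 AND word2 = 12322
GROUP BY pibn
ORDER BY aaa DESC
LIMIT 25;
```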