
I very often query the table posts by the column pairs user+status and user+time:

SELECT * FROM `posts` WHERE `user`='xxx' and `status`='active'
SELECT * FROM `posts` WHERE `user`='xxx' and `time`>...

Thus I have set up two indices: (user, status) and (user, time).

I'm aware that writes get slower the more indices need to be updated. But I think in this case it is useful to have both indices, since reads outnumber writes by far.

Anyway, phpMyAdmin gives a warning saying "More than one index has been created for the column user". Can I just ignore this warning? I checked the WordPress DB tables and saw that when a column already has an index of its own, they put it in the second position of the composite index:

comment_approved_date_gmt = INDEX(comment_approved, comment_date_gmt)
comment_date_gmt = INDEX(comment_date_gmt)

Why don't they use a single two-column index, INDEX(comment_date_gmt, comment_approved), which would make INDEX(comment_date_gmt) unnecessary? And why is it disadvantageous to have two indices starting with the same column name?

Is there a general rule for which column should go first in my query? For example, first the one with the lowest number of distinct values (e.g. status), and then the one with a higher number of distinct values (e.g. user names)?


2 Answers


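Yes, the order of columns in an index matters.

Think of a telephone book analogy. It's like an index on (last_name, first_name). When you look up a person by last name, you can use the sort order of the phone book to find them quickly.

But if you know only the person's first name, the matching entries are scattered throughout the book. To find one, you'd have to search the book page by page.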
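As a rough illustration of the analogy, here is a minimal Python sketch. The `phone_book` list and both helper functions are made up for this example; a sorted list only approximates what a real B-tree index does.

```python
from bisect import bisect_left, bisect_right

# Hypothetical phone book, kept sorted by (last_name, first_name),
# the same order an index on (last_name, first_name) would keep.
phone_book = sorted([
    ("Adams", "Zoe"),
    ("Baker", "Alice"),
    ("Baker", "Tom"),
    ("Clark", "Alice"),
    ("Davis", "Bob"),
])

def find_by_last_name(book, last_name):
    """Binary search: the sort order leads straight to the matching run."""
    lo = bisect_left(book, (last_name,))
    # chr(0x10FFFF) is the largest code point, so this upper bound
    # sorts after every real first name under the same last name.
    hi = bisect_right(book, (last_name, chr(0x10FFFF)))
    return book[lo:hi]

def find_by_first_name(book, first_name):
    """Full scan: first names are scattered, so every entry is checked."""
    return [entry for entry in book if entry[1] == first_name]

print(find_by_last_name(phone_book, "Baker"))
# [('Baker', 'Alice'), ('Baker', 'Tom')]
print(find_by_first_name(phone_book, "Alice"))
# [('Baker', 'Alice'), ('Clark', 'Alice')]
```

The first lookup touches only a handful of entries no matter how large the book gets; the second always reads all of it.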

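Yes, indexes can be redundant.

Any query that searches by last_name can use a single-column index on (last_name), or it can get the same benefit from a two-column index on (last_name, first_name). So why create both indexes?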
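One way to see this for yourself is to ask the database for its query plan. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in: SQLite is not MySQL, its planner and EXPLAIN output differ, and the `people` table and `idx_last_first` index are hypothetical. The leftmost-prefix behaviour it shows is the same idea, though.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT, first_name TEXT, phone TEXT)")
# Only the two-column index exists; there is no separate index on last_name.
conn.execute("CREATE INDEX idx_last_first ON people (last_name, first_name)")

def plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each row.
    return " | ".join(row[-1] for row in rows)

by_last = plan("SELECT * FROM people WHERE last_name = 'Baker'")
by_first = plan("SELECT * FROM people WHERE first_name = 'Alice'")

# Exact wording varies by SQLite version: the first plan should be a
# SEARCH that names idx_last_first, the second a full table SCAN.
print(by_last)
print(by_first)
```

In MySQL the equivalent check would be running EXPLAIN on the two SELECT statements and comparing the key column.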

There's a tool, pt-duplicate-key-checker, that can help you identify redundant indexes. I've never encountered a database that didn't have at least a few of them.

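phpMyAdmin is wrong.

If phpMyAdmin gives a warning about the indexes (user, status) and (user, time), then it's being overzealous, because those indexes are not redundant with each other. Basically, an index is redundant if its columns form a left prefix of the columns in another index. So an index on (A) is redundant with respect to an index on (A, B), but an index on (A, C) is different from (A, B), and both might be used by different queries.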
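The left-prefix rule is simple enough to sketch in a few lines of Python. This is only an illustration of the rule as stated above, not how pt-duplicate-key-checker is implemented; the function names and the `user_only` index are invented for the example, and the check deliberately ignores things like UNIQUE constraints and index types, which can make an apparently redundant index worth keeping.

```python
def is_left_prefix(shorter, longer):
    """True if the columns of `shorter` are the leading columns of `longer`."""
    return len(shorter) <= len(longer) and tuple(longer[:len(shorter)]) == tuple(shorter)

def redundant_indexes(indexes):
    """Return (redundant_name, covering_name) pairs for a dict of
    index name -> ordered tuple of column names."""
    pairs = []
    for name_a, cols_a in indexes.items():
        for name_b, cols_b in indexes.items():
            if name_a == name_b:
                continue
            if is_left_prefix(cols_a, cols_b):
                # For exact duplicates, report the pair only once.
                if tuple(cols_a) == tuple(cols_b) and name_a > name_b:
                    continue
                pairs.append((name_a, name_b))
    return pairs

# The two indexes from the question: neither is a left prefix of the other.
question = {"user_status": ("user", "status"), "user_time": ("user", "time")}
print(redundant_indexes(question))
# []

# Adding a hypothetical single-column index on (user) is what would be redundant.
with_extra = dict(question, user_only=("user",))
print(redundant_indexes(with_extra))
# [('user_only', 'user_status'), ('user_only', 'user_time')]
```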

PS: I cover these points in my presentation How to Design Indexes, Really.

answered 2013-01-03T19:20:22.747

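I feel that column ordering in a SQL query is a premature optimization, which according to Knuth is the root of all evil. You should program for maintainability rather than optimization, and let the optimizer take care of speed.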

answered 2013-01-03T19:15:54.127