13

After reading some stuff about indices on SQL Server and their performance advantages for selects and disadvantages for updates / inserts, i was wondering if badly used indices could actually also hurt performance for selects. What conditions would have to be fulfilled to have an index decrease performance of a pure select query? Do such situations exist?

Thanks!

(although I always try to include code examples, i can't think of anything that would support this question...)

4

5 回答 5

15

Yes, albeit very slightly - so slightly that it would be justified to also answer "No".

If you have an index which might be considered for a query, but is not useable, the optimizer will waste a short time pondering whether and how to use it (in rare cases with REALLY complicated indexes and views, and more frequently when index performance hints are wrong, you might end up choosing a suboptimal query plan).

Some cases would be:

  • a table without indexes
  • a table with a badly chosen index, which gets discarded
  • a table where TWO indexes exist, and for some reason (e.g. obsolete statistics), the existence of the second index makes the optimizer choose it, while it would have been more convenient to use the first.
  • a table where the existing index (usually also thanks to obsolete statistics) tricks the optimizer into reading from the index an amount of data comparable to what could have been, more efficiently, retrieved with a full table scan; to make things worse, the index is fragmented and hashed differently than the table. What was essentially a full table scan becomes a slowed down full table scan with lots of disk thrashing.

In the first two cases the query time is the same (and entails a full scan), but in the third, you also have to analyze and discard the index. In the fourth, unlikely but possible, case an execution time which is likely very large increases and becomes huge (update 2021-10-20: I have just done this to myself. Yay me).

Where an index is likelier to hurt you - where ALL indexes hurt you - is in inserts, deletes and updates. Then, any index not used by the update query, yet affected by same, will require a write to the index itself.

So you will want to have indexes, but as few as you can without sacrificing SELECT performances. Actually, you might decide against indexing for a rarely used SELECT query in order to avoid having the needed index constantly updated by all other UPDATE queries.

Edit: after reading Heinzi's answer, I'd also like to add that most DB servers have maintenance tools which analyze the tables and indexes (and sometimes query performance counters too), and properly update the hints of which Heinzi spoke. So it's also important to periodically "maintain" the database to keep the optimizer supplied with up-to-date information on which indexes to choose from.

Update (MySQL)

There is a very nifty MySQL analysis tool that can actually suggest improvements to the existing indexing (remove unused keys, add useful keys): common_schema. It's really worth a look.

于 2012-07-06T11:51:22.207 回答
4

Yes, but it's very unlikely and it should not influence your decision to use indexes.

Sometimes, the SQL Server query analyzer chooses the an execution plan that's not optimal. Since the number of possible execution plans is much larger than it might seem on first sight (a simple join of n tables already produces n! possible execution plans), SQL Server has to make an educated guess. It's in the nature of guesses that they are sometimes wrong.

It's a rare occurrence, but I've seen it happen a couple of times in the past years. In that case (and only in that case), a better plan would have been chosen if the index had not been there. However, removing the index is not the correct way to solve this problem, since the index usually exists for a reason. The correct way is to add a hint to this query (and only to this query), to help the optimizer choose the right plan.

于 2012-07-06T11:59:11.407 回答
4

Yes, indexes can hurt performance for SELECTs. It is important to understand how database engines operate. Data is stored on disk(s) in "pages". Indexes make it possible to access the specific page that has a specific value in one or more columns in the table.

This is great if you are looking for specific values.

However, consider a query that needs to look at every row in a table. If you go through the table, you read the pages in order and -- critically -- you get every row on the page with a single read. The number of reads is the number of pages in the table. In addition, the page cache can optimize the reads with look-ahead reads and pages no longer being used are simply overwritten.

Using an index for the same reads goes through the table one record at a time rather than one page at a time. This results in random reads through the pages. In the worst case, there is one read per record in the table -- potentially a very significant hit to performance. In addition, the index itself occupies some of the page cache, reducing memory for other operations.

In generally, the optimizer component of a SQL engine does a good job distinguishing between these two situations. One of the key metrics is the selectivity of the query. How many rows is the query returning (which the optimizer looks at with respect to the number of pages)? If the number of rows is about the same as the number of pages, the optimizer would consider a full table scan rather than an index scan.

There are definitely other considerations, but in general, an index can hurt performance of even a simple select query. In general, optimizers do a good job, but there are sometimes unusual cases that trick even the best optimizers.

于 2012-07-06T13:44:10.373 回答
1

My guess would be if you create indices that confuse the query plan optimiser, and that ends up choosing an inefficient index for the query at hand.

于 2012-07-06T11:44:26.477 回答
1

This is potentially implementation-dependent, but in principle indexes should not slow down SELECT.

Obviously they can slow down INSERT and UPDATE.

于 2012-07-06T11:49:54.150 回答