
I'm working on a query to find records in a table of new entries that match records in a table of historical entries, where the match could be on one of many fields. In other words:

"Show all records where current.id = archive.id or current.name = archive.name or current.address = archive.address"

My SQL for this query is as follows:

SELECT current.id, current.name, current.address FROM current
INNER JOIN archive
ON
    current.id = archive.id OR
    current.name = archive.name OR
    current.address = archive.address

When I run it, it takes FOREVER, and this is only the first load of data. The archive table will always have around 300,000 records in it, but current will fluctuate between 500 and 40,000.

Is there a better way to write this query? Or is my query solid, and the problem is more likely in my underlying database?

1 Answer

Creating an index on the 3 fields in question in each table would probably help (especially on the archive table if it is very large), but try this instead:

SELECT current.id, current.name, current.address
FROM current
INNER JOIN archive
ON
    current.id = archive.id

UNION

SELECT current.id, current.name, current.address
FROM current
INNER JOIN archive
ON
    current.name = archive.name

UNION

SELECT current.id, current.name, current.address
FROM current
INNER JOIN archive
ON
    current.address = archive.address

Because each branch of this query joins on a single column, it can use a separate single-column index for each field (which you should still create), resulting in potentially smaller indexes and better overall performance.
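As a rough sketch of the individual indexes mentioned above: the index names here are made up for illustration, and the exact syntax may need adjusting for your database. If id is already a primary key on a table, it most likely has an index and that statement can be skipped.

```sql
-- Hypothetical index names; one single-column index per join field.
-- The archive table is the large one, so these matter most.
CREATE INDEX idx_archive_id      ON archive (id);
CREATE INDEX idx_archive_name    ON archive (name);
CREATE INDEX idx_archive_address ON archive (address);

-- The same fields on the smaller table, as suggested above.
CREATE INDEX idx_current_id      ON current (id);
CREATE INDEX idx_current_name    ON current (name);
CREATE INDEX idx_current_address ON current (address);
```

Each branch of the UNION query compares only one column, so each one lines up with exactly one of these indexes.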

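Using ORs in join criteria can really mess up the query optimizer, potentially making it do suboptimal things. The UNIONs are expensive because they have to remove duplicate rows, but it is more likely that your query time is spent on the join, and simplifying that may help a lot. Note that this de-duplication also means a current record that matches several archive records is returned only once, whereas the original OR join returns it once per match.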

Answered 2012-05-03T20:24:24.983