SELECT * FROM (
    -- Rows that match on user_id.
    -- The aliases keep the duplicated column names unambiguous.
    SELECT a.user_id AS user_id1, a.f_name AS f_name1, a.l_name AS l_name1,
           b.user_id AS user_id2, b.f_name AS f_name2, b.l_name AS l_name2
    FROM current_tbl a
    INNER JOIN import_tbl b
        ON ( a.user_id = b.user_id )
    UNION
    -- Rows that match on first and last name, ignoring case
    SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name
    FROM current_tbl a
    INNER JOIN import_tbl b
        ON ( lower(a.f_name) = lower(b.f_name)
        AND lower(a.l_name) = lower(b.l_name) )
) foo
--
UNION
--
-- Rows in current_tbl with no match at all, padded with empty strings
SELECT a.user_id, a.f_name, a.l_name, '', '', ''
FROM current_tbl a
WHERE a.user_id NOT IN (
    SELECT user_id1 FROM (
        SELECT a.user_id AS user_id1, a.f_name AS f_name1, a.l_name AS l_name1,
               b.user_id AS user_id2, b.f_name AS f_name2, b.l_name AS l_name2
        FROM current_tbl a
        INNER JOIN import_tbl b
            ON ( a.user_id = b.user_id )
        UNION
        SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name
        FROM current_tbl a
        INNER JOIN import_tbl b
            ON ( lower(a.f_name) = lower(b.f_name)
            AND lower(a.l_name) = lower(b.l_name) )
    ) bar
)
ORDER BY user_id1
Sample table data:
current_tbl:
-------------------------------
user_id | f_name | l_name
---------+----------+----------
A1 | Adam | Acorn
A2 | Beth | Berry
A3 | Calv | Chard
| |
import_tbl:
-------------------------------
user_id | f_name | l_name
---------+----------+----------
A1 | Adam | Acorn
A2 | Beth | Butcher <- last_name different
| |
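For anyone who wants to reproduce this, here is a minimal setup sketch for the two sample tables above. The column types are an assumption (plain text for every column) and are not taken from the real schema.

```sql
-- Assumed column types: text everywhere (not the real schema).
CREATE TABLE current_tbl (
    user_id text,
    f_name  text,
    l_name  text
);

CREATE TABLE import_tbl (
    user_id text,
    f_name  text,
    l_name  text
);

-- Rows copied from the current_tbl sample above
INSERT INTO current_tbl (user_id, f_name, l_name) VALUES
    ('A1', 'Adam', 'Acorn'),
    ('A2', 'Beth', 'Berry'),
    ('A3', 'Calv', 'Chard');

-- Rows copied from the import_tbl sample above
INSERT INTO import_tbl (user_id, f_name, l_name) VALUES
    ('A1', 'Adam', 'Acorn'),
    ('A2', 'Beth', 'Butcher');  -- last name differs from current_tbl
```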
Expected output:
-----------------------------------------------------------------------
user_id1 | f_name1 | l_name1 | user_id2 | f_name2 | l_name2
----------+-----------+-----------+------------+-----------+-----------
A1 | Adam | Acorn | A1 | Adam | Acorn
A2 | Beth | Berry | A2 | Beth | Butcher
A3 | Calv | Chard | | |
Doing it this way eliminates the condition involving this row:
A2 | Beth | Berry | A2 | Beth | Butcher
but it still keeps the A3 row.
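I hope this makes sense and that I haven't oversimplified it. This is a follow-up to another question of mine. The successive improvements have taken the query from ~32000ms down to ~1200ms now, which is quite an improvement.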
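I suspect I can optimize further by using UNION ALL in the subquery, and of course with the usual index tuning, but I'm looking for the best optimization of the SQL itself. FYI, this particular case is for PostgreSQL.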
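As a rough sketch of the UNION ALL idea (not benchmarked, so any speedup is a guess): NOT IN only tests membership, so duplicate user_id values in the subquery do not change the result, and UNION ALL skips the de-duplication step that UNION performs. The subquery also only needs to return user_id. The alias a2 and both index names below are made up for illustration.

```sql
-- Unmatched rows only: the NOT IN branch with UNION ALL in the subquery.
SELECT a.user_id, a.f_name, a.l_name, '', '', ''
FROM current_tbl a
WHERE a.user_id NOT IN (
    -- user_ids matched on user_id
    SELECT a2.user_id
    FROM current_tbl a2
    INNER JOIN import_tbl b
        ON a2.user_id = b.user_id
    UNION ALL
    -- user_ids matched on first and last name, ignoring case
    SELECT a2.user_id
    FROM current_tbl a2
    INNER JOIN import_tbl b
        ON lower(a2.f_name) = lower(b.f_name)
       AND lower(a2.l_name) = lower(b.l_name)
);

-- Expression indexes for the case-insensitive name match.
-- The index names are hypothetical.
CREATE INDEX current_tbl_lower_name_idx
    ON current_tbl (lower(f_name), lower(l_name));
CREATE INDEX import_tbl_lower_name_idx
    ON import_tbl (lower(f_name), lower(l_name));
```

Whether the planner actually uses the expression indexes depends on table sizes and statistics, so it is worth comparing EXPLAIN ANALYZE output before and after.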