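I need a list of the users in one database that are not listed as a new_user_id in another database. There are 112,815 matching users in the two databases; user_id is the key in all of the queried tables.
Query #1 works and gives me the 111,327 users that are NOT referenced as a new_user_id. But it has to query the same data twice.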
-- 111,327 GSU users are NOT listed as a CSS new user
-- 1,488 GSU users ARE listed as a new user in CSS
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
      from css.user_desc cud) cudsubq
  on gup.user_id = cudsubq.user_id
where gup.user_id not in (select cud.new_user_id
                          from css.user_desc cud
                          where cud.new_user_id is not null);
Query #2 would be perfect... in fact, I'm surprised that it is syntactically accepted. But it gives me a result that makes no sense.
-- This gives me 1,505 users... I've checked, and they are not
-- referenced as new_user_ids in CSS, but I don't know why the ones
-- that were excluded were excluded.
--
-- Where are the missing 109,822, and what excluded them?
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
      from css.user_desc cud) cudsubq
  on gup.user_id = cudsubq.user_id
where gup.user_id not in (cudsubq.new_user_id);
What exactly is the where clause in the second query doing, and why does it exclude 109,822 records from the result?
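Note: the queries above are simplifications of what I'm really after. There are other/better ways to write them... they simply represent the part of the query that is giving me trouble.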
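To make the behaviour easier to reproduce, here is a minimal sketch on toy data. It assumes Python's built-in sqlite3 module rather than my real database, and the table names gsu_user_profile and css_user_desc and the four sample rows are made up for illustration. It runs both forms of the where clause, plus a plain <> comparison for reference, so the counts can be compared side by side; another database engine may not behave identically.

```python
import sqlite3

# Hypothetical stand-ins for gsu.user_profile and css.user_desc (toy data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table gsu_user_profile (user_id integer primary key);
    create table css_user_desc (user_id integer primary key,
                                new_user_id integer);
    insert into gsu_user_profile values (1), (2), (3), (4);
    -- users 1 and 4 have no new_user_id; user 2 points at itself;
    -- user 3 points at user 4
    insert into css_user_desc values (1, null), (2, 2), (3, 4), (4, null);
""")

BASE = """
    select count(gup.user_id)
    from gsu_user_profile gup
    join css_user_desc cud on gup.user_id = cud.user_id
    where {predicate}
"""

def count_where(predicate):
    """Run the joined count with the given where-clause predicate."""
    return conn.execute(BASE.format(predicate=predicate)).fetchone()[0]

# Form used in query #1: compare against every non-null new_user_id.
count_subquery = count_where(
    "gup.user_id not in (select new_user_id from css_user_desc"
    " where new_user_id is not null)"
)

# Form used in query #2: the list holds only the current row's new_user_id.
count_rowwise = count_where("gup.user_id not in (cud.new_user_id)")

# Plain inequality against the current row's new_user_id, for comparison.
count_not_equal = count_where("gup.user_id <> cud.new_user_id")

print("not in (subquery):   ", count_subquery)   # 2 (users 1 and 3)
print("not in (row's value):", count_rowwise)    # 1 (user 3 only)
print("<> row's value:      ", count_not_equal)  # 1 (user 3 only)
```

On this toy data the row-wise form returns the same count as the <> comparison, and the rows whose new_user_id is null drop out of both.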