
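I'm using SQL Server 2008 and have imported two files into one table. The first file (FileID 2048) has 6,721 rows and the second file (FileID 2209) has 4,707 rows. The columns are Billed, FirstName, LastName, and FileID. The table is called Claims.

I need a query that lists the duplicates for each FileID (2209 and 2048), showing which duplicates are in each file, and then deletes the duplicates from one of the files.

I ran this query: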

SELECT firstname
, lastname
, duplicatecount = COUNT(1)
FROM Claims
WHERE fileid IN (2209, 2048)
GROUP BY
firstname
, lastname
HAVING COUNT(1) > 1
ORDER BY COUNT(1) DESC

3 Answers


This will give you the names that are duplicated across the two files:

SELECT firstname , lastname , count(*) as duplicatecount 
FROM Claims WHERE fileid IN (2209, 2048) 
GROUP BY firstname , lastname HAVING COUNT(*) > 1 
ORDER BY 1,2 DESC

Give this a try.
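The query above counts duplicates across both files combined, so it does not show which file each duplicate lives in. A minimal sketch of a per-file breakdown, assuming the Claims table and columns described in the question (the column aliases in_2048 and in_2209 are made up for illustration):

```sql
-- One row per duplicated name, with how many rows each file contributes.
SELECT firstname
     , lastname
     , in_2048 = SUM(CASE WHEN fileid = 2048 THEN 1 ELSE 0 END)
     , in_2209 = SUM(CASE WHEN fileid = 2209 THEN 1 ELSE 0 END)
FROM Claims
WHERE fileid IN (2209, 2048)
GROUP BY firstname, lastname
HAVING COUNT(*) > 1              -- same duplicate test as the query above
ORDER BY firstname, lastname;
```

A row with a zero in one of the two columns is a name that is duplicated only within a single file, not across files.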

answered 2012-12-29T14:43:37.420

You could try something like this:

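WITH marked_and_counted AS (
  SELECT
    *,
    -- number the rows of each (firstname, lastname) group, lowest fileid first
    rnk = ROW_NUMBER() OVER (PARTITION BY firstname, lastname ORDER BY fileid)
  FROM Claims
  WHERE fileid IN (2209, 2048)
)
-- keep the first row of each group and delete the rest
DELETE FROM marked_and_counted
WHERE rnk > 1
;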

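The marked_and_counted common table expression simply retrieves all rows from Claims and ranks the duplicates of each (firstname, lastname) independently, in order of fileid. The DELETE statement then removes only the rows whose rank is greater than 1.

As you can see, the DELETE targets the CTE directly, which is allowed in this case because the CTE references only one table.

This query works for any number of files, provided the fileid filter lists them. It deletes all duplicates, leaving one row per (firstname, lastname).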
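Before running the DELETE, it may be worth previewing which rows it would remove. A minimal sketch, assuming the Claims table and columns described in the question: the CTE stays the same and only the final statement is swapped for a SELECT.

```sql
WITH marked_and_counted AS (
  SELECT
    *,
    rnk = ROW_NUMBER() OVER (PARTITION BY firstname, lastname ORDER BY fileid)
  FROM Claims
  WHERE fileid IN (2209, 2048)
)
-- Rows with rnk > 1 are the ones the DELETE would remove.
SELECT fileid, firstname, lastname, billed, rnk
FROM marked_and_counted
WHERE rnk > 1
ORDER BY firstname, lastname, rnk;
```

Because the ranking orders by fileid ascending, the surviving row for each name comes from file 2048 whenever the name exists there. Changing the window to ORDER BY fileid DESC would keep a row from 2209 instead.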

answered 2012-12-29T15:19:23.503

Those are the duplicates. So start with your query:

with todelete as (<your query here>)  -- drop the ORDER BY: it is not allowed inside a CTE
delete from Claims
    from todelete
    where todelete.firstname = Claims.firstname and
          todelete.lastname = Claims.lastname and
          Claims.fileid = 2209

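You want to delete only some of the duplicated rows, not all of them, so you need to specify which ones to delete. I arbitrarily chose 2209.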
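One caveat with this approach: if a name appears twice in file 2209 and never in 2048, the original duplicate query still flags it, and the DELETE would then remove every copy. A hedged sketch of one way around that, assuming the same Claims table; it deliberately replaces the HAVING COUNT(1) > 1 test with a check that the name occurs in both files, which is a change from the query in the question:

```sql
WITH todelete AS (
  SELECT firstname, lastname
  FROM Claims
  WHERE fileid IN (2209, 2048)
  GROUP BY firstname, lastname
  HAVING COUNT(DISTINCT fileid) = 2   -- name is present in both files
)
DELETE c
FROM Claims AS c
JOIN todelete AS d
  ON d.firstname = c.firstname
 AND d.lastname  = c.lastname
WHERE c.fileid = 2209;                -- remove only the copies from file 2209
```

Names duplicated only inside a single file are left untouched by this version, so they would need a separate cleanup pass if that matters.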

answered 2012-12-29T15:44:51.520