0

I am using MS SQL 2008R2. I am working with a partial data set (i.e. not all colunms) and have not been able to exclude duplicate rows. For example, here are the rows for a claim:

Claim       Line Doctor Received    Paid
02195111345 1    22924  1995-11-13  2001-10-02  
02195111345 1    22924  1995-11-13  2001-10-02  
02195111345 2    22924  1995-11-13  2001-10-02  
02195111345 2    22924  1995-11-13  2001-10-02  
02195111345 3    22924  1995-11-13  2001-10-02  
02195111345 3    22924  1995-11-13  2001-10-02  
02195111345 4    22924  1995-11-13  2001-10-02  
02195111345 4    22924  1995-11-13  2001-10-02  
02195111345 5    22924  1995-11-13  2001-10-02  
02195111345 5    22924  1995-11-13  2001-10-02  

I want to select only 1 row for each Claim+Line. There are additional columns, but they contain the same information for each combination of Claim+Line.

I can easily identify the rows with duplicates with a query such as:

SELECT [ClaimNum],[ClaimNum_Line]
FROM [dbo].[DamagedClaims]
GROUP BY [ClaimNum],[ClaimNum_Line]
HAVING COUNT(*) > 1 

Typically, I would create a derived table using this query and join it back to the table - with another coulmn such as MAX(Version) or such, but there are no such columns with which to differentiate the duplicates.

I have considered

SELECT ClaimNum], Line, MAX(DOCTOR), MAX([Date Received]), MAX([Date Paid])

but there are about 20 columns and 750 million rows and that seemed like a lot of overhead (I am very sympathetic to our SQL Servers!). Is there a better solution?

Best, Scott

4

3 回答 3

2

just add DISTINCT in your SELECT

SELECT DISTINCT Claim, Line, Doctor, Received, Paid
FROM tableName
于 2012-09-04T13:35:45.663 回答
2

I see no column that can be the basis of you maximum row, try using DISTINCT

SELECT DISTINCT *
FROM tableA
于 2012-09-04T14:09:07.633 回答
1

Using SELECT DISTINCT - http://www.w3schools.com/sql/sql_distinct.asp

于 2012-09-04T13:36:26.583 回答