4

Is there a way to find duplicate pairs in multiple columns in Excel?

For instance,

column1       column2  

Smith         Jones       <-- duplicate pair  
Smith         Johnson  
Jones         Smith       <-- duplicate pair  
Jones         Walter  
4

3 回答 3

4

The way I would do it would be as follows: (Assuming your data begins in cell A1 & B1)

  • In C1, put in the formula: =A1&"|"&B1
  • In D1, put in the formula: =B1&"|"&A1
  • In E1, put in the formula: =MATCH(C1,$D$1:$D$500,0)
  • Drag all these down for your dataset (and change the $500 in your final formula if necessary)
  • Any place where column E has a number, it means the data is duplicated - You could even filter for anything that isn't #N/A and it tells you where in the list the duplicate match is.

Hope this helps!!

于 2013-05-29T20:57:55.483 回答
2

If you want a single column to indicate duplicates of this type, you can use this formula:

=SUMPRODUCT((CONCATENATE($A$1:$A$500,"^",$B$1:$B$500)=CONCATENATE(B1,"^",A1))*1)

This will give you the number of duplicates of the type you have in your table. You can apply a filter on all the values from there which are non-zero. Just change the range of $A$1:$A$500 and $B$1:$B$500 if need be. It will depend on the size of your table.

Or if, by any change you want to use conditional formatting, you can select columns A and B, click on 'Conditional Formatting', 'New Rule', 'Use formula to determine which cells to format', paste in the formula:

=SUMPRODUCT((CONCATENATE($A$1:$A$500,"^",$B$1:$B$500)=CONCATENATE(B1,"^",A1))*1)>0

then pick a formatting from the button 'Format...' (I usually go for 'Fill' and pick yellow) which will highlight the duplicates in column B.

于 2013-05-30T06:24:17.597 回答
1

I'd first create a column by concatenating the sorted values of Column's 1 and 2 (and a space in the middle). Then I'd just look for duplicates in that column. The following should do it:

  1. In C1, =IF(A1<=B1,A1 & " | " & B1,B1 & " | " & A1)
  2. In D1, =COUNTIF($C$1:$C$999,C1)>1

Then just copy cells C1, and D1, all the way down the spreadsheet. Column D should contain TRUE for any duplicate pairs.

Edit: Changed delimeter (still not foolproof, but better)

于 2013-05-29T21:27:54.937 回答