我想在 DataTable 中查找所有行,其中每组列都是重复的。我目前的想法是获取多次出现的所有行的索引列表,如下所示:
public List<int> findDuplicates_New()
{
string[] duplicateCheckFields = { "Name", "City" };
List<int> duplicates = new List<int>();
List<string> rowStrs = new List<string>();
string rowStr;
//convert each datarow to a delimited string and add it to list rowStrs
foreach (DataRow dr in submissionsList.Rows)
{
rowStr = string.Empty;
foreach (DataColumn dc in submissionsList.Columns)
{
//only use the duplicateCheckFields in the string
if (duplicateCheckFields.Contains(dc.ColumnName))
{
rowStr += dr[dc].ToString() + "|";
}
}
rowStrs.Add(rowStr);
}
//count how many of each row string are in the list
//add the string's index (which will match the row's index)
//to the duplicates list if more than 1
for (int c = 0; c < rowStrs.Count; c++)
{
if (rowStrs.Count(str => str == rowStrs[c]) > 1)
{
duplicates.Add(c);
}
}
return duplicates;
}
但是,这不是很有效:遍历字符串列表并获取每个字符串的计数是 O(n^2)。我查看了这个解决方案,但不知道如何将它用于超过 1 个字段。我正在寻找一种更便宜的方法来处理这个问题。