从数据表中删除重复条目的最佳方法是什么?
11 回答
dtEmp
对当前工作的 DataTable执行以下操作:
DataTable distinctTable = dtEmp.DefaultView.ToTable( /*distinct*/ true);
这真好。
删除重复项
public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
Hashtable hTable = new Hashtable();
ArrayList duplicateList = new ArrayList();
//Add list of all the unique item value to hashtable, which stores combination of key, value pair.
//And add duplicate item value in arraylist.
foreach (DataRow drow in dTable.Rows)
{
if (hTable.Contains(drow[colName]))
duplicateList.Add(drow);
else
hTable.Add(drow[colName], string.Empty);
}
//Removing a list of duplicate items from datatable.
foreach (DataRow dRow in duplicateList)
dTable.Rows.Remove(dRow);
//Datatable which contains unique records will be return as output.
return dTable;
}
这里下面的链接
http://www.dotnetspider.com/resources/4535-Remove-duplicate-records-from-table.aspx
http://www.dotnetspark.com/kb/94-remove-duplicate-rows-value-from-datatable.aspx
用于删除列中的重复项
http://dotnetguts.blogspot.com/2007/02/removing-duplicate-records-from.html
一个简单的方法是:
var newDt= dt.AsEnumerable()
.GroupBy(x => x.Field<int>("ColumnName"))
.Select(y => y.First())
.CopyToDataTable();
这篇文章是关于基于多列从数据表中仅获取不同的行。
Public coid removeDuplicatesRows(DataTable dt)
{
DataTable uniqueCols = dt.DefaultView.ToTable(true, "RNORFQNo", "ManufacturerPartNo", "RNORFQId", "ItemId", "RNONo", "Quantity", "NSNNo", "UOMName", "MOQ", "ItemDescription");
}
您需要调用此方法,并且需要为数据表赋值。在上面的代码中,我们有 RNORFQNo、PartNo、RFQ id、ItemId、RNONo、Quantity、NSNNO、UOMName、MOQ 和 Item Description 作为我们想要不同值的列。
这是一种简单快捷的使用方法AsEnumerable().Distinct()
private DataTable RemoveDuplicatesRecords(DataTable dt)
{
//Returns just 5 unique rows
var UniqueRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);
DataTable dt2 = UniqueRows.CopyToDataTable();
return dt2;
}
有一种使用 Linq GroupBy 方法的简单方法。
var duplicateValues = dt.AsEnumerable()
.GroupBy(row => row[0])
.Where(group => (group.Count() == 1 || group.Count() > 1))
.Select(g => g.Key);
foreach (var d in duplicateValues)
Console.WriteLine(d);
/* To eliminate Duplicate rows */
private void RemoveDuplicates(DataTable dt)
{
if (dt.Rows.Count > 0)
{
for (int i = dt.Rows.Count - 1; i >= 0; i--)
{
if (i == 0)
{
break;
}
for (int j = i - 1; j >= 0; j--)
{
if (Convert.ToInt32(dt.Rows[i]["ID"]) == Convert.ToInt32(dt.Rows[j]["ID"]) && dt.Rows[i]["Name"].ToString() == dt.Rows[j]["Name"].ToString())
{
dt.Rows[i].Delete();
break;
}
}
}
dt.AcceptChanges();
}
}
完全不同的行:
public static DataTable Dictinct(this dt) => dt.DefaultView.ToTable(true);
由特定行区分(请注意,“distinctClumnNames”中提到的列将在结果数据表中返回):
public static DataTable Dictinct(this dt, params string[] distinctColumnNames) =>
dt.DefaultView.ToTable(true, distinctColumnNames);
由特定列区分(保留给定 DataTable 中的所有列):
public static void Distinct(this DataTable dataTable, string distinctColumnName)
{
var distinctResult = new DataTable();
distinctResult.Merge(
.GroupBy(row => row.Field<object>(distinctColumnName))
.Select(group => group.First())
.CopyToDataTable()
);
if (distinctResult.DefaultView.Count < dataTable.DefaultView.Count)
{
dataTable.Clear();
dataTable.Merge(distinctResult);
dataTable.AcceptChanges();
}
}
我更喜欢这个,因为它比 DefaultView.ToTable 和 foreach 循环删除重复项更快。使用它,我们也可以在多个列上进行分组。
DataTable distinctDT = (from rows in dt.AsEnumerable()
group rows by new { ColA = rows["ColA"], ColB = rows["ColB"]} into grp
select grp.First()).CopyToDataTable();
您可以使用 DataTable 的 DefaultView.ToTable 方法进行这样的过滤(适应 C#):
Public Sub RemoveDuplicateRows(ByRef rDataTable As DataTable)
Dim pNewDataTable As DataTable
Dim pCurrentRowCopy As DataRow
Dim pColumnList As New List(Of String)
Dim pColumn As DataColumn
'Build column list
For Each pColumn In rDataTable.Columns
pColumnList.Add(pColumn.ColumnName)
Next
'Filter by all columns
pNewDataTable = rDataTable.DefaultView.ToTable(True, pColumnList.ToArray)
rDataTable = rDataTable.Clone
'Import rows into original table structure
For Each pCurrentRowCopy In pNewDataTable.Rows
rDataTable.ImportRow(pCurrentRowCopy)
Next
End Sub
为了区分所有数据表列,您可以轻松检索字符串数组中的列名
public static DataTable RemoveDuplicateRows(this DataTable dataTable)
{
List<string> columnNames = new List<string>();
foreach (DataColumn col in dataTable.Columns)
{
columnNames.Add(col.ColumnName);
}
return dataTable.DefaultView.ToTable(true, columnNames.Select(c => c.ToString()).ToArray());
}
如您所见,我想将其用作 DataTable 类的扩展