目前我有一个贯穿列表并删除重复值(在一列中)的宏,但事实证明它的效率非常低。对于它检查重复的每个条目,它必须遍历整个列;我的文件目前有 50,000 个条目,这不是一项小任务。
我认为宏工作的一种更简单的方法是让宏检查这个值是否在数组中。如果是,则删除条目所在的行。如果不是,则将值添加到数组中。
有人可以为宏的基本轮廓提供一些帮助吗?谢谢
下面的代码将遍历您的源数据并将其存储在一个数组中,同时检查重复项。收集完成后,它使用数组作为键来知道要删除哪些列。
由于删除后大量的电位屏幕更新,请务必关闭屏幕更新。(包括)
Sub Example()
Application.ScreenUpdating = false
Dim i As Long
Dim k As Long
Dim StorageArray() As String
Dim iLastRow As Long
iLastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
ReDim StorageArray(1 To iLastRow, 0 To 1)
'loop through column from row 1 to the last row
For i = 1 To iLastRow
'add each sheet value to the first column of the array
StorageArray(i, 0) = ActiveSheet.Range("A" & i).Value
'- keep the second column as 0 by default
StorageArray(i, 1) = 0
'- as each item is added, loop through previously added items to see if its a duplicate
For k = 1 To i-1
If StorageArray(k, 0) = StorageArray(i, 0) Then
'if it is a duplicate set the second column of the srray to 1
StorageArray(i, 1) = 1
Exit For
End If
Next k
Next i
'loop through sheet backwords and delete rows that were maked for deletion
For i = iLastRow To 1 Step -1
If StorageArray(i, 1) = 1 Then
ActiveSheet.Range("A" & i).EntireRow.Delete
End If
Next i
Application.ScreenUpdating = true
End Sub
根据要求,这是一种类似的方法,使用集合而不是数组进行键索引:(RBarryYoung)
Public Sub RemovecolumnDuplicates()
Dim prev as Boolean
prev = Application.ScreenUpdating
Application.ScreenUpdating = false
Dim i As Long, k As Long
Dim v as Variant, sv as String
Dim cl as Range, ws As Worksheet
Set ws = ActiveWorksheet 'NOTE: This really should be a parameter ...
Dim StorageArray As New Collection
Dim iLastRow As Long
iLastRow = ws.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
'loop through column from row 1 to the last row
i = 1
For k = 1 To iLastRow
'add each sheet value to the collection
Set cl = ws.Cells(i, 1)
v = cl.Value
sv = Cstr(v)
On Error Resume Next
StorageArray.Add v, sv
If Err.Number <> 0 Then
'must be a duplicate, remove it
cl.EntireRow.Delete
'Note: our index doesn't change here, since all of the rows moved
Else
'not a duplicate, so go to the next row
i = i + 1
End If
Next k
Application.ScreenUpdating = prev
End Sub
请注意,此方法不需要为列中单元格的值假定任何数据类型或整数限制。
(Mea Culpa:我不得不在记事本中手动输入,因为我的 Excel 现在正忙于运行项目测试。所以可能存在一些拼写/语法错误......)
这是我的评论的后续行动。对于这样一个简单的操作,循环 50k 记录+循环数组将是一个过度杀戮。
就像我在评论中提到的那样,将数组中的值复制到新工作表中。然后在 50k 条目旁边插入一个空白列并执行Vlookup
or CountIf
。完成后,执行自动筛选,然后一次性删除重复的条目。让我们举个例子,看看它是如何工作的。
假设我们有一个包含 1000 个项目的数组?在一张纸中,我们有 50k 个数据。下面的代码将被测试1000 items in Array
并50k Data
查看快照
将此代码粘贴到模块中(代码完成时间少于 5 秒)
Sub Sample()
Dim ws As Worksheet, wstemp As Worksheet
Dim LRow As Long
Dim Ar(1 To 1000) As Long
Dim startTime As String, EndTime As String
startTime = Format(Now, "hh:mm:ss")
Set ws = Sheets("Sheet1")
Set wstemp = Sheets.Add
'~~> Creating a dummy array
For i = 1 To 1000
Ar(i) = i
Next i
'~~> Copy it to the new sheet
wstemp.Range("A1:A1000").Value = Application.Transpose(Ar)
With ws
LRow = .Range("A" & .Rows.Count).End(xlUp).Row
.Columns(2).Insert Shift:=xlToRight
.Range("B1").Value = "For Deletion"
.Range("B2:B" & LRow).FormulaR1C1 = "=COUNTIF(" & wstemp.Name & "!C[-1],RC[-1])"
.Columns(2).Value = .Columns(2).Value
'~~> Remove any filters
.AutoFilterMode = False
'~~> Filter, offset(to exclude headers) and delete visible rows
With .Range("B1:B" & LRow)
.AutoFilter Field:=1, Criteria1:="<>0"
.Offset(1, 0).SpecialCells(xlCellTypeVisible).EntireRow.Delete
End With
'~~> Remove any filters
.AutoFilterMode = False
.Columns(2).Delete
End With
EndTime = Format(Now, "hh:mm:ss")
MsgBox "The process started at " & startTime & " and finished at" & EndTime
End Sub
对于 Excel 2007 及更高版本:将数组复制到工作表并使用 removeduplicates 方法:
set ws = worksheets.add
ws.[A1].resize(ubound(yourarray,1),ubound(yourarray,2)).value = yourarray
ws.usedrange.removeduplicates columns:=1, header:=no
这假设您的数组的下限是 1,您要删除重复的列是第 1 列,并且您的列表没有标题。然后,您可以找到新范围的边界并将其读回您的数组(首先擦除当前数组)。
我建议填充您的列,然后使用公式查找重复项并删除它们。我没有你的实际代码(你没有给我们任何代码)
dim a as range
dim b as range
set a = Range ("A1")
Do while Not isEmpty(A)
Set b = a.offset(1,0)
If b = a then
b= ""
else a.offset (1,0)
Loop
我相信您可以在运行宏之前将过滤器放入代码中,或者只是记住过滤器。