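Consider a scenario where I have two columns, "A" and "B".

Column A has about 130,000 rows (strings); column B has about 10,000 rows (strings).

I want to search column A for every string in column B.

As you can see, the data volume is large. I have already tried the Range.Find() method, but it takes a long time to finish. I am looking for an approach with a much shorter turnaround time.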

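*More details about my requirements*

(1) Columns A and B contain string values, not numbers, and the strings can be long.

(2) Each cell in column B can occur many times in column A.

(3) I want to get every occurrence in column A of each column B string, along with its row number.

(4) A string from column B can appear as a substring of any cell in column A.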


Link to download the sample file: wikisend.com/download/431054/StackOverFlow_Sample.xlsx

Any suggestions?

If you need any additional details to solve this, feel free to ask!

2 Answers

Try this.

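This took 3 seconds for 130000 rows in Col A and 10000 rows in Col B. The output is generated in Col C. The code adds every Col A value to a Collection keyed on the value itself; adding a Col B value then raises an error exactly when that key already exists, which marks a match. Note that this compares whole cell values, not substrings.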

NOTE: I have taken the worst case scenario where all 10000 values in Col B are present in Col A

This is how my data looks.

[screenshot of the sample data]

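Sub Sample()
    Debug.Print Now                      ' start time

    Dim col As New Collection
    Dim ws As Worksheet
    Dim i As Long
    Dim key As String

    Set ws = ThisWorkbook.Sheets("Sheet1")

    Application.ScreenUpdating = False

    With ws
        ' Default every result cell to "No".
        .Range("C1:C10000").Value = "No"

        ' Add each column A value keyed on itself; duplicates within A
        ' raise an error, which is deliberately ignored.
        On Error Resume Next
        For i = 1 To 130000
            key = CStr(.Range("A" & i).Value)
            col.Add key, key
        Next i
        Err.Clear

        ' Adding a column B value fails when the key already exists,
        ' i.e. when the same value is present in column A.
        For i = 1 To 10000
            key = CStr(.Range("B" & i).Value)
            col.Add key, key
            If Err.Number <> 0 Then
                .Range("C" & i).Value = "Yes"
                Err.Clear
            Else
                ' Not in column A: undo the add so that a later duplicate
                ' in column B is not reported as a match.
                col.Remove key
            End If
        Next i
        On Error GoTo 0
    End With

    Application.ScreenUpdating = True

    Debug.Print Now                      ' end time
End Sub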

And this was the result

[screenshot of the result]
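
The approach above depends on one behavior of the VBA Collection object: Add raises a run-time error (457) when the key is already in use. A minimal sketch of that mechanism in isolation follows; the Sub name DemoDuplicateKey and the sample values are made up for illustration. Collection keys are compared without regard to case, so this kind of lookup is case-insensitive and matches whole values only.

```vba
' Illustration only: DemoDuplicateKey and the sample strings are hypothetical.
Sub DemoDuplicateKey()
    Dim col As New Collection

    col.Add "apple", "apple"            ' first use of the key succeeds

    On Error Resume Next

    col.Add "apple", "apple"            ' same key again: raises error 457
    Debug.Print Err.Number <> 0         ' True  -> "apple" was already present
    Err.Clear

    col.Add "APPLE", "APPLE"            ' keys are not case-sensitive, so this fails too
    Debug.Print Err.Number <> 0         ' True
    Err.Clear

    col.Add "pear", "pear"              ' new key: no error
    Debug.Print Err.Number <> 0         ' False -> "pear" was not present

    On Error GoTo 0
End Sub
```

Because the test is a key lookup rather than a scan, it does not cover requirement (4) in the question, where a column B string may be only part of a column A cell.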

answered 2013-11-07T16:44:26.703

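With 130,000 strings of 100 characters in column A and 10,000 strings of 30 characters in column B, this took 27 minutes.

Column C is filled with the row positions where each column B string occurs. Column D is filled with the number of occurrences of each column B string.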

Public Sub searchcells()
    ' Uses Me, so this belongs in the worksheet's code module.
    Dim arrA(1 To 130000) As String, arrB(1 To 10000) As String
    Dim i As Long, j As Long, t As Date
    t = Now
    Me.Range("c:d") = ""

    ' Copy both columns into string arrays so the search loop does not read cells.
    For i = 1 To 130000
        arrA(i) = Me.Cells(i, 1)
    Next
    For i = 1 To 10000
        arrB(i) = Me.Cells(i, 2)
    Next

    For i = 1 To 130000
        For j = 1 To 10000
            ' InStr returns > 0 when arrB(j) occurs anywhere in arrA(i).
            If InStr(arrA(i), arrB(j)) > 0 Then
                Me.Cells(j, 4) = Me.Cells(j, 4) + 1          ' occurrence count
                Me.Cells(j, 3) = Me.Cells(j, 3) & i & "; "   ' matching row numbers
            End If
        Next
        Me.Cells(1, 5) = i    ' progress indicator
    Next

    Debug.Print CDbl(Now - t) * 24 * 3600 & " seconds"
End Sub
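
The routine above reads each cell one at a time and writes to the sheet inside the inner loop. One common way to reduce that sheet traffic is to read each column with a single Range.Value call and to write both result columns back in one assignment. The sketch below shows that variant under several assumptions: the Sub name SearchCellsBulk is hypothetical, the sheet is assumed to be "Sheet1" with the same fixed row counts, the cells are assumed to hold no error values, and each result string is assumed to fit within a cell's character limit. It has not been timed against the 27 minutes quoted above, and the nested loop still performs the same number of substring tests.

```vba
' Sketch only: SearchCellsBulk, the sheet name, and the fixed ranges are assumptions.
Public Sub SearchCellsBulk()
    Dim ws As Worksheet
    Dim a As Variant, b As Variant
    Dim sB() As String
    Dim hits() As Variant
    Dim sA As String
    Dim i As Long, j As Long, nB As Long

    Set ws = ThisWorkbook.Sheets("Sheet1")

    ' One read per column; each call returns a 1-based, 2-D Variant array.
    a = ws.Range("A1:A130000").Value
    b = ws.Range("B1:B10000").Value
    nB = UBound(b, 1)

    ' Convert column B to strings once, and prepare the result buffer:
    ' hits(j, 1) collects row numbers, hits(j, 2) counts occurrences.
    ReDim sB(1 To nB)
    ReDim hits(1 To nB, 1 To 2)
    For j = 1 To nB
        sB(j) = CStr(b(j, 1))
        hits(j, 1) = ""
        hits(j, 2) = 0
    Next j

    For i = 1 To UBound(a, 1)
        sA = CStr(a(i, 1))
        For j = 1 To nB
            ' Skip blank search strings; InStr would report a match for them.
            If Len(sB(j)) > 0 Then
                If InStr(1, sA, sB(j), vbBinaryCompare) > 0 Then
                    hits(j, 1) = hits(j, 1) & i & "; "
                    hits(j, 2) = hits(j, 2) + 1
                End If
            End If
        Next j
    Next i

    ' One write for both result columns (C and D).
    ws.Range("C1").Resize(nB, 2).Value = hits
End Sub
```

The comparison is passed as vbBinaryCompare, so matching is case-sensitive; vbTextCompare could be substituted if case should be ignored.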

The cells can easily be filled with the following; change the i and j limits in each section to set the number of strings and the string length.

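Public Sub fillcells()
    Dim temp As String
    Dim i As Long, j As Long
    Randomize

    ' Column A: random 100-character strings built from the letters F to O.
    ' The limit here is 13000; searchcells reads 130000 rows, so adjust as needed.
    For i = 1 To 13000
        temp = ""
        For j = 1 To 100
            temp = temp & Chr(70 + Int(10 * Rnd()))
        Next
        Me.Cells(i, 1) = temp
    Next

    ' Column B: random 30-character strings from the same letters.
    For i = 1 To 10000
        temp = ""
        For j = 1 To 30
            temp = temp & Chr(70 + Int(10 * Rnd()))
        Next
        Me.Cells(i, 2) = temp
    Next
End Sub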

I can't download your spreadsheet at work, so if this misses the mark, please ignore it.

answered 2013-11-08T12:35:12.723