2

我正在处理来自大型机的字母数字数据。由于访问点的性质,GetString 方法在网络浏览器界面中用于从大型机中提取数据。我正在重构我的代码以及旧代码以使用数据结构而不仅仅是范围对象,因为范围对象代码在大型数据集上花费的时间要长得多。

作为一般优化实践的一部分,我运行所有大型数据集宏Application.ScreenUpdating = FalseApplication.Calculation = xlCalculationManual激活。为了计时,在将 Counter 与状态栏结合使用后,我将QueryPerformanceCounter与 DoEvents 一起使用,以便它为我提供完成特定宏所需的时间。QueryPerformanceCounter 位于类模块中,在执行我的代码的域逻辑/业务逻辑中没有直接作用。

例如,我最近重构了从大型机屏幕中提取大约 10,000 个字符串并通过循环将它们放入工作表的代码。当重构为数据结构循环时,代码在将字符串放入数组时大约需要 70 秒。代码也更便携,因为这些字符串可以很容易地转移/放置到字典中进行排序或集合中进行解析。因此,我将所有 VBA 代码从基于范围的代码切换到数据结构,这是我的问题的引入/背景。

我在一个分析项目中遇到了一些旧代码,这些代码有一些有趣的逻辑来从大型机中提取内容。本质上,代码以这种布局形式从服务器拉取内容:

从服务器提取的原始数据到 Excel 表

然后使用 Worksheet/Cell 逻辑作为框架将内容解析为 Excel 表单中的此表单:

从服务器解析到 Excel 工作表的数据

没有登录/访问逻辑以及没有子程序声明的代码如下:

Sub AcquireData()

    CurrentServerRow = 13

    WhileLoopHolder = 1

    If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

        NewWorksheetLine_Sub

    End If

    Do While WhileLoopHolder = 1

        If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

            If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

                NewWorksheetLine_Sub

            End If

        ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

            If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then
                Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
                ValueSets = ValueSets + 1
            End If

        Else

            If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

                Cells(WorksheetRow, WorksheetColumn) = "X"

            Else

                Cells(WorksheetRow, WorksheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

            End If

            Cells(WorksheetRow, WorksheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
            Cells(WorksheetRow, WorksheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
            Cells(WorksheetRow, ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
            WorksheetColumn = WorksheetColumn + 3
            ValueSets = ValueSets + 1

        End If

        CurrentServerRow = CurrentServerRow + 1

        If CurrentServerRow > 41 Then

            WhileLoopHolder = 0

        End If

    Loop

End Sub

Sub NewWorksheetLine_Sub()

        WorksheetRow = WorksheetRow + 1
        WorksheetColumn = 1
        ValueSets = 10

End Sub

这段代码嵌套在另一个程序的循环中,从而拉出数千行并整齐地组织它们。它还需要数小时并浪费宝贵的时间,这些时间可用于分析从服务器获取的数据。我设法将基本代码重构为数据结构,并利用我的学习重构其他代码。不幸的是,我错误地重构了这段代码,因为我无法正确地模仿业务逻辑。我的片段如下:

Sub AcquireData()
'This code refactors the data into a datastructure from a range object, but does not really capture the logic.
'Also, There is an error in attempting to insert a variant array into a collection/dictionary data structure.


CurrentServerRow = 13

ReDim SourceDataArray(10)

WhileLoopHolder = 1

If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

    NewWorksheetLine_Sub

End If

Do While WhileLoopHolder = 1

    If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

            NewWorksheetLine_Sub

        End If

    ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then

            ReDim Preserve SourceDataArray(ValueSets)
            SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))

            ValueSets = ValueSets + 1
            ReDim Preserve SourceDataArray(ValueSets)
        End If

    Else

        If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

            ReDim Preserve SourceDataArray(WorkSheetColumn)
            SourceDataArray(WorkSheetColumn) = "X"

        Else

            SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

        End If

        SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)

        SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))

        SourceDataArray(ValueSets) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))

        WorkSheetColumn = WorkSheetColumn + 3
        ValueSets = ValueSets + 1
        ReDim Preserve SourceDataArray(ValueSets)

    End If

    CurrentServerRow = CurrentServerRow + 1

    If CurrentServerRow > 41 Then

        WhileLoopHolder = 0

    End If

Loop

End Sub

Sub NewWorksheetLine_Sub()

SourceIndexAsString = SourceCollectionIndex

   SourceDataCollection.Add SourceDataArray(), SourceIndexAsString

    SourceCollectionIndex = SourceCollectionIndex + 1
    WorkSheetColumn = 1
    ValueSets = 10

End Sub

我考虑过,为了使用相同类型的“单元”逻辑,我可能想使用嵌套在数组中的数组,然后将其转置到工作表中。然而,在过去的几周里,我迄今未能成功实施任何此类解决方案。此外,可能存在将逻辑重构为数据结构形式的更好方法。但是,我一直无法确定如何成功地做到这一点。

总而言之,我的问题如下:我可以通过什么方式将基于“单元”的逻辑转换为数据结构逻辑?这样做的最佳数据结构是什么?在这种特殊情况下,如何使用此业务逻辑实现数据结构逻辑的使用?

4

3 回答 3

1

的一些使用ReDim Preserve似乎有问题。

If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then
  ReDim Preserve SourceDataArray(WorkSheetColumn)
  SourceDataArray(WorkSheetColumn) = "X"

因此,如果WorksheetColumn有该值1,我们将SourceDataArray在大小上减少为一个条目,并丢弃数组中较高位置的所有数据。

Else
  SourceDataArray(WorkSheetColumn) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)
End If

SourceDataArray(WorkSheetColumn + 1) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)

SourceDataArray(WorkSheetColumn + 2) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))

现在我们可能正在查看SourceDataArray不存在的条目(即,当If遵循上面的分支而不是Else分支时)并且我们应该得到“下标超出范围”错误

ReDim Preserve只保留对新数组大小有意义的数组元素的数据。因此,如果我们有ReDim a(10)然后以后有ReDim Preserve a(5)(并假设数组从元素 0 开始 - 即 no Option Base 1),那么a(5)a(9)现在都无法访问并且它们包含的数据会丢失

于 2012-09-06T22:21:23.417 回答
1

要将使用单元格引用的代码重构为数组,您需要使用二维数组
单元格引用是基于 1 的,因此您也应该在数组中坚持这一点。

Range.Value您可以使用属性将范围复制到数组和从数组中复制

' Range to array
Dim a as Variant
a = Range("A1:J100").Value

将导致a成为大小的变体数组1 To 100, 1 To 10

' Array to Range
Dim a(1 To 100, 1 To 10) as Variant
' populate a
' ...
' Put a into a range
Range("A1:J100").Value = a

这两个代码片段产生相同的输出,但第二个运行更快

Dim r as Long, c as Long
For r = 1 To 1000
For c = 1 To 100
    Cells(r, c) = r * c
Next c, r


Dim r as Long, c as Long
Dim a() as Variant 
Redim a(1 To 1000, 1 To 100)   
For r = 1 To 1000
For c = 1 To 100
    a(r, c) = r * c
Next c, r
Range("A1:CV1000") = a

ReDim Preserve是一项相对昂贵的操作,因此ReDim分块更快

而不是这个

Redim a(1 To 10, 1 To 1)
For 1 = 1 to 100000
    Redim Preserve a(1 To 10, 1 To i)
    a(i) = SomeValue
Next

改为这样做

Redim a(1 To 10, 1 To 1000)
For 1 = 1 to 100000
    If i > UBound(a) Then
        Redim Preserve a(1 To 10, 1 To UBound(a) + 1000)
    End If
    a(i) = SomeValue
Next
Redim Preserve a (1 To 10, 1 To i - 1)

Redim Preserve只能改变多维数组的最后一维。

例如,这行得通

Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 100, 1 To 20)

这不起作用

Redim a(1 to 100, 1 To 10)
Redim Preserve a(1 to 200, 1 To 20)

通常在使用表示范围的数组时,其行数变化最大。这提出了一个问题,因为Range.Value数组是(1 To Rows, 1 To Columns)

一种解决方法是实际标注您的数组(1 To Columns, 1 To Rows)Redim根据需要的行数,然后Transpose进入目标范围

Dim r As Long, c As Long
Dim a() As Variant
ReDim a(1 To 100, 1 To 200)
For r = 1 To 1000
For c = 1 To 100
    If r > UBound(a, 2) Then
        ReDim Preserve a(1 To UBound(a, 1), 1 To UBound(a, 2) + 200)
    End If
    a(c, r) = r * c
Next c, r
Range("A1:CV1000") = Application.Transpose(a)

如果您需要更改两个维度,则更改第一个维度将需要创建一个所需大小的新数组并将数据从旧数组复制到新数组。再一次,像这样大块地重新调整以避免过多的重新调整

最后一件事:您似乎没有Dim变量(除非您刚刚将这部分从您的帖子中删除)。我建议您使用Option ExplicitDim所有变量。这有助于避免数据类型错误,也可以避免使用Variant一切。 Variants当你需要时很好,但当你不需要时,其他数据类型通常更快。

于 2012-09-07T03:15:45.757 回答
0

有一次我花了几周时间将其他宏从基于范围的逻辑重构为抽象的数据结构逻辑,当我回到这个宏时,我得到了答案。如果我只是模仿范围逻辑以便更快地完成宏,那么我只需要填充数组,使其在转置后与范围匹配。这意味着我不需要修剪数组或以任何方式操纵它的形式——我只需要以数组形式填充数据结构,然后将其转置到电子表格中。一旦数组被填满,我也可以替代使用数据。

这是解决方案代码:

Sub AcquireData()

'The array 'MyArray' was dimensioned as a dynamic array in the declarations section at the top of the module.
'Redim the array to a big 2 dimensional array that fits the needs of the data/macro.
ReDim MyArray(1 To 20, 1 To 20000)

'From here on, simply mimic the logic of the range macro... [i]
CurrentServerRow = 13

WhileLoopHolder = 1

If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) <> "" Then

    NewWorksheetLine_Sub

End If

Do While WhileLoopHolder = 1

    If CurrentSession.Screen.Getstring(CurrentServerRow, 9, 1) = "-" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow + 1, 15, 1)) <> "" Then

            NewWorksheetLine_Sub

        End If

    ElseIf Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)) = "" Then

        If Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14)) <> "" Then

            '[i] ... except, move the values into the array in Column, Row logic form.
            MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
            ValueSets = ValueSets + 1
        End If

    Else

        If CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1) = "" Then

            MyArray(WorksheetColumn, WorksheetRow) = "X"

        Else

            MyArray(WorksheetColumn, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 5, 1)

        End If

        MyArray(WorksheetColumn + 1, WorksheetRow) = CurrentSession.Screen.Getstring(CurrentServerRow, 9, 7)
        MyArray(WorksheetColumn + 2, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 17, 39))
        MyArray(ValueSets, WorksheetRow) = Trim(CurrentSession.Screen.Getstring(CurrentServerRow, 58, 14))
        WorksheetColumn = WorksheetColumn + 3
        ValueSets = ValueSets + 1

    End If

    CurrentServerRow = CurrentServerRow + 1

    If CurrentServerRow > 41 Then

        WhileLoopHolder = 0

    End If

Loop

ArrayToWorkSheet_Sub

End Sub

Sub NewWorksheetLine_Sub()

    WorksheetRow = WorksheetRow + 1
    WorksheetColumn = 1
    ValueSets = 10

End Sub

'When finished with the loop, push the array to the worksheet, and transpose it to provide the correct column to row relationship in the spreadsheet.
Sub ArrayToWorkSheet_Sub()

Dim ArrayLimit As Long

Dim LastCell As Long

Dim MyRange As Range

'This level of precision in setting the range appears unnecessary, but in theory I think it could speed up tranposing the array - [ii]
'[ii]but that is just speculation. Performance improvements for the tranposition appear to be minor, perhaps due to the fact that [iii]
'[iii]most - if not nearly all - of the intense computations occur earlier.
With Sheets("Sheet2")

ArrayLimit = UBound(MyArray, 2)

LastCell = ArrayLimit + 1

Set MyRange = .Range("A2:S" & LastCell)

MyRange = WorksheetFunction.Transpose(MyArray)

End With

End Sub

虽然两者在减少宏运行时间Application.ScreenUpdating = False方面Application.Calculation = xlCalculationManual都非常宝贵,但我在将这两行与抽象数据结构的使用结合起来方面获得了非常积极的经验。在某些情况下,数据结构似乎有助于优化性能,尤其是在宏观过程中涉及大量逐行数据提取的情况下。

于 2012-09-18T16:53:39.527 回答