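I can currently get CSV file data into Excel VBA by importing it into a worksheet with the code below and then working with the resulting table. This is surely not the best way, since I am only interested in some of the data and I delete the sheet once I have used it: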

Sub CSV_Import() 
Dim ws As Worksheet, strFile As String 

Set ws = ActiveSheet 'set to current worksheet name 

strFile = Application.GetOpenFilename("Text Files (*.csv),*.csv", ,"Please select text file...") 

With ws.QueryTables.Add(Connection:="TEXT;" & strFile, Destination:=ws.Range("A1")) 
     .TextFileParseType = xlDelimited 
     .TextFileCommaDelimiter = True 
     .Refresh 
End With 
End Sub 

Is it possible to simply load the CSV into a two-dimensional Variant array in VBA, rather than going through an Excel worksheet?

7 Answers

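OK, it looks like you need two things: stream the data from the file, and populate a 2-D array.

I have a "Join2d" and a "Split2d" function lying around (I recall posting them in another reply on StackOverflow a while ago). Have a look at the comments in the code: if you are handling large files, there are things you may need to know about efficient string handling.

However, it is not complicated to use: if you are in a hurry, just paste the code.

Streaming the file is simple, but we are making assumptions about the file format: are the lines in the file delimited by carriage-return characters, or by the carriage-return-and-linefeed pair? I am assuming CR rather than CRLF, but you need to check that.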
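One way to check that assumption is to look at the text itself before splitting it. The following is only a minimal sketch; the name DetectRowDelimiter is made up for this illustration and is not part of the code in this answer:

```vba
' Hypothetical helper: guess the row delimiter used in a block of text.
Public Function DetectRowDelimiter(ByVal strText As String) As String
    If InStr(1, strText, vbCrLf, vbBinaryCompare) > 0 Then
        DetectRowDelimiter = vbCrLf      ' Windows-style line endings
    ElseIf InStr(1, strText, vbLf, vbBinaryCompare) > 0 Then
        DetectRowDelimiter = vbLf        ' Unix-style line endings
    Else
        DetectRowDelimiter = vbCr        ' bare carriage returns (or a single row)
    End If
End Function
```

The result can then be supplied wherever a RowDelimiter argument is expected, instead of relying on the vbCr default.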

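Another assumption about the format is that numeric data will appear as-is, and string or character data will be encapsulated in quote marks. This should be true, but often isn't... And stripping out the quote marks adds a lot of processing - lots of allocating and deallocating strings - which you really don't want to be doing in a large array. I have short-cut the obvious cell-by-cell find-and-replace, but it is still an issue on large files.

If your file has commas embedded in the string values, this code will not work: and do not try to write a parser that picks out the encapsulated text and skips these embedded commas when splitting the rows of data into individual fields, because this kind of intensive string handling cannot be optimised into a fast and reliable csv reader in VBA.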
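To see why embedded commas break a Split-based reader, here is a small illustration (the sample line and the Sub name are invented for this sketch):

```vba
Sub EmbeddedCommaDemo()
    Dim strLine As String
    Dim arrFields() As String

    ' Three logical fields; the second one contains a comma inside quotes
    strLine = "1,""Smith, John"",3"
    arrFields = Split(strLine, ",")

    ' Split knows nothing about quotes, so it cuts the name in two
    ' and reports four fields instead of three
    Debug.Print UBound(arrFields) - LBound(arrFields) + 1
    Debug.Print arrFields(1)    ' the first half of the name, opening quote included
End Sub
```

Every row with an embedded comma ends up one field too long, so the columns no longer line up.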

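Anyway, here is the source code. Watch out for line breaks inserted by StackOverflow's textbox control:

Running the code:

Note that you will need a reference to the Microsoft Scripting Runtime (system32\scrrun.dll)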

Private Sub test()
    Dim arrX As Variant
    arrX = ArrayFromCSVfile("MyFile.csv")
End Sub

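Streaming a csv file.

Note that I am assuming your file is in the temp folder: C:\Documents and Settings\[$USERNAME]\Local Settings\Temp. You will need to use filesystem commands to copy the file into a local folder: it is always quicker than working across the network.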

Public Function ArrayFromCSVfile( _
    strName As String, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = ",", _
    Optional RemoveQuotes As Boolean = True _
) As Variant

    ' Load a file created by FileToArray into a 2-dimensional array
    ' The file name is specified by strName, and it is expected to exist
    ' in the user's temporary folder. This is a deliberate restriction:
    ' it's always faster to copy remote files to a local drive than to
    ' edit them across the network

    ' RemoveQuotes=TRUE strips out the double-quote marks (Char 34) that
    ' encapsulate strings in most csv files.

    On Error Resume Next

    Dim objFSO As Scripting.FileSystemObject
    Dim arrData As Variant
    Dim strFile As String
    Dim strTemp As String

    Set objFSO = New Scripting.FileSystemObject
    strTemp = objFSO.GetSpecialFolder(Scripting.TemporaryFolder).ShortPath
    strFile = objFSO.BuildPath(strTemp, strName)
    If Not objFSO.FileExists(strFile) Then  ' raise an error?
        Exit Function
    End If

    Application.StatusBar = "Reading the file... (" & strName & ")"

    If Not RemoveQuotes Then
        arrData = Split2d(objFSO.OpenTextFile(strFile, ForReading).ReadAll, RowDelimiter, FieldDelimiter)
        Application.StatusBar = "Reading the file... Done"
    Else
        ' we have to do some allocation here...

        strTemp = objFSO.OpenTextFile(strFile, ForReading).ReadAll
        Application.StatusBar = "Reading the file... Done"

        Application.StatusBar = "Parsing the file..."

        strTemp = Replace$(strTemp, Chr(34) & RowDelimiter, RowDelimiter)
        strTemp = Replace$(strTemp, RowDelimiter & Chr(34), RowDelimiter)
        strTemp = Replace$(strTemp, Chr(34) & FieldDelimiter, FieldDelimiter)
        strTemp = Replace$(strTemp, FieldDelimiter & Chr(34), FieldDelimiter)

        If Right$(strTemp, 1) = Chr(34) Then
            strTemp = Left$(strTemp, Len(strTemp) - 1)
        End If

        If Left$(strTemp, 1) = Chr(34) Then
            strTemp = Right$(strTemp, Len(strTemp) - 1)
        End If

        Application.StatusBar = "Parsing the file... Done"
        arrData = Split2d(strTemp, RowDelimiter, FieldDelimiter)
        strTemp = ""
    End If

    Application.StatusBar = False

    Set objFSO = Nothing
    ArrayFromCSVfile = arrData
    Erase arrData
End Function
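Because the function above only looks in the temp folder, a file stored elsewhere has to be copied there first. A minimal sketch using FileSystemObject, late-bound so it needs no reference; CopyToTempFolder is a made-up name for this illustration:

```vba
' Hypothetical helper: copy a file into the user's temp folder and
' return the bare file name, ready to pass to ArrayFromCSVfile.
Public Function CopyToTempFolder(ByVal strSourcePath As String) As String
    Dim objFSO As Object
    Dim strTempFolder As String
    Dim strName As String

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    strTempFolder = objFSO.GetSpecialFolder(2).Path    ' 2 = TemporaryFolder
    strName = objFSO.GetFileName(strSourcePath)

    ' Third argument True: overwrite an existing copy in the temp folder
    objFSO.CopyFile strSourcePath, objFSO.BuildPath(strTempFolder, strName), True

    CopyToTempFolder = strName
End Function
```

With that in place, a call might look like arrX = ArrayFromCSVfile(CopyToTempFolder("\\server\share\MyFile.csv")), where the network path is only a placeholder.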

Split2d creates a 2-dimensional VBA array from a string:

Public Function Split2d(ByRef strInput As String, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = vbTab, _
    Optional CoerceLowerBound As Long = 0 _
    ) As Variant

    ' Split up a string into a 2-dimensional array.

    ' Works like VBA.Strings.Split, for a 2-dimensional array.
    ' Check your lower bounds on return: never assume that any array in
    ' VBA is zero-based, even if you've set Option Base 0
    ' If in doubt, coerce the lower bounds to 0 or 1 by setting
    ' CoerceLowerBound
    ' Note that the default delimiters are those inserted into the
    '  string returned by ADODB.Recordset.GetString

    On Error Resume Next

    ' Coding note: we're not doing any string-handling in VBA.Strings -
    ' allocating, deallocating and (especially!) concatenating are SLOW.
    ' We're using the VBA Join & Split functions ONLY. The VBA Join,
    ' Split, & Replace functions are linked directly to fast (by VBA
    ' standards) functions in the native Windows code. Feel free to
    ' optimise further by declaring and using the Kernel string functions
    ' if you want to.

    ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
    '    Nigel Heffernan   Excellerando.Blogspot.com

    Dim i   As Long
    Dim j   As Long

    Dim i_n As Long
    Dim j_n As Long

    Dim i_lBound As Long
    Dim i_uBound As Long
    Dim j_lBound As Long
    Dim j_uBound As Long

    Dim arrTemp1 As Variant
    Dim arrTemp2 As Variant

    arrTemp1 = Split(strInput, RowDelimiter)

    i_lBound = LBound(arrTemp1)
    i_uBound = UBound(arrTemp1)

    If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then
        ' clip out empty last row: a common artifact in data
        ' loaded from files with a terminating row delimiter
        i_uBound = i_uBound - 1
    End If

    i = i_lBound
    arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

    j_lBound = LBound(arrTemp2)
    j_uBound = UBound(arrTemp2)

    If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then
        ' ! potential error: first row with an empty last field...
        j_uBound = j_uBound - 1
    End If

    i_n = CoerceLowerBound - i_lBound
    j_n = CoerceLowerBound - j_lBound

    ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)

    ' As we've got the first row already... populate it
    ' here, and start the main loop from lbound+1

    For j = j_lBound To j_uBound
        arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
    Next j

    For i = i_lBound + 1 To i_uBound Step 1

        arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

        For j = j_lBound To j_uBound Step 1
            arrData(i + i_n, j + j_n) = arrTemp2(j)
        Next j

        Erase arrTemp2

    Next i

    Erase arrTemp1

    Application.StatusBar = False

    Split2d = arrData

End Function

Join2d converts a 2-dimensional VBA array to a string:

Public Function Join2d(ByRef InputArray As Variant, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = vbTab, _
    Optional SkipBlankRows As Boolean = False _
    ) As String

    ' Join up a 2-dimensional array into a string. Works like the standard
    '  VBA.Strings.Join, for a 2-dimensional array.
    ' Note that the default delimiters are those inserted into the string
    '  returned by ADODB.Recordset.GetString

    On Error Resume Next

    ' Coding note: we're not doing any string-handling in VBA.Strings -
    ' allocating, deallocating and (especially!) concatenating are SLOW.
    ' We're using the VBA Join & Split functions ONLY. The VBA Join,
    ' Split, & Replace functions are linked directly to fast (by VBA
    ' standards) functions in the native Windows code. Feel free to
    ' optimise further by declaring and using the Kernel string functions
    ' if you want to.

    ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
    '   Nigel Heffernan   Excellerando.Blogspot.com

    Dim i As Long
    Dim j As Long

    Dim i_lBound As Long
    Dim i_uBound As Long
    Dim j_lBound As Long
    Dim j_uBound As Long

    Dim arrTemp1() As String
    Dim arrTemp2() As String

    Dim strBlankRow As String

    i_lBound = LBound(InputArray, 1)
    i_uBound = UBound(InputArray, 1)

    j_lBound = LBound(InputArray, 2)
    j_uBound = UBound(InputArray, 2)

    ReDim arrTemp1(i_lBound To i_uBound)
    ReDim arrTemp2(j_lBound To j_uBound)

    For i = i_lBound To i_uBound

        For j = j_lBound To j_uBound
            arrTemp2(j) = InputArray(i, j)
        Next j

        arrTemp1(i) = Join(arrTemp2, FieldDelimiter)

    Next i

    If SkipBlankRows Then

        ' A blank row joins to a run of field delimiters and nothing else:
        ' one delimiter fewer than the number of fields
        For j = j_lBound + 1 To j_uBound
            strBlankRow = strBlankRow & FieldDelimiter
        Next j

        ' Keep only the non-blank rows, packed at the start of arrTemp1
        j = i_lBound - 1
        For i = i_lBound To i_uBound
            If arrTemp1(i) <> strBlankRow Then
                j = j + 1
                arrTemp1(j) = arrTemp1(i)
            End If
        Next i

        ' If every row was blank, Join2d stays an empty string
        If j >= i_lBound Then
            ReDim Preserve arrTemp1(i_lBound To j)
            Join2d = Join(arrTemp1, RowDelimiter)
        End If

    Else

        Join2d = Join(arrTemp1, RowDelimiter)

    End If

    Erase arrTemp1

End Function

Share and enjoy.

Answered 2012-09-05T14:39:34.000

Yes, read it as a text file.

See this example

Option Explicit

Sub Sample()
    Dim MyData As String, strData() As String

    Open "C:\MyFile.CSV" For Binary As #1
    MyData = Space$(LOF(1))
    Get #1, , MyData
    Close #1
    strData() = Split(MyData, vbCrLf)
End Sub

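FOLLOWUP

As I mentioned in the comments below, AFAIK there is no direct way of filling a 2-D array from a CSV. You will have to use the code I gave above, split the text line by line, and finally fill up a 2-D array, which can be cumbersome. Filling one column is easy, but if you specifically want, say, the data from row 5 to column 7, it becomes cumbersome because you have to check whether there are sufficient columns/rows in the data. Here is a basic example that gets Col B into a 2-D array.

Note: I have not done any error handling. I am sure you can take care of that.

Let's say our CSV file looks like this.

[Image not available: screenshot of the sample CSV file]

When you run this code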

Option Explicit

Const Delim As String = ","

Sub Sample()
    Dim MyData As String, strData() As String, TmpAr() As String
    Dim TwoDArray() As String
    Dim i As Long, n As Long

    Open "C:\Users\Siddharth Rout\Desktop\Sample.CSV" For Binary As #1
    MyData = Space$(LOF(1))
    Get #1, , MyData
    Close #1
    strData() = Split(MyData, vbCrLf)

    n = 0

    For i = LBound(strData) To UBound(strData)
        If Len(Trim(strData(i))) <> 0 Then
            TmpAr = Split(strData(i), Delim)
            n = n + 1
            ReDim Preserve TwoDArray(1, 1 To n)
            '~~> TmpAr(1) : 1 for Col B, 0 would be A
            TwoDArray(1, n) = TmpAr(1)
        End If
    Next i

    For i = 1 To n
        Debug.Print TwoDArray(1, i)
    Next i
End Sub

You will get the output as shown below

[Image not available: screenshot of the output]
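The same split-by-line approach can be extended to fill every column. The following is only a sketch under a few assumptions: rows end in vbCrLf, there are no quoted fields with embedded delimiters, and short rows are simply left padded with empty strings. TextTo2DArray is a made-up name for this illustration:

```vba
' Sketch: turn CSV text into a 0-based 2-D String array, padding short rows.
Public Function TextTo2DArray(ByVal strText As String, _
                              Optional ByVal strDelim As String = ",") As Variant
    Dim arrLines() As String, arrFields() As String
    Dim arrOut() As String
    Dim i As Long, j As Long, lngMaxCol As Long

    If Len(strText) = 0 Then Exit Function    ' nothing to split: returns Empty

    arrLines = Split(strText, vbCrLf)

    ' First pass: find the widest row so the array can be sized once
    For i = LBound(arrLines) To UBound(arrLines)
        arrFields = Split(arrLines(i), strDelim)
        If UBound(arrFields) > lngMaxCol Then lngMaxCol = UBound(arrFields)
    Next i

    ReDim arrOut(0 To UBound(arrLines), 0 To lngMaxCol)

    ' Second pass: copy only the fields each row actually has
    For i = LBound(arrLines) To UBound(arrLines)
        arrFields = Split(arrLines(i), strDelim)
        For j = 0 To UBound(arrFields)
            arrOut(i, j) = arrFields(j)
        Next j
    Next i

    TextTo2DArray = arrOut
End Function
```

Sizing the array in a first pass avoids ReDim Preserve inside the loop and removes the need to check each row's length before reading a given column. A file that ends with a line break produces one extra, empty last row.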

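BTW, I am curious: since you are doing this in Excel, why not use the built-in Workbooks.Open or QueryTables method and then read the range into a 2-D array? That would be much simpler...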
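A rough sketch of the Workbooks.Open route, assuming the CSV opens cleanly as a workbook; the function name CsvToArrayViaWorkbook is invented for this illustration:

```vba
' Sketch: open the CSV as a workbook, read it into an array, close it again.
Public Function CsvToArrayViaWorkbook(ByVal strPath As String) As Variant
    Dim wb As Workbook

    Application.ScreenUpdating = False
    Set wb = Workbooks.Open(Filename:=strPath, ReadOnly:=True)

    ' .Value on a multi-cell range returns a 1-based 2-D Variant array;
    ' a file holding a single value would return that value, not an array
    CsvToArrayViaWorkbook = wb.Worksheets(1).UsedRange.Value

    wb.Close SaveChanges:=False
    Application.ScreenUpdating = True
End Function
```

Excel parses the file itself here, so it applies its own type conversions (dates, numbers, leading zeros), which may or may not be what you want.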

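Answered 2012-09-04T08:39:00.050

OK, after looking into this, the solution I arrived at uses ADODB (this requires a reference to ActiveX Data Objects). It loads the CSV file into an array without looping over rows and columns. It does require the data to be in good condition.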

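Sub LoadCSVtoArray()

Dim strPath As String, strcon As String, strSQL As String
Dim cn As Object             ' ADODB.Connection, created late-bound
Dim rs As ADODB.Recordset    ' needs the ActiveX Data Objects reference
Dim rsARR() As Variant

' The folder is the "database"; each CSV file in it acts as a table
strPath = ThisWorkbook.Path & "\"

Set cn = CreateObject("ADODB.Connection")
strcon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
cn.Open strcon
strSQL = "SELECT * FROM SAMPLE.csv;"

Set rs = cn.Execute(strSQL)
' GetRows returns (field, record); Transpose flips it to (record, field)
rsARR = WorksheetFunction.Transpose(rs.GetRows)
rs.Close
cn.Close
Set cn = Nothing

' Write the array back to the sheet to check the result
[a1].Resize(UBound(rsARR, 1), UBound(rsARR, 2)) = rsARR

End Sub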
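Answered 2012-09-14T15:05:55.930

To get a CSV data file of known format into a 2-D array, I finally adopted the following method, which seems to work well and is quite fast. I decided that file reads are fairly fast nowadays, so I make a first pass over the CSV file to get the size needed for both dimensions of the array. With the array suitably dimensioned, it is then a simple task to re-read the file line by line and populate the array.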

Function ImportTestData(ByRef srcFile As String, _
                        ByRef dataArr As Variant) _
                        As Boolean

Dim FSO As FileSystemObject, Fo As TextStream
Dim line As String, Arr As Variant
Dim lc As Long, cc As Long
Dim i As Long, j As Long

ImportTestData = False
Set FSO = CreateObject("Scripting.FilesystemObject")
Set Fo = FSO.OpenTextFile(srcFile)

' First pass; read the file to get array size
lc = 0 ' Counter for number of lines in the file
cc = 0 ' Counter for number of columns in the file
While Not Fo.AtEndOfStream  ' Read the csv file line by line
    line = Fo.ReadLine
    If lc = 0 Then ' Count commas to get array's 2nd dim index
        cc = 1 + Len(line) - Len(Replace(line, ",", ""))
    End If
    lc = lc + 1
Wend
Fo.Close

' Set array dimensions to accept file contents
ReDim dataArr(0 To lc - 1, 0 To cc - 1)
'Debug.Print "CSV has "; lc; " rows with "; cc; " fields/row"
If lc > 1 And cc > 1 Then
    ImportTestData = True
End If

' Second pass; Re-open data file and copy to array
Set Fo = FSO.OpenTextFile(srcFile)
lc = 0
While Not Fo.AtEndOfStream
    line = Fo.ReadLine
    Arr = Split(line, ",")
    For i = 0 To UBound(Arr)
        dataArr(lc, i) = Arr(i)
    Next i
    lc = lc + 1
Wend
Fo.Close

End Function   'ImportTestData()

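I created this as a Function rather than a Sub so that it can give a simple return value if needed. Reading a file with 8,500 rows of 20 columns takes approximately 180 ms.
This method assumes that the structure of the CSV file (the number of delimiters) is the same for every row, which is typical of data-logging applications.

Answered 2019-05-20T15:20:38.687

Alternatively, you can use code like this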

Dim line As String, Arr
Dim FSO As Object, Fo As Object
Set FSO = CreateObject("Scripting.FileSystemObject")
Set Fo = FSO.OpenTextFile("csvfile.csv")
While Not Fo.AtEndOfStream
 line = Fo.ReadLine      ' Read the csv file line by line
 Arr = Split(line, ",")  ' The csv line is loaded into the Arr as an array
 For i = 0 To UBound(Arr) - 1: Debug.Print Arr(i) & " ";: Next
 Debug.Print
Wend

 01/01/2019 1 1 1 36 55.6 0.8 85.3 95 95 109 102 97 6 2.5 2.5 3.9 
 01/01/2019 1 2 0 24 0.0 2.5 72.1 89 0 0 97 95 10 6.7 4.9 3.9 
 01/01/2019 1 3 1 36 26.3 4 80.6 92 92 101 97 97 8 5.5 5.3 3.7 
 01/01/2019 1 4 0 16 30.0 8 79.2 75 74 87 87 86 10 3.8 4 4.2 
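Answered 2017-09-16T13:21:19.617

The following solution does not use ActiveX:

I wrote code to import a csv (actually tab-delimited) file into an array. That code follows.

First, let us declare the array (initially it has no dimensions at all, but it will be sized appropriately later):

Dim TxtFile$()

Now for the subroutine: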

' Fills TxtFile$() array
Sub FillTextFileArray(A$)

'***********************************************************************
' Declarations
'***********************************************************************
Dim I As Long, J As Long
Dim LineString As String
'***********************************************************************

I = -1: J = 0    ' Will hold array dimensions

Open A$ For Input As #1

Do While Not EOF(1)    ' Loop until end of file.
    Line Input #1, LineString
    LineString = LineString + vbTab    ' If not done empty lines give error with Split()
    I = I + 1
    If J < UBound(Split(LineString, vbTab)) Then J = UBound(Split(LineString, vbTab))
Loop

ReDim TxtFile$(1 To I + 4, 1 To J + 4)    ' Not indexed from 0 ! (Plus some room at the end.) This is done to match worksheet format.
Seek #1, 1    ' Reset to start

I = -1    ' Will hold array row index
Do While Not EOF(1)    ' Loop until end of file.
    Line Input #1, LineString
    LineString = LineString + vbTab    ' If not done empty lines give error with Split()
    I = I + 1
    For J = 0 To UBound(Split(LineString, vbTab))
        TxtFile$(I + 1, J + 1) = Split(LineString, vbTab)(J)
    Next J
Loop

Close #1    ' Close file.

' TxtFile$() now holds the contents of the text file

End Sub

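Obviously you can do whatever you like with the TxtFile$ array. A$ is the location and name of the text file. As mentioned, this particular code is for tab-delimited files (vbTab) rather than comma-delimited ones, but adapting it should not be too difficult. It has the advantage of avoiding the complications of ActiveX.

Answered 2019-08-05T21:01:39.247

These days, GitHub hosts at least three CSV parsers that do exactly what the OP asked for - load a CSV file into a VBA array.

I am the author of this one:
https://github.com/PGS62/VBA-CSV

It handles a wide variety of CSV files, including those with "embedded" commas, line feeds and so on, and files with a varying number of fields per line. The README file provides links to alternative VBA CSV parsers.

Answered 2021-09-27T13:56:04.737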