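I have read several other answers about how to export a table to .csv with UTF-8 encoding (no BOM). I found code that almost works for me; see below.

My problem is that the table contains Swedish characters (ÅÄÖ), and when the .csv file is opened they are displayed as garbled characters from what looks like the wrong character set. I found a workaround: open the .csv file in Notepad, save it, and then open it in Excel. The workaround makes Excel display the letters correctly, but I would rather not have the extra step. Can the code below be modified so that the characters are preserved?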

Option Explicit

Sub CSVFileAsUTF8WithoutBOM()
Dim SrcRange As Range
Dim CurrRow As Range
Dim CurrCell As Range
Dim CurrTextStr As String
Dim ListSep As String
Dim FName As Variant
Dim UTFStream As Object
Dim BinaryStream As Object

' ADO Constants
Const adTypeBinary = 1 ' The stream contains binary data
Const adTypeText = 2 ' The stream contains text data (default)
Const adWriteLine = 1 ' write text string and a line separator (as defined by the LineSeparator property) to the stream.
Const adModeReadWrite = 3 ' Read/write
Const adLF = 10 ' Line feed only - default is carriage return line feed (adCRLF)
Const adSaveCreateOverWrite = 2 ' Overwrites the file with the data from the currently open Stream object, if the file already exists

' Open this workbook location
ChDrive Left(ThisWorkbook.Path, 1)
ChDir ThisWorkbook.Path

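' ask for file name and path
  FName = Application.GetSaveAsFilename("", "CSV File (*.csv), *.csv")
  ' GetSaveAsFilename returns False when the dialog is cancelled
  If VarType(FName) = vbBoolean Then Exit Sub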

' prepare UTF-8 stream
  Set UTFStream = CreateObject("adodb.stream")
  UTFStream.Type = adTypeText
  UTFStream.Mode = adModeReadWrite
  UTFStream.Charset = "UTF-8"
  UTFStream.LineSeparator = adLF
  UTFStream.Open

  'set field separator
  ListSep = ";"
  'set source range with data for csv file
  If Selection.Cells.Count > 1 Then
    Set SrcRange = Selection
  Else
    Set SrcRange = ActiveSheet.UsedRange
  End If

  For Each CurrRow In SrcRange.Rows
    CurrTextStr = ""
    For Each CurrCell In CurrRow.Cells
      CurrTextStr = CurrTextStr & Replace(CurrCell.Value, """", """""") & ListSep
    Next
    'remove ListSep after the last value in line
    While Right(CurrTextStr, 1) = ListSep
      CurrTextStr = Left(CurrTextStr, Len(CurrTextStr) - 1)
    Wend
    'add line to UTFStream
    UTFStream.WriteText CurrTextStr, adWriteLine ' Writes character data to a text Stream object
  Next

  'skip BOM
  UTFStream.Position = 3 ' sets or returns a long value that indicates the current position (in bytes) from the beginning of a Stream object

  'copy UTFStream to BinaryStream
  Set BinaryStream = CreateObject("adodb.stream")
  BinaryStream.Type = adTypeBinary
  BinaryStream.Mode = adModeReadWrite
  BinaryStream.Open ' Opens a Stream object

  'Strips BOM (first 3 bytes)
  UTFStream.CopyTo BinaryStream ' Copies a specified number of characters/bytes from one Stream object into another Stream object

  UTFStream.Flush ' Sends the contents of the Stream buffer to the associated underlying object
  UTFStream.Close ' Closes a Stream object

  'save to file
  BinaryStream.SaveToFile FName, adSaveCreateOverWrite
  BinaryStream.Flush ' Sends the contents of the Stream buffer to the associated underlying object
  BinaryStream.Close ' Closes a Stream object

End Sub

1 Answer

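Edit:

Based on your comment, I realize that what you originally wanted was to keep the character-encoding information in the file without a BOM.

The problem with that (as you have realized) is that the BOM is normally what carries the character-encoding information, and putting that information anywhere else in the file doesn't really make sense.

So your code is actually well suited to the task at hand. What needs to change is how the software you are using imports/opens the CSV file.

When a file has no BOM, the software reading it has to guess the character encoding.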
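
To make the role of the BOM concrete, here is a minimal sketch of a check for the three UTF-8 BOM bytes (EF BB BF) at the start of a file. The function name HasUtf8Bom is hypothetical and not part of the original code; the sketch assumes the file exists and can be opened for reading.

```vba
' Hypothetical helper (not from the original post): returns True if the
' file starts with the UTF-8 byte order mark EF BB BF.
' Assumes FullFileName points to an existing, readable file.
Function HasUtf8Bom(ByVal FullFileName As String) As Boolean
    Dim FileNum As Integer
    Dim Header(0 To 2) As Byte

    FileNum = FreeFile
    Open FullFileName For Binary Access Read As #FileNum

    ' Only read the header if the file holds at least three bytes
    If LOF(FileNum) >= 3 Then
        Get #FileNum, 1, Header
        HasUtf8Bom = (Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF)
    End If

    Close #FileNum
End Function
```

A file written by the export code in the question should make this return False, which is the case where the reading software has nothing to go on except the bytes themselves.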

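In general, if the software you are using does not support a BOM and cannot guess correctly, there should at least be a way to customize the import/open command so that you can specify the character encoding (which it looks like you actually found).

Original answer:

For some reason, Excel has a hard time guessing the character encoding when you open a UTF-8 encoded CSV file by double-clicking it. You have to help it a little...

If Excel cannot identify the encoding on its own, then instead of opening the CSV directly you can load its contents into a new workbook with the (legacy) Text Import Wizard and select the UTF-8 character set (65001) during import.

If you record a macro while doing this and turn it into a Sub procedure, it might look like the following: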

Sub OpenCSV(FullFileName As String)

    Dim wb As Workbook
    Set wb = Workbooks.Add
    
    Dim ws As Worksheet
    Set ws = wb.Sheets(1)

    ' Qualify the destination with ws so the import does not depend on the active sheet
    With ws.QueryTables.Add(Connection:= _
        "TEXT;" & FullFileName, Destination:=ws.Range( _
        "$A$1"))
        .Name = "CSV_Open"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .TextFilePromptOnRefresh = False
        .TextFilePlatform = 65001
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileConsecutiveDelimiter = False
        ' The export code in the question writes ";" as the field separator
        .TextFileTabDelimiter = False
        .TextFileSemicolonDelimiter = True
        .TextFileCommaDelimiter = False
        .TextFileSpaceDelimiter = False
        .TextFileColumnDataTypes = Array(1)
        .TextFileTrailingMinusNumbers = True
        .Refresh BackgroundQuery:=False
    End With
    
End Sub
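
As a usage sketch, the procedure above could be called from a small wrapper that asks for the file first. The name OpenCSVFromDialog is hypothetical, and the sketch assumes OpenCSV is in a standard module of the same VBA project.

```vba
' Hypothetical wrapper (not from the original post): lets the user pick a
' CSV file and passes its path to OpenCSV.
Sub OpenCSVFromDialog()
    Dim FName As Variant

    FName = Application.GetOpenFilename("CSV File (*.csv), *.csv")

    ' GetOpenFilename returns False when the dialog is cancelled
    If VarType(FName) = vbBoolean Then Exit Sub

    OpenCSV CStr(FName)
End Sub
```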

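Other suggestions

If you really want to be able to double-click the file instead of using the Text Import Wizard or running a macro, you could create a VBA event procedure, in an add-in or in PERSONAL.XLSB, that runs every time a workbook is opened.

If it detects that the file just opened is a CSV file, it can close it and "reopen" it using the code above.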
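
A rough sketch of that idea, under a few assumptions: the code lives in the ThisWorkbook module of PERSONAL.XLSB (or an add-in), the OpenCSV procedure is in a standard module of the same project, and the variable name App is only illustrative. Closing a workbook from inside its own open event may not behave the same way in every Excel version, so treat this as a starting point rather than a tested solution.

```vba
' Place in the ThisWorkbook module of PERSONAL.XLSB or an add-in.
' App is an illustrative name for the application-level event source.
Private WithEvents App As Application

Private Sub Workbook_Open()
    ' Start listening to application-level events when this workbook loads
    Set App = Application
End Sub

Private Sub App_WorkbookOpen(ByVal Wb As Workbook)
    Dim FullFileName As String

    ' Only react to files whose name ends in .csv
    If LCase$(Right$(Wb.Name, 4)) = ".csv" Then
        FullFileName = Wb.FullName
        Wb.Close SaveChanges:=False

        ' Re-import the same file as UTF-8 into a new workbook
        OpenCSV FullFileName
    End If
End Sub
```

Because OpenCSV creates its workbook with Workbooks.Add and loads the data through a query table, it should not raise WorkbookOpen again, so the handler is not expected to call itself recursively.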

Bonus: interestingly, there is a question here about how to change the default character encoding used by Excel.

Answered 2020-08-03T18:23:05.180