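I am trying to write an Excel VBA script that extracts some information (version and revision date) from a binary FrameMaker file (*.fm).

The following Sub opens the *.fm file and reads the first 25 lines (the required information is within the first 25 lines) into a variable.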

Sub fetchDate()
    Dim fso As Object
    Dim fmFile As Object

    Dim fileString As String
    Dim fileName As String
    Dim matchPattern As String
    Dim result As String
    Dim i As Integer
    Dim bufferString As String

    Set fso = CreateObject("Scripting.FileSystemObject")

    fileName = "C:\FrameMaker-file.fm"

    Set fmFile = fso.OpenTextFile(fileName, ForReading, False, TristateFalse)
    matchPattern = "Version - Date.+?(\d{1,2})[\s\S]*Rev.+?(\d{1,2})"

    fileString = ""
    i = 1
    Do While i <= 25
        bufferString = fmFile.ReadLine
        fileString = fileString & bufferString & vbNewLine
        i = i + 1
    Loop
    fmFile.Close

    'fileString = Replace(fileString, matchPattern, "")
    result = regExSearch(fileString, matchPattern)

    MsgBox result

    Set fso = Nothing
    Set fmFile = Nothing
End Sub

The regex function looks like this:

Function regExSearch(ByVal strInput As String, ByVal strPattern As String) As String
    Dim regEx As New RegExp

    Dim strReplace As String
    Dim result As String
    Dim match As Variant
    Dim matches As Variant
    Dim subMatch As Variant

    Set regEx = CreateObject("VBScript.RegExp")

    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            Set matches = regEx.Execute(strPattern)

            For Each match In matches
                If match.SubMatches.Count > 0 Then
                    For Each subMatch In match.SubMatches
                        Debug.Print "match:" & subMatch
                    Next subMatch
                End If
            Next match

            regExSearch = result
        Else
            regExSearch = "no match"
        End If
    End If

    Set regEx = Nothing
End Function

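Problem 1:

The content of the binary *.fm file stored in the variable "fileString" differs on every run, even though the *.fm file itself is unchanged.

Here are a few examples of the first three lines stored in "fileString" from different runs:

Example 1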

<MakerFile 12.0>


Aaÿ No.009.xxx  ????          /tEXt     ??????

Example 2

<MakerFile 12.0>


Aaÿ  `      ? ????          /tEXt ?     c ? E     ? ????a A ? ?      ? ? ? d??????? ?        Heading ????????????A???????A

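As you can see, example 1 differs from example 2, even though the VBA code and the *.fm file are the same.

Problem 2:

Another big problem is that the regex search string from "matchPattern" is randomly written into my "fileString". Here is a screenshot of the debug console:

[Screenshot: parts of the matchPattern value]

How can this happen? Any suggestions or ideas on how to solve this?

I am using:

Microsoft Office Professional Plus 2010

VBA reference for regular expressions: Microsoft VBScript Regular Expressions 5.5

Thanks a lot!

Regards, Andy

/EDIT 12 March 2018:

Here is a sample *.fm file: sample file. If you open it with Notepad, you can see some information such as the text "Version - DateVersion 4 – 2018/Feb/07" and "Rev02 - 2018/Feb/21". I want to extract this information with a regex.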

2 Answers

I found a solution using ADODB.Stream. This works fine:

Sub test_binary()
    Dim buffer As String
    Dim filename As String
    Dim matchPattern As String
    Dim result As String

    filename = "C:\test.fm"

    ' Read the start of the file as UTF-8 text instead of line by line.
    With CreateObject("ADODB.Stream")
        .Open
        .Type = 2                   ' 2 = adTypeText
        .Charset = "utf-8"
        .LoadFromFile filename
        buffer = .ReadText(10000)   ' first 10000 characters
        .Close
    End With

    matchPattern = "Version - Date.+?(\d{1,2})[\s\S]*Rev.+?(\d{1,2})"

    ' regExSearch returns the submatches joined by "||", or "err_nomatch".
    result = regExSearch(buffer, matchPattern)

    MsgBox result
End Sub

The regex function:

Function regExSearch(ByVal strInput As String, ByVal strPattern As String) As String
    Dim regEx As New RegExp

    Dim result As String
    Dim match As Variant
    Dim matches As Variant
    Dim subMatch As Variant

    Set regEx = CreateObject("VBScript.RegExp")

    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            Set matches = regEx.Execute(strInput)

            result = ""
            For Each match In matches
                If match.SubMatches.Count > 0 Then
                    For Each subMatch In match.SubMatches
                        If Len(result) > 0 Then
                            result = result & "||"
                        End If
                        result = result & subMatch
                    Next subMatch
                End If
            Next match

            regExSearch = result
        Else
            regExSearch = "err_nomatch"
        End If
    End If

    Set regEx = Nothing
End Function

It is important to open the *.fm file as a text file (.Type = 2) and to set the charset to "utf-8". Otherwise I do not get plain text for my regex to read.
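
As a minimal sketch of the same idea, the stream-reading step could be wrapped in a small helper so that the text mode and the charset are set in one place. The function name ReadFileAsText and its maxChars parameter are hypothetical and not part of the code above; the constant merely gives the literal 2 a name.

' Hypothetical helper: returns up to maxChars characters of a file,
' decoded as UTF-8 text. maxChars = -1 (adReadAll) reads everything.
Function ReadFileAsText(ByVal path As String, Optional ByVal maxChars As Long = -1) As String
    Const adTypeText As Long = 2    ' same value as .Type = 2 above

    With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeText
        .Charset = "utf-8"
        .LoadFromFile path
        ReadFileAsText = .ReadText(maxChars)
        .Close
    End With
End Function

With such a helper, the With block in test_binary would shrink to buffer = ReadFileAsText(filename, 10000).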

Thanks a lot for putting me on the right track!

answered 2018-03-12T10:02:55.150

Just save the FM file as MIF. It is a text encoding of the FM file and can be converted back and forth without losing information.
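
A rough sketch of how the MIF route might look from VBA, under several assumptions: the file has already been saved from FrameMaker as C:\test.mif (a hypothetical path), it is encoded as UTF-8, and the text of interest sits in MIF String statements of the form <String `...'>. One piece of visible text may be split across several String statements, so the output may need to be joined before searching it.

Sub listMifStrings()
    Const adTypeText As Long = 2
    Dim mifText As String
    Dim re As Object
    Dim m As Object

    ' Read the whole MIF file as text (UTF-8 is an assumption here).
    With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeText
        .Charset = "utf-8"
        .LoadFromFile "C:\test.mif"    ' hypothetical path
        mifText = .ReadText
        .Close
    End With

    ' Print the content of every <String `...'> statement.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "<String `([^']*)'>"

    For Each m In re.Execute(mifText)
        Debug.Print m.SubMatches(0)
    Next m
End Sub

The printed strings can then be checked for the "Version - Date" and "Rev" text with a much simpler pattern than the one needed for the binary file.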

answered 2019-03-27T09:29:48.883