这是一个在 A1 和 B1 中查找 Text1 和 Text2 的宏。然后,它允许您选择要处理的文件并解析从 text1 到 text2 的子字符串。最后,它将它们拆分为不超过 1024 个字符的块(确保每个块以空格结尾,以免拆分单词),并将它们写入从 A2 开始的 A 列中的一系列行。
子字符串的解析以及将它们分解为 1024 个字符块都是使用正则表达式完成的。“工作”是在 VBA 数组中完成的,因为这比来回工作表要快。
由于字符串变量的长度可能约为 2^31 个字符,我怀疑您将整个文档读入单个变量然后处理它会有任何问题,而不是逐行处理。
由于宏有参数,您需要从另一个宏中调用它;或者更改代码以允许 text1 和 text2 的不同输入方法对您来说应该是微不足道的。
没有错误检查。
如果您不想在结果中包含 Text1 和 Text2,则只需对正则表达式模式进行细微更改即可。
我使用早期绑定以便在编写宏时利用“提示”。这需要按照宏中的说明设置引用。但是,如果您愿意,将其更改为后期绑定应该很简单。
您还可以考虑进行修改,以便多行块以某种方式与单行块区分开来。
享受
Option Explicit
'Set Reference to Microsoft Scripting Runtime
'Set Reference ot Microsoft VBScript Regular Expressions 5.5
Sub ExtractPhrases(Text1 As String, Text2 As String)
Dim FSO As FileSystemObject
Dim TS As TextStream
Dim FN As File, sFN As String
Dim RE As RegExp, MC As MatchCollection, M As Match
Dim RE2 As RegExp, MC2 As MatchCollection, M2 As Match
Dim sPat As String
Dim S As String, sTemp As String
Dim V() As Variant, vRes() As Variant
Dim I As Long, J As Long, K As Long
Dim C As Range
Dim rRes As Range
'Get File path
With Application.FileDialog(msoFileDialogFilePicker)
.AllowMultiSelect = False
.ButtonName = "Process File"
.Filters.Add "Text", "*.txt", 1
.FilterIndex = 1
.InitialView = msoFileDialogViewDetails
If .Show = -1 Then sFN = .SelectedItems(1)
End With
'Read File into String variable
Set FSO = New FileSystemObject
Set TS = FSO.OpenTextFile(FileName:=sFN, IOMode:=ForReading, Create:=False)
S = TS.ReadAll
'Get results
Set RE = New RegExp
Set RE2 = New RegExp
With RE2
.Global = True
.MultiLine = False
.Pattern = "(\S[\s\S]{1,1023})(?:\s+|$)"
End With
With RE
.Global = True
.IgnoreCase = True
.Pattern = "\b" & Text1 & "\b([\s\S]+?)\b" & Text2 & "\b"
If .Test(S) = True Then
ReDim vRes(0)
Set MC = RE.Execute(S)
For I = 1 To MC.Count
Set MC2 = RE2.Execute(MC(I - 1))
ReDim V(1 To MC2.Count)
For J = 1 To MC2.Count
V(J) = MC2(J - 1).SubMatches(0)
Next J
ReDim Preserve vRes(UBound(vRes) + J - 1)
For J = 1 To MC2.Count
K = K + 1
vRes(K) = V(J)
Next J
Next I
End If
End With
vRes(0) = "Phrases"
'transpose vRes
ReDim V(1 To UBound(vRes) + 1, 1 To 1)
For I = 0 To UBound(vRes)
V(I + 1, 1) = vRes(I)
Next I
Set rRes = Range("a2").Resize(rowsize:=UBound(V))
Range(rRes(1), Cells(Rows.Count, rRes.Column)).Clear
rRes = V
End Sub