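I have a data file that I am trying to validate using the VBScript regular expression object.
Data:
01THAILAND 000004
08 000004 05
Regex pattern: ^01.{15}[0-9]{6}|^08 [0-9]{6} [0-9]{2}.
How can I set up my pattern so that the line starting with 08 is treated as valid only if it carries the same code ("000004") as the first line? There are other data lines between these two, and the code will not always be '000004'! The only things that stay fixed are the 2-character line identifier and the format.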
Pure regexes won't cut it, but that's probably not what you're using anyway.
This feature is commonly called a "backreference": it lets the pattern refer to text that an earlier capturing group has already matched. The usual syntax, inherited from sed, is \1 to reference the first capturing group of the regex.
So in your example it'd look something like:
^01.{15}([0-9]{6})
.*
^08 \1 [0-9]{2}.
Do note that you're not matching for single lines any more, but for the whole group. (To match single lines, you'd really need to remember the original code and include it explicitly in your terminating regex.) So you'll need to make sure your regex engine is capable of multiline matching.
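The single-line alternative mentioned in the parentheses above can be sketched roughly as follows: remember the code from the 01 line, then build the 08 pattern around it. This is only a sketch under assumptions: the line "02 OTHER DATA" is made up to stand in for the intervening lines, the data is held in an array instead of being read from a file, and both patterns are simplified versions of the one in the question.

```vbscript
Option Explicit

' Sample data; "02 OTHER DATA" is a hypothetical intervening line.
Dim lines
lines = Array("01THAILAND 000004", "02 OTHER DATA", "08 000004 05")

Dim reHeader, reDetail, matches, code, line
Set reHeader = New RegExp
' Simplified header pattern (assumption): capture the trailing 6-digit code.
reHeader.Pattern = "^01.*?([0-9]{6})$"

Set reDetail = New RegExp
code = ""

For Each line In lines
    If Left(line, 2) = "01" Then
        code = ""                                ' forget any previous code
        Set matches = reHeader.Execute(line)
        If matches.Count > 0 Then
            code = matches(0).SubMatches(0)      ' remember the code
            ' Rebuild the 08 pattern around the remembered code.
            reDetail.Pattern = "^08 " & code & " [0-9]{2}$"
        End If
    ElseIf Left(line, 2) = "08" Then
        If code <> "" And reDetail.Test(line) Then
            WScript.Echo "valid:   " & line
        Else
            WScript.Echo "invalid: " & line
        End If
    End If
Next
```

Because the code consists only of digits, it can be concatenated into the pattern without escaping. The script is meant to be run with cscript, since it uses WScript.Echo.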
You can use \n as a backreference, where n is the index of the captured group. Demo:
str = "01THAILAND 000004" & vbNewLine & "08 000004 05"
Set re = new regexp
re.Pattern = "\d+\w+ +(\d+)\s+\d+ \1 \d+" ' \1 is the back reference
re.Global = true
msgbox re.Test(str)
Ninja edit: your pattern would be something like ^01.{15}([0-9]{6})\s{1,2}08 \1 [0-9]{2}
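Since the question says other data lines sit between the 01 and 08 lines, a variation that tolerates them might look like the sketch below. The "02 OTHER DATA" line is invented for illustration, and the header part of the pattern is simplified compared with the question's .{15} layout. Multiline is a standard property of the VBScript RegExp object; [\s\S] is used to skip over text because . does not match a newline.

```vbscript
Option Explicit

Dim text, re
' "02 OTHER DATA" is a hypothetical line standing in for the other records.
text = "01THAILAND 000004" & vbNewLine & _
       "02 OTHER DATA" & vbNewLine & _
       "08 000004 05"

Set re = New RegExp
re.Multiline = True   ' let ^ and $ match at the start and end of every line
' ([0-9]{6}) captures the code at the end of the 01 line,
' [\s\S]*? lazily skips any intervening text, including line breaks,
' and \1 requires the 08 line to carry the same code.
re.Pattern = "^01.*?([0-9]{6})\s*$[\s\S]*?^08 \1 [0-9]{2}"
MsgBox re.Test(text)
```

Changing the code on the 08 line to a different value is a quick way to check that the backreference is doing the work.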