c# - I need to modify a Word MERGEFIELD regular expression

Question

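I am using this library to implement Word document mail merge in my application: http://www.codeproject.com/Articles/38575/Fill-Mergefields-in-docx-Documents-without-Microso

It works well, but I have heavily refactored the code and done other work to integrate it with my own application.

The library uses this regular expression to capture Word mail merge fields: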

private static readonly Regex _instructionRegEx = new Regex(
    @"^[\s]*MERGEFIELD[\s]+(?<name>[#\w]*){1}             # This retrieves the field's name (Named Capture Group -> name)
       [\s]*(\\\*[\s]+(?<Format>[\w]*){1})?               # Retrieves field's format flag (Named Capture Group -> Format)
       [\s]*(\\b[\s]+[""]?(?<PreText>[^\\]*){1})?         # Retrieves text to display before field data (Named Capture Group -> PreText)
       [\s]*(\\f[\s]+[""]?(?<PostText>[^\\]*){1})?        # Retrieves text to display after field data (Named Capture Group -> PostText)",
    RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline
);

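This captures examples such as MERGEFIELD FieldNameGoesHere, but I have run into cases where the field name is wrapped in double quotes, such as MERGEFIELD "FieldNameGoesHere", and the regular expression does not capture those.

As you can see, the regex is fairly hardcore, and modifying it to handle double quotes while still accepting unquoted MERGEFIELDs is beyond my current regex-fu.

Clearly the first line needs to change, but I am not sure exactly how to change it.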

score 1 · Accepted Answer

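Update: moved the double quotes outside the named group.

In your first line, replace (?<name>[#\w]*) with "?(?<name>[#\w]*)"? so that each "? makes the regex look for an optional double quote. Because the pattern lives in a C# verbatim string (@"..."), each literal double quote has to be written as "" there.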
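As a rough illustration of that change, here is a minimal sketch that uses only the first line of the question's pattern. The class and method names (MergeFieldNameDemo, TryExtractName) are made up for this example and are not part of the library. The closing optional quote is placed after the existing {1} so that {1} still applies to the named group.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the CodeProject library.
public static class MergeFieldNameDemo
{
    // First line of the question's pattern with an optional quote on each side
    // of the named group. In a verbatim string, "" is one literal double quote.
    private static readonly Regex NameRegex = new Regex(
        @"^[\s]*MERGEFIELD[\s]+""?(?<name>[#\w]*){1}""?",
        RegexOptions.CultureInvariant | RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase);

    // Returns true and the captured field name when the instruction matches.
    public static bool TryExtractName(string instruction, out string name)
    {
        Match match = NameRegex.Match(instruction);
        name = match.Success ? match.Groups["name"].Value : string.Empty;
        return match.Success;
    }
}
```

With this sketch, both MERGEFIELD FieldNameGoesHere and MERGEFIELD "FieldNameGoesHere" should yield FieldNameGoesHere as the name. The remaining three lines of the original pattern (format flag, PreText, PostText) are left out here and would follow unchanged.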

score 0 · Accepted Answer

^[\s]*MERGEFIELD[\s]+"?(?<name>[#\w]*){1}"?

This does not work if the field name contains spaces: MERGEFIELD "My Field Name".

You can use:

MERGEFIELD\s+"(.*?)"

or

MERGEFIELD\s+([#\w]+)
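
One way to use the two patterns together is to try the quoted form first and fall back to the unquoted one. The following is only a sketch under that assumption; the names MergeFieldNameFallback and TryExtractName are hypothetical, and RegexOptions.IgnoreCase is carried over from the question's original options rather than from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the CodeProject library.
public static class MergeFieldNameFallback
{
    // Quoted form: anything between the first pair of double quotes (may contain spaces).
    private static readonly Regex QuotedName =
        new Regex(@"MERGEFIELD\s+""(.*?)""", RegexOptions.IgnoreCase);

    // Unquoted form: a run of word characters or '#'.
    private static readonly Regex BareName =
        new Regex(@"MERGEFIELD\s+([#\w]+)", RegexOptions.IgnoreCase);

    // Tries the quoted pattern first, then the unquoted one.
    public static bool TryExtractName(string instruction, out string name)
    {
        Match quoted = QuotedName.Match(instruction);
        if (quoted.Success)
        {
            name = quoted.Groups[1].Value;
            return true;
        }

        Match bare = BareName.Match(instruction);
        name = bare.Success ? bare.Groups[1].Value : string.Empty;
        return bare.Success;
    }
}
```

Trying the quoted pattern first matters because the unquoted pattern cannot match a name that starts with a double quote, while the quoted pattern only matches when a quote directly follows MERGEFIELD and its whitespace.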
