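This question is similar to another one I asked here, "Match strings between delimiting characters", but I could not adapt that solution to this new task. (The solution should work in EmEditor or Notepad++.)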

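I need to match the text between specific tags, i.e. <b class="b2">I have a lot of text, more text, some more text, text</b>, and then:

  1. Convert only the first character after the opening tag to lowercase (except for the pronoun "I").
  2. Convert each comma-separated item into a wiki link (and remove the tags).

I have tried running several regular expressions to get close to this in multiple steps, namely: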

(<b class="b2">)(.)
[[\L\2

</b>
]]

(\[\[)(\w+), (\w+)(\]\])
\1\2]], [[\3\4

Input text:

Any text <b class="b2">I make laugh</b>: Ar. and P. γέλωτα. Some more text <b class="b2">Delight</b>: P. and V. [[τέρπω]].
Any text <b class="b2">I amuse oneself, pass the time</b>: P. διάγειν.
Any text <b class="b2">It amuses oneself with, pass the time over, amuse</b>: Ar. and P.

Expected output:

Any text [[I make laugh]]: Ar. and P. γέλωτα. Some more text [[delight]]: P. and V. [[τέρπω]].
Any text [[I amuse oneself]], [[pass the time]]: P. διάγειν.
Any text [[it amuses oneself with]], [[pass the time over]], [[amuse]]: Ar. and P.
2 Answers

Here is a one-step solution:

  • Ctrl+H
  • Find what: (?:<b class="b2">|\G(, (?=.*</b>)))(I )?([^,<]+)(?:</b>)?
  • Replace with: $1[[$2\l$3]]
  • Check "Wrap around"
  • Check "Regular expression"
  • Uncheck ". matches newline"
  • Replace all

Explanation:

(?:                 # non capture group
    <b class="b2">  # literally
  |                 # OR
    \G              # restart from last match position
    (               # group 1, a comma and a space
      ,             # a comma and a space
    (?=.*</b>)      # positive look ahead, make sure we have a closing tag after
    )               # end group 1
)                   # end group
(I )?               # group 2, UPPER I and a space, optional
([^,<]+)            # group 3, 1 or more any character that is not comma or less than
(?:</b>)?           # optional end tag

Replacement:

$1          # content of group 1 (i.e. comma & space, empty for the first item)
[[          # double opening square bracket
$2          # content of group 2 (i.e. "I ", if present)
\l$3        # group 3 (all characters up to the next comma or the end tag) with its first letter lowercased
]]          # double closing square bracket

Result for the given example:

Any text [[I make laugh]]: Ar. and P. γέλωτα. Some more text [[delight]]: P. and V. [[τέρπω]].
Any text [[I amuse oneself]], [[pass the time]]: P. διάγειν.
Any text [[it amuses oneself with]], [[pass the time over]], [[amuse]]: Ar. and P.
[[be at ease]], v.: P. and V. ἡσυχάζειν, V. ἡσύχως ἔχειν.
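
For readers who want to check the expected output outside the editor, here is a minimal Python sketch of the same transformation. It is not a port of the regex above: Python's `re` module supports neither `\G` nor `\l`, so the sketch matches each whole `<b class="b2">…</b>` element and builds the links in a callback instead. The names `TAGGED`, `link_item` and `to_wikilinks` are made up for illustration, and the sketch assumes items are separated by a comma followed by one space.

```python
import re

# Hypothetical helper names; this only approximates the editor regex above.
TAGGED = re.compile(r'<b class="b2">([^<]*)</b>')

def link_item(item):
    # Keep a leading pronoun "I " untouched, like the optional (I )? group,
    # and lowercase only the first character of what follows.
    prefix = "I " if item.startswith("I ") else ""
    rest = item[len(prefix):]
    return "[[" + prefix + rest[:1].lower() + rest[1:] + "]]"

def to_wikilinks(text):
    # Assumption: items inside the tag are separated by ", ".
    def replace(match):
        return ", ".join(link_item(part) for part in match.group(1).split(", "))
    return TAGGED.sub(replace, text)

line = 'Any text <b class="b2">It amuses oneself with, pass the time over, amuse</b>: Ar. and P.'
print(to_wikilinks(line))
```

Like the editor regex, this lowercases the first letter of every item, not only the one right after the opening tag; in the sample data the later items are already lowercase, so the result is the same.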

Answered 2019-09-25T11:56:32.067
You should do it in several steps.


Replace

<b class="b2">([^<]*)</b>

with

[[\1]]

to convert the <b> tags into wiki links.


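Replace

(\[\[[^,\[\]]*?)(\s*),(\s*)

with

\1]], [[

to split the linked text at each comma into separate wiki links. Each pass only splits one comma per link, so you may need to run it several times until all commas are replaced.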


Replace

\[\[([A-Z])

with

[[\l\1

Make sure "Match case" is checked in Notepad++.

This lowercases any uppercase letter that directly follows [[.


Replace

\[\[i(\s)

with

[[I\1

to restore the capitalized pronoun "I" at the start of a link.
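
The four steps can also be tried outside the editor. The following is a rough Python sketch under a few assumptions: the function name `multi_step` is made up, the comma step is repeated in a loop until the text stops changing (the "run it several times" part), and because Python's `re` has no `\l` in replacement strings, the lowercasing step uses a callback instead.

```python
import re

def multi_step(text):
    # Step 1: turn each <b class="b2">...</b> element into a wiki link.
    text = re.sub(r'<b class="b2">([^<]*)</b>', r'[[\1]]', text)

    # Step 2: split links at commas. One pass splits only one comma per
    # link, so repeat until a pass makes no further change.
    comma = re.compile(r'(\[\[[^,\[\]]*?)(\s*),(\s*)')
    while True:
        new_text = comma.sub(r'\1]], [[', text)
        if new_text == text:
            break
        text = new_text

    # Step 3: lowercase an uppercase ASCII letter right after "[[".
    # (Stands in for the editor's \l, which Python's re lacks.)
    text = re.sub(r'\[\[([A-Z])', lambda m: '[[' + m.group(1).lower(), text)

    # Step 4: restore the pronoun "I" when it is a word on its own.
    text = re.sub(r'\[\[i(\s)', r'[[I\1', text)
    return text

print(multi_step('Any text <b class="b2">I amuse oneself, pass the time</b>: P. διάγειν.'))
```

Note that steps 2 to 4 act on every `[[...]]` link in the text, including links that were already present before step 1, so existing links containing commas or starting with an ASCII capital would be changed as well.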

Answered 2019-09-25T08:53:27.070