regex - Regular expression to find the innermost string between two delimiters

Question

I am using TextCrawler *regexp* to align an existing plain-text file. The text inside the file is continuous, with no line breaks.
....more data....

,List of Actors:

Amy Brenneman, Aaron Eckhart, Catherine Keener, Natassja Kinski
, Jason Patric, Ben Stiller,

Movies Released:

Gladiator,Matrix Reloaded,The Shawshank Redemption,Pirates of the Caribbean 
- Curse of the Black Pearl,Monsters Inc,

Genres:

SciFi,Romance,Drama,Action,Comedy,Advenure,Animated,Western,Horror

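....more data....

I am trying to find each string between a comma and a colon and replace it with the same string, preceded by a new line. I tried the following, but it matches from the outermost comma all the way to the colon.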

[,]{1}.[A-Z].*[:]

Any ideas on this? Where am I going wrong?

score 1 · Accepted Answer

Why not use this pattern:

search:   (?<=,)[^,:]+(?=:)
replace:  \n$0

Pattern details:

(?<=,)  # lookbehind assertion: only a check that means "preceded by ,"
[^,:]+  # negated char class: all characters except , and :
(?=:)   # lookahead assertion: only a check that means "followed by :"

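Lookarounds are only tests that can make the pattern fail or succeed; they are not part of the match result. Because the character class excludes both , and :, a match can never run across another delimiter, which is what keeps it to the innermost string.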
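To make the difference concrete, here is a minimal Python sketch (Python's `re` rather than TextCrawler, so treat the engine behavior as an assumption for your tool). The sample `text` and the helper name `break_before_labels` are made up for illustration; only the two patterns come from this thread.

```python
import re

# Hypothetical one-line sample modeled on the question's data.
text = (
    "more data,List of Actors:Amy Brenneman, Aaron Eckhart,"
    "Movies Released:Gladiator,Monsters Inc,"
    "Genres:SciFi,Romance"
)

# The question's attempt: the greedy .* runs on to the LAST colon in the text.
greedy = re.compile(r"[,]{1}.[A-Z].*[:]")

# The lookaround pattern from this answer.
inner = re.compile(r"(?<=,)[^,:]+(?=:)")


def break_before_labels(s: str) -> str:
    """Insert a newline before every label found by the lookaround pattern."""
    # \g<0> is Python's spelling of $0 (the whole match).
    return inner.sub(r"\n\g<0>", s)


if __name__ == "__main__":
    print(repr(greedy.search(text).group(0)))  # one long match spanning several labels
    print(inner.findall(text))                 # only the label names
    print(break_before_labels(text))
```

The greedy attempt swallows everything up to the final colon, while the negated class stops at the first comma or colon it meets, so only the label directly in front of each colon is matched.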

score 1 · Accepted Answer

The pattern below works:

Search pattern: (,?[^:,]+:)
Replacement string: \n\1\n

For example:

Given a file a.txt with the content:

List of Actors:A,B,C,Movies Released:D,E,F,Genres:G,H,I

perl -pe "s@(,?[^:,]+:)@\n\1\n@g" a.txt

The above command produces output in the following format:

List of Actors:
A,B,C
Movies Released:
D,E,F
Genres:
G,H,I

I hope the above output is what you expected.
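One detail worth checking before relying on this: the optional comma sits inside the capture group, so it is copied back into the replacement. The Python sketch below (an illustration, not the answerer's code) runs the pattern as posted and then a variant with the comma moved outside the group. The variant and the variable names are assumptions added here; the input line is ASCII-delimited sample data in the shape of a.txt.

```python
import re

# Sample line in the shape of a.txt, with ASCII commas and colons.
line = "List of Actors:A,B,C,Movies Released:D,E,F,Genres:G,H,I"

# Pattern exactly as posted: the optional comma is part of group 1,
# so it reappears at the start of each label line.
as_posted = re.sub(r"(,?[^:,]+:)", r"\n\1\n", line)

# Variant (an assumption, not from the answer): the comma is matched
# outside the group, so it is consumed and not written back.
comma_dropped = re.sub(r",?([^:,]+:)", r"\n\1\n", line)

if __name__ == "__main__":
    print(as_posted.strip().splitlines())
    print(comma_dropped.strip().splitlines())
```

If the comma-free listing shown above is the goal, the second form is the one that yields it. Both forms also emit a leading newline before the first label, which `strip()` removes here.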
