sed - sed recipe: how to do stuff between two patterns that can be either on one line or on two lines?

Question

Let's say we want to do some substitutions only between some patterns; let them be <a> and </a> for clarity... (all right, all right, they're start and end!.. Jeez!)

So I know what to do if start and end always occur on the same line: just design a proper regex.

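I also know what to do if they are guaranteed to be on different lines, I don't care about anything in the line containing end, and I'm OK with applying all the commands to the part of the start line that precedes start: just specify the address range as /start/,/end/.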
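For reference, the address-range case might look like the sketch below. The sample text and the substitution (x to X) are made up for illustration. The command also changes the text outside the markers on the start and end lines, which is the limitation of this approach.

```shell
# Hypothetical sample: only the three lines from "start" to "end" are in range.
sample='x before
x start x
x inside
x end x
x after'

# Apply the substitution to every line inside the /start/,/end/ range.
ranged=$(printf '%s\n' "$sample" | sed '/start/,/end/s/x/X/g')
printf '%s\n' "$ranged"
```

The first and last lines stay untouched, but the x that precedes start and the x that follows end are rewritten as well.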

This, however, doesn't sound very useful. What if I need to do a smarter job, for instance, introduce changes inside a {...} block?

One thing I can think of is breaking the input at { and } before processing and putting it back together afterwards:

sed 's/{\|}/\n&\n/g' input | sed 'main stuff' | sed ':a $!{N;ba}; s/\n\(}\|{\)\n/\1/g'

Another option is the opposite:

cat input | tr '\n' '#' | sed 'whatever; s/#/\n/g'

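Both of these are ugly, mainly because the operations are not confined to a single command. The second one is even worse, because some character or substring has to serve as a "newline holder", on the assumption that it does not occur in the original text.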
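To make the second workaround and its weakness concrete, here is a small sketch. It assumes GNU sed (for \U and for \n in the replacement); the sample text and the uppercase edit are made up for illustration.

```shell
# Made-up sample with a {...} block spanning two lines.
sample='a {x
y} b'

# Join the lines with "#", uppercase each {...} block,
# then turn the placeholders back into newlines.
joined=$(printf '%s\n' "$sample" | tr '\n' '#' | sed 's/{[^}]*}/\U&/g; s/#/\n/g')
printf '%s\n' "$joined"

# The pitfall: a "#" that was already in the input also becomes a newline.
broken=$(printf 'a # b\n' | tr '\n' '#' | sed 's/#/\n/g')
printf '%s\n' "$broken"
```

The second run splits a single input line in two, which is why the placeholder has to be something the text is known not to contain.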

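So the question is: are there any better ways, or can the ones above be optimized? From what I read in recent SO questions this is quite a regular task, so I'd like to choose the best practice once and for all.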

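P.S. I'm mostly interested in pure sed solutions: can the job be done with a single invocation of sed and nothing else? No awk, Perl, etc., please: this is more of a theoretical question than a "need the job done asap" one.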

score 3 · Accepted Answer

This might work for you:

# create multiline test data
cat <<\! >/tmp/a
this
this { this needs
changing to
that } that
that
!
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/a
this
this { THIS needs
changing to
THAT } that
that
# convert multiline test data to a single line
tr '\n' ' ' </tmp/a >/tmp/b
sed '/{/!b;:a;/}/!{$q;N;ba};h;s/[^{]*{//;s/}.*//;s/this\|that/\U&/g;x;G;s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/' /tmp/b
this this { THIS needs changing to THAT } that that

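Explanation:

Read the data into the pattern space (PS): /{/!b;:a;/}/!{$q;N;ba}
Copy the data to the hold space (HS): h
Strip non-data from the front and back of the string: s/[^{]*{//;s/}.*//
Convert the data, e.g. s/this\|that/\U&/g
Swap to HS and append the converted data: x;G
Replace the old data with the converted data: s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/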
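The same steps can also be laid out one command per line, which may be easier to follow. This is only a sketch that unrolls the one-liner from this answer, with my own comments; it assumes GNU sed (for \| and \U) and feeds the test data through a shell variable instead of /tmp/a.

```shell
# Same test data as in the answer, held in a variable.
sample='this
this { this needs
changing to
that } that
that'

unrolled=$(printf '%s\n' "$sample" | sed '
# no "{" on this line: print it unchanged and start the next cycle
/{/!b
# otherwise keep appending lines until the pattern space contains "}"
:a
/}/!{
$q
N
ba
}
# save the untouched text in the hold space
h
# cut away everything up to "{" and everything from "}" onwards
s/[^{]*{//
s/}.*//
# the actual edit, applied to the block contents only
s/this\|that/\U&/g
# pattern space becomes: original text, newline, edited block contents
x
G
# put the edited contents back between the original braces
s/{[^}]*}\([^\n]*\)\n\(.*\)/{\2}\1/
')
printf '%s\n' "$unrolled"
```

Only the text between the braces is uppercased; the this before { and the that after } are left alone.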

EDIT:

A more complicated answer, which I think caters for more than one block per line.

# slurp file into pattern space (PS)
:a
$! {
N
ba
}
# check for presence of \v if so quit with exit value 1
/\v/q1
# replace original newlines with \v's
y/\n/\v/
# append a newline to PS as a delimiter
G
# copy PS to hold space (HS)
h
# starting from right to left delete everything but blocks
:b
s/\(.*\)\({.*}\).*\n/\1\n\2/
tb
# delete any non-block details from the start of the file
s/.*\n//
# PS contains only block details
# do any block processing here e.g. uppercase this and that
s/th\(is\|at\)/\U&/g
# append ps to hs
H
# swap to HS
x
# replace each original block with its processed one from right to left
:c
s/\(.*\){.*}\(.*\)\n\n\(.*\)\({.*}\)/\1\n\n\4\2\3/
tc
# delete newlines
s/\n//g
# restore original newlines
y/\v/\n/
# done!

N.B. This uses GNU-specific features (such as \v, \U and q with an exit code), but it could be tweaked to work with generic seds.
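As a quick check of the multi-block claim, the sketch below runs the commands of the script above, with the comments removed, against a made-up input that has two blocks on one line and a third block spanning two lines. It assumes GNU sed; the input and the variable names are mine.

```shell
# Made-up input: two blocks on the first line, one block across two lines.
sample='a { this } b { that } c
d { this
that } e'

multi=$(printf '%s\n' "$sample" | sed '
:a
$!{
N
ba
}
/\v/q1
y/\n/\v/
G
h
:b
s/\(.*\)\({.*}\).*\n/\1\n\2/
tb
s/.*\n//
s/th\(is\|at\)/\U&/g
H
x
:c
s/\(.*\){.*}\(.*\)\n\n\(.*\)\({.*}\)/\1\n\n\4\2\3/
tc
s/\n//g
y/\v/\n/
')
printf '%s\n' "$multi"
```

Because the whole file is slurped into the pattern space, this trades memory for simplicity, so it is best suited to inputs of modest size.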
