linux - Using 'sed' or a similar command to capture a group and output only that data

Question

I have a log file that looks like this:

 sdfsdf
 sdfsdf<Pay>1234</Pay> sdfsdfsdf
 sdfsdf<Pay>12342323</Pay> sdfsdfsdf
 sdfsdf

...and I only want to print:

1234
12342323

I was thinking of using 'sed' with the following line:

sed 's/<Pay>(*)<\/Pay>/\1/g' abc.txt

But I get this error:

sed: -e expression #1, char 22: invalid reference \1 on `s' command's RHS

How can I get the expected output?

This is bash on the latest Ubuntu Linux.

score 4 · Accepted Answer

sed -n 's/.*<Pay>\(.*\)<\/Pay>.*/\1/p' file

answered 2013-09-19T22:25:29.107

score 2 · Accepted Answer

A perfect case for grep -o:

grep -oP '(?<=<Pay>).+?(?=</Pay>)'
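
One practical difference from a sed substitution is worth seeing side by side: grep -o prints every match on a line, whereas a sed pattern with a greedy leading .* keeps only the last one. The following is a small sketch, assuming GNU grep built with PCRE support (-P); the input line with two <Pay> elements and the variable names are made up for illustration and do not come from the question's log.

```shell
# Hypothetical line containing two <Pay> elements.
line='a<Pay>1</Pay>b<Pay>22</Pay>c'

# -o prints each match on its own line. The lookbehind and lookahead keep
# the tags out of the match, and the lazy .+? stops at the first </Pay>.
grep_out=$(printf '%s\n' "$line" | grep -oP '(?<=<Pay>).+?(?=</Pay>)')

# For comparison: the greedy leading .* skips ahead to the last <Pay>,
# so only the final value on the line survives.
sed_out=$(printf '%s\n' "$line" | sed -n 's/.*<Pay>\(.*\)<\/Pay>.*/\1/p')

printf 'grep -oP:\n%s\n' "$grep_out"
printf 'sed:\n%s\n' "$sed_out"
```

If the log can contain more than one <Pay> element per line, the grep form is likely the safer choice.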

answered 2013-09-19T23:16:17.010

score 0 · Accepted Answer

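sed, unlike Perl, uses basic regular expressions by default, so its capturing parentheses must be escaped: \(.*\). (With the -E option, unescaped parentheses capture instead.)

To get your expected output, you then need to get rid of the rest of the line. Just include it in the pattern, with a .* before and after the tags, so that the substitution replaces the whole line with the captured group.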
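
As a minimal sketch of both points, the snippet below runs the escaped-parentheses form and the -E form against the sample log from the question. It assumes a sed that supports -E (GNU and BSD sed both do); the variable names sample, bre_out and ere_out are made up for illustration.

```shell
# Sample input copied from the question.
sample=' sdfsdf
 sdfsdf<Pay>1234</Pay> sdfsdfsdf
 sdfsdf<Pay>12342323</Pay> sdfsdfsdf
 sdfsdf'

# Basic regular expressions (the default): the group is written \( ... \).
# -n suppresses automatic printing and the trailing p prints only lines
# where the substitution succeeded. The leading and trailing .* consume
# the rest of the line, so only the captured text remains.
bre_out=$(printf '%s\n' "$sample" | sed -n 's/.*<Pay>\(.*\)<\/Pay>.*/\1/p')

# Extended regular expressions (-E): plain ( ... ) captures. Using # as the
# delimiter avoids having to escape the slash in </Pay>.
ere_out=$(printf '%s\n' "$sample" | sed -nE 's#.*<Pay>(.*)</Pay>.*#\1#p')

printf '%s\n' "$bre_out"
printf '%s\n' "$ere_out"
```

Both commands should print the two payment values, one per line.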

score 0 · Accepted Answer

Using awk (gawk or mawk only, because RS is a regular expression):

awk 'NR%2==0' RS="</?Pay>" file
1234
12342323
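
To see why NR%2==0 picks out the right records, it can help to print the record number next to each selected record. This is only a sketch, assuming gawk or mawk as noted above; it sets RS with -v rather than as a trailing assignment so that it also applies when reading from a pipe, and the variable names are made up.

```shell
# Sample input copied from the question.
sample=' sdfsdf
 sdfsdf<Pay>1234</Pay> sdfsdfsdf
 sdfsdf<Pay>12342323</Pay> sdfsdfsdf
 sdfsdf'

# With RS set to the regex </?Pay>, every opening or closing tag ends a
# record. Text outside the tags lands in odd-numbered records, and text
# between <Pay> and </Pay> lands in even-numbered ones.
numbered=$(printf '%s\n' "$sample" |
  awk -v RS='</?Pay>' 'NR % 2 == 0 { print NR ": " $0 }')

printf '%s\n' "$numbered"
```

The approach relies on the tags being balanced throughout the file; a stray <Pay> or </Pay> would shift the odd/even alternation for everything after it.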
