regex - Output a block of text based on its contents in Linux

Question

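I have many large text files whose contents are grouped into blocks by a known delimiter pair, { }. If a block contains a certain sequence, say xyq, then I want to output the entire block.

I know I can write a grep to find the search term, but how do I extend my selection to the nearest braces? Note that { and } can be anywhere, i.e. not necessarily at the start or end of a line, next to whitespace, ...

Looking for something like this: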

Input:
 {i am a turtle}
 {i am a horse}
 {i am a programmer}

grep ???programmer??? ./File

output: {i am a programmer}

score 1 · Accepted Answer

You can first try translating the newlines into something else. Assuming the input contains no NUL bytes, NUL is a good candidate.

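cat input | tr '\n' '\0' | grep -aPo '\{.*?programmer.*?\}' | tr '\0' '\n'

In the regular expression itself, the ?s make the preceding matches non-greedy, meaning they match the shortest possible sequence rather than the longest. Non-greedy quantifiers are a Perl-style extension, so this needs grep -P (GNU grep built with PCRE support) rather than -E. Note that this will not work properly if the search term can appear outside the braces, and the match can also start at an earlier { and run across several blocks, so you need to be more explicit: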

cat input | tr '\n' '\0' | grep -aEo '\{[^{}]*programmer[^{}]*\}' | tr '\0' '\n'
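As a quick check of the explicit pattern, here is a minimal sketch assuming GNU grep. The sample data is made up for illustration (it is not from the question): the search term appears once outside any braces and once inside a block that spans two lines.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'programmer {i am a turtle}' '{i am a horse}' 'x {i am' 'a programmer} y' > "$sample"

# Join the lines with NUL, keep only brace blocks that contain the term,
# then turn the NULs back into newlines.
result=$(tr '\n' '\0' < "$sample" | grep -aEo '\{[^{}]*programmer[^{}]*\}' | tr '\0' '\n')
printf '%s\n' "$result"

rm -f "$sample"
```

Because [^{}] cannot cross a brace, the stray "programmer" in front of the turtle block is ignored and only the two-line block should be printed.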

score 0 · Accepted Answer

>cat file
 {i am a turtle}
  jay   {i am a horse}
     {i am a programmer}



>grep horse file | awk -F'[{}]' '{print $2}'



 i am a horse

score 0 · Accepted Answer

sed -n '/{\|}/ !{H; b}; /{/ {h; b open}; :open {/}/ b close; n; H; b open}; :close {g; /programmer/ p}' File

Explanation:

$ sed -n '#suppress printing of all input
> /{\|}/ !{H; b} # if no curly brackets on the line, append it to hold space and finish
> /{/ {h; b open} # if an opening { is found, copy the line to hold space and branch to label :open
> :open
> /}/ b close # if a } is matched, branch to label close
> n; H; b open # else read a new line, append it to hold space and go back to :open
> :close
> g # put all hold space to pattern space
> /programmer/ p # if _programmer_ matches, print the pattern space' File
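The sed script above uses GNU extensions such as \| in the pattern. For comparison, roughly the same accumulate-until-the-closing-brace logic can be sketched in awk. This is not the answer author's method, only an illustration; the helper name print_blocks and its argument order are hypothetical.

```shell
# print_blocks TERM FILE (hypothetical helper)
# Buffers whole lines from one containing "{" through the next line
# containing "}", and prints the buffer if it contains TERM.
print_blocks() {
  awk -v term="$1" '
    # A line with an opening brace starts a new buffer.
    !inblk && index($0, "{") { inblk = 1; buf = "" }
    # While inside a block, append the current line to the buffer.
    inblk { buf = (buf == "" ? $0 : buf "\n" $0) }
    # A closing brace ends the block; print it if the term occurs in it.
    inblk && index($0, "}") { if (index(buf, term)) print buf; inblk = 0 }
  ' "$2"
}
```

Called as print_blocks programmer File, this prints complete lines, like the sed version, so any text before { or after } on those lines is included in the output.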
