sample.xml:

<tester>
   <level1 id="1"> test point </level1>
   <level2> </level2>
   <level3>lvl3 of id 1</level3>
   <level4>lvl4 of id 1</level4>
   <level5> </level5>
</tester>

<tester>
   <level1 id="2"> test point </level1>
   <level2> </level2>
   <level3>lvl3 of id 2 </level3>
   <level4> lvl4 of id 2</level4>
   <level5> </level5>
</tester>

<tester>
   <level1 id="3"> test point </level1>
   <level2> </level2>
   <level3>lvl3 of id 3</level3>
   <level4>lvl4 of id 3</level4>
   <level5> </level5>
</tester>

<tester>
   <level1 id="2"> test point </level1>
   <level2> </level2>
   <level3>lvl3 of id 2 2nd occurance</level3>
   <level4>lvl4 of id 2 2nd occurance</level4>
   <level5> </level5>
</tester>

For the sample.xml above, I need to get the level3 and level4 tags only when the id in level1 is 2. For example, when I search for id=2, I should get the following output:

<level3>lvl3 of id 2 </level3>
<level4> lvl4 of id 2</level4>

<level3>lvl3 of id 2 2nd occurance</level3>
<level4>lvl4 of id 2 2nd occurance</level4>

3 Answers

Using sed:

sed -n '/<tester>/{n;/<level1[ ]*id="2"/{n;n;N;p}}' input

Explanation:

sed                  # execute sed
-n                   # do not print unless explicitly stated
/<tester>/           # if this line contains <tester>
{                    # then 
n;                   # skip the line (read new line over the old line)
/<level1[ ]*id="2"/  # if this line contains <level1 [spaces] id="2"
{                    # then
n;n;                 # skip it, and skip the next line
N;                   # read another line but this time append
p                    # print the buffer
}                    # end if
}                    # end if
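
The command relies on the difference between n and N. As a minimal sketch on a four-line toy input (demo_n and demo_N are hypothetical names used only for this illustration):

```shell
# n replaces the pattern space with the next input line,
# so with -n only every second line reaches p.
demo_n() { printf 'a\nb\nc\nd\n' | sed -n 'n;p'; }

# N appends the next input line to the pattern space (joined by a newline),
# so p prints the lines in pairs and nothing is lost.
demo_N() { printf 'a\nb\nc\nd\n' | sed -n 'N;p'; }

demo_n   # prints b and d
demo_N   # prints a, b, c and d
```

That is why the command uses n to step past the level1 and level2 lines and N to keep level3 and level4 together for the final p.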
answered 2013-02-04T08:30:29.157

I would recommend an XML parser like xmlstarlet. That said, this can also be done with awk. Here's one way. Run it like this:

awk -f script.awk file

Contents of script.awk:

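# Start of a block: clear the buffer and note that we are inside <tester>.
/<tester>/ {
    r=""
    f=1
}

# Inside a block whose level1 has id="2": start collecting.
f && /<level1 id="2">/ {
    g=1
}

# Collect level3 and level4 lines, stripping leading whitespace.
g && /<level[34]>/ {
    sub(/^[ \t]+/, "")
    r = r $0 ORS
}

# End of a block: print what was collected, then reset both flags.
/<\/tester>/ {
    if (g && r) {
        print r
    }
    f=g=0
}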

Results:

<level3>lvl3 of id 2 </level3>
<level4> lvl4 of id 2</level4>

<level3>lvl3 of id 2 2nd occurance</level3>
<level4>lvl4 of id 2 2nd occurance</level4>

Alternatively, here is the one-liner:

awk '/<tester>/ { r=""; f=1 } f && /<level1 id="2">/ { g=1 } g && /<level[34]>/ { sub(/^[ \t]+/, ""); r = r $0 ORS } /<\/tester>/ { if (g && r) print r; f=g=0 }' file
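
The id is hard-coded in the commands above. As a sketch, assuming one tag per line as in sample.xml, the id could be passed in with awk's -v option instead. The function name levels_for_id is hypothetical, and the blank line between blocks is dropped here to keep it short:

```shell
# Hypothetical helper: print the level3/level4 lines of every <tester> block
# whose level1 id equals $1, reading from file $2.
# Usage: levels_for_id 2 sample.xml
levels_for_id() {
    awk -v id="$1" '
        /<tester>/ { keep = 0 }                          # new block: reset the flag
        /<level1 / { keep = ($0 ~ ("id=\"" id "\"")) }   # does this block have the wanted id?
        keep && /<level[34]>/ {
            sub(/^[ \t]+/, "")                           # strip leading whitespace
            print
        }
    ' "$2"
}
```

The closing quote is part of the dynamic pattern, so searching for 2 would not also match an id such as 22.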
answered 2013-02-04T08:30:26.980

When working with blocks in awk, it is often convenient to clear RS. I believe this does what you want:

awk '/id="2"/ { print ""; n = split($0, a, "\n"); for (i = 1; i <= n; i++)
    if (a[i] ~ /<level[34]>/) print a[i] }' RS= input
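
To illustrate what clearing RS does: when RS is the empty string, awk reads in paragraph mode, so each run of lines separated by blank lines becomes one record. A minimal sketch (count_blocks is a hypothetical helper name):

```shell
# Hypothetical helper: count the blank-line-separated records in file $1.
# RS= on the command line sets RS to the empty string before the file is read.
count_blocks() {
    awk 'END { print NR }' RS= "$1"
}
```

For a file holding three blocks separated by blank lines this prints 3. Note that the approach depends on those blank lines being present between the tester blocks.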
answered 2013-02-04T13:29:19.390