
On the Linux command line, I am trying to search some HTML code and print only the dynamic part of it. For example, given this code:

<p><span class="RightSideLinks">Tel: 090 97543</span></p>

I only want to print 97543, not 090. The next time I search the file, the code may have changed to:

<p><span class="RightSideLinks">Tel: 081 82827</span></p>

Here I only want 82827. The rest of the code stays the same; only the phone number changes.

Can I do this with grep? Thanks.

Edit:

Could it also be used on this code?

<tr class="patFuncEntry"><td align="left" class="patFuncMark"><input type="checkbox" name="renew0" id="renew0" value="i1061700" /></td><td align="left" class="patFuncTitle"><label for="renew0"><a href="/record=p1234567~S0"> I just want to print this part. </a></label>

What changes is the record number (p1234567~S0) and the text I want to print.


1 Answer


One way, using GNU grep:

grep -oP '(?<=Tel: .{3} )[^<]+' file.txt

Example contents of file.txt:

<p><span class="RightSideLinks">Tel: 090 97543</span></p>
<p><span class="RightSideLinks">Tel: 081 82827</span></p>

Result:

97543
82827
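
As a rough sketch, a stricter variant of the same command restricts both the three-character prefix and the printed part to digits. It assumes GNU grep built with `-P` support; the temporary file and the variable names `tmp` and `numbers` are only for illustration.

```shell
# Recreate the two sample lines from the post in a temporary file (illustrative only)
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<p><span class="RightSideLinks">Tel: 090 97543</span></p>
<p><span class="RightSideLinks">Tel: 081 82827</span></p>
EOF

# The lookbehind requires 'Tel: ', exactly three digits and a space;
# -o prints only the run of digits that follows it
numbers=$(grep -oP '(?<=Tel: [0-9]{3} )[0-9]+' "$tmp")
printf '%s\n' "$numbers"

rm -f "$tmp"
```

Matching `[0-9]+` instead of `[^<]+` means stray non-digit text after the number would not be printed, which may or may not be what you want.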

Edit:

(?<=Tel: .{3} ) ## This is a positive lookbehind assertion. It is only understood
                ## when grep is run with its Perl regexp flag, '-P'.

Tel: .{3}       ## This is what the lookbehind checks for: the phrase 'Tel: '
                ## followed by any character exactly three times, followed by a
                ## space. Since we're only expecting digits there, you could write
                ## 'Tel: [0-9]{3} ' instead.

[^<]+           ## This matches any character except '<', one or more times.
                ## Grep's '-o' flag prints only the matched text rather than
                ## the whole line, and the lookbehind is not part of the match.

Putting it all together, we're asking grep to return one or more characters
other than '<' if 'Tel: .{3} ' appears immediately before them. HTH.
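
For the second snippet in the question, the record number may not have a fixed length, and a lookbehind needs a fixed-length pattern. One possible approach, offered only as a sketch, is PCRE's `\K`, which discards everything matched so far from the printed result. This again assumes GNU grep with `-P`; the temporary file and the variable names `tmp` and `title` are illustrative.

```shell
# Put the sample row from the question into a temporary file (illustrative only)
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<tr class="patFuncEntry"><td align="left" class="patFuncMark"><input type="checkbox" name="renew0" id="renew0" value="i1061700" /></td><td align="left" class="patFuncTitle"><label for="renew0"><a href="/record=p1234567~S0"> I just want to print this part. </a></label>
EOF

# Match the opening <a href="/record=..."> tag and any leading whitespace,
# drop them from the output with \K, then print the text up to the next '<',
# ending on a non-space character so trailing whitespace is left out
title=$(grep -oP '<a href="/record=[^"]*">\s*\K[^<]*[^\s<]' "$tmp")
printf '%s\n' "$title"

rm -f "$tmp"
```

This only works while the link text contains no nested tags and sits on the same line as its opening tag; for anything less regular, a real HTML parser is the safer choice.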
Answered 2012-10-05T22:47:38.833