How can I manipulate the output text of grep?

Right now I am using the command:

grep -i "<url>" $file  >> ./txtFiles/$file.txt

This would output something like this:

<url>http://www.simplyrecipes.com/recipes/chicken_curry_salad/</url>

and any text that follows goes on the next line.

How can I get rid of the <url> and </url> tags and stop the output from ending with a line break?

2 Answers

sed '/<\/*url>/!d;s///g'
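  • <\/*url> matches both the opening and the closing tag
  • lines that do not contain it are deleted
  • every occurrence of the pattern is then removed from the remaining lines

With your example, it might look like this: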
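
To make the three steps concrete, here is a minimal sketch run against a small sample. The temporary file and its <title> line are made up for illustration; only the <url> line comes from the question.

```shell
# Hypothetical sample input: one line without a <url> element, one with.
sample=$(mktemp)
printf '%s\n' \
  '<title>Chicken Curry Salad</title>' \
  '<url>http://www.simplyrecipes.com/recipes/chicken_curry_salad/</url>' \
  > "$sample"

# /<\/*url>/!d  deletes every line that contains neither <url> nor </url>.
# s///g         has an empty regex, so it reuses the last one (the address)
#               and removes every match on the lines that are left.
result=$(sed '/<\/*url>/!d;s///g' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the URL itself should be printed; the <title> line is dropped by the first command.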

sed '/<\/*url>/!d;s///g' $file >> ./txtFiles/$file.txt
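
The question also asks about the line break at the end. sed normally ends each output line with a newline, so one possible approach (a sketch, not something the command above does by itself) is to pipe the result through tr. The variable line below is a stand-in for one matching line of $file.

```shell
# Stand-in for one matching line of $file (assumption for illustration).
line='<url>http://www.simplyrecipes.com/recipes/chicken_curry_salad/</url>'

# Byte count of the plain sed output, which includes the trailing newline.
bytes_with=$(printf '%s\n' "$line" | sed '/<\/*url>/!d;s///g' | wc -c)

# Same script, but tr -d '\n' deletes every newline before counting.
bytes_without=$(printf '%s\n' "$line" | sed '/<\/*url>/!d;s///g' | tr -d '\n' | wc -c)

echo "with newline: $bytes_with bytes, without: $bytes_without bytes"
```

Keep in mind that tr -d '\n' removes all newlines, so several matching lines would be glued together; tr '\n' ' ' would separate them with spaces instead.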
answered 2013-04-25T05:49:21.180

As a single command:

sed -n '/<url>/ { s|<url>\(.*\)</url>|\1| ; p ; }' INPUT > OUTPUT

Or using awk:

awk -F "</?url>" '/<url>/ { print $2 }' INPUT > OUTPUT
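
To see why $2 is the URL, this small sketch prints every field that awk produces when </?url> is the field separator. The variable names are only for illustration.

```shell
line='<url>http://www.simplyrecipes.com/recipes/chicken_curry_salad/</url>'

# Show each field in brackets: the text before <url> is $1 (empty here),
# the text between the tags is $2, and the text after </url> is $3 (empty).
fields=$(printf '%s\n' "$line" |
  awk -F '</?url>' '{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')
printf '%s\n' "$fields"

# The command from the answer, reading from a pipe instead of INPUT.
url=$(printf '%s\n' "$line" | awk -F '</?url>' '/<url>/ { print $2 }')
printf '%s\n' "$url"
```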

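Note: both may give you invalid output if several <url>...</url> patterns occur on a single line. The sed version uses | as the delimiter of the s command; pipe characters in the input are matched by .* like any other character, so a different delimiter is only needed if the pattern itself has to contain a |.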
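
The multiple-matches caveat can be reproduced with a made-up line holding two URLs (the example.com addresses are placeholders). The last command is one possible workaround, assuming a grep that supports -o (GNU and BSD grep do, but it is not in POSIX) and URLs that contain no < character.

```shell
line='<url>http://a.example.com/</url> and <url>http://b.example.com/</url>'

# sed: .* is greedy, so it spans from the first <url> to the last </url>
# and the inner tags survive in the output.
sed_out=$(printf '%s\n' "$line" | sed -n 's|<url>\(.*\)</url>|\1|p')

# awk: only $2 is printed, so the second URL is silently dropped.
awk_out=$(printf '%s\n' "$line" | awk -F '</?url>' '/<url>/ { print $2 }')

# Possible workaround: let grep -o emit each element on its own line,
# then strip the tags from every line.
all_out=$(printf '%s\n' "$line" | grep -o '<url>[^<]*</url>' | sed 's|</*url>||g')

printf 'sed: %s\nawk: %s\ngrep -o + sed:\n%s\n' "$sed_out" "$awk_out" "$all_out"
```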

answered 2013-04-25T07:30:39.027