I have a file containing the following line:

I want a lot <*tag 1> more <*tag 2>*cheese *cakes.

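I am trying to remove the * characters inside <> but not the ones outside. Tags may be more complex than the ones above, for example <*better *tag 1>.

I tried /\bregex\b/s/\*//g, which works for tag 1 but not for tag 2. How can I make it work for tag 2 as well?

Thanks a lot.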

3 Answers

A simple solution if each tag contains only one asterisk:

sed 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g'
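
As a quick check of the one-asterisk assumption, the sketch below runs that substitution on the sample line from the question and on the more complex tag. The function name one_pass is made up for illustration.

```shell
# Hypothetical helper wrapping the single-pass substitution from above.
one_pass() {
  sed 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g'
}

# One asterisk per tag: both tags are cleaned, text outside is untouched.
printf '%s\n' 'I want a lot <*tag 1> more <*tag 2>*cheese *cakes.' | one_pass
# prints: I want a lot <tag 1> more <tag 2>*cheese *cakes.

# Two asterisks in one tag: only one of them is removed per pass.
printf '%s\n' '<*better *tag 1>' | one_pass
# prints: <*better tag 1>
```

The second call leaves an asterisk behind because a single pass removes only one asterisk per tag; since [^>]* is greedy, it is the last one in the tag.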

If a tag can contain more than one, you can use sed's label and branch mechanism:

sed ':doagain s/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g; t doagain'

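Here doagain is the label that marks the start of the loop, and t doagain is a conditional jump back to that label: the substitution is repeated until it no longer matches. From the sed manual: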

t label

 Branch to label only if there has been a successful substitution since the last 
 input line was read or conditional branch was taken. The label may be omitted, in 
 which case the next cycle is started.
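
A minimal sketch of the loop in action. strip_tag_stars is a hypothetical wrapper name, and the script is split over several -e options on the assumption that this is more portable to non-GNU seds, where a label runs to the end of the line.

```shell
# Hypothetical wrapper: repeat the substitution until no tag contains "*".
strip_tag_stars() {
  sed -e ':doagain' \
      -e 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g' \
      -e 't doagain'
}

# Three asterisks inside one tag need three passes; the "t" branch
# keeps looping until a pass makes no substitution.
printf '%s\n' 'I want <*better *tag X*> *cheese' | strip_tag_stars
# prints: I want <better tag X> *cheese
```

Each input line is processed independently, so asterisks outside the angle brackets are never touched.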
answered 2013-05-30T17:18:40.917

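Obligatory Perl solution (the /r flag needs Perl 5.14 or later; the counter is reset for every line so that the text/tag alternation stays aligned on multi-line input):

perl -pe 'my $i = 0;
        $_ = join "",
        map +($i++ % 2 == 0 ? $_ : s/\*//gr),
        split /(<[^>]+>)/, $_;' FILE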

Another option:

perl -pe 's/(<[^>]+>)/$1 =~ s(\*)()gr/ge' FILE
answered 2013-05-30T18:48:55.497

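awk can solve your problem (the four-argument split used here is a GNU awk extension, so this needs gawk; r is reset for each line so that output does not accumulate across lines):

awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file

More readable version:

 awk '{r=""                              # start each line with an empty result
       x=split($0,a,/<[^>]*>/,s)         # a: text between tags, s: the tags
       for(i in s)gsub(/\*/,"",s[i])     # strip asterisks from the tags only
       for(j=1;j<=x;j++)r=r a[j] s[j]    # stitch text and tags back together
       print r}' file

Test with your data:

kent$  cat file
I want a lot <*tag 1> more <*tag 2>*cheese *cakes. <*better *tag X*>

kent$  awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file
I want a lot <tag 1> more <tag 2>*cheese *cakes. <better tag X>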
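
If gawk is not available, the same idea can probably be expressed with only POSIX awk features. The following is a rough sketch, not the command tested above: it walks the line with match() and substr() instead of the four-argument split, and the function name strip_tag_stars_awk is invented for illustration.

```shell
# Hypothetical helper: POSIX awk variant that avoids the gawk-only split.
strip_tag_stars_awk() {
  awk '{
    out = ""; rest = $0
    # match() sets RSTART and RLENGTH for the leftmost <...> span in rest
    while (match(rest, /<[^>]*>/)) {
      tag = substr(rest, RSTART, RLENGTH)
      gsub(/\*/, "", tag)                        # strip asterisks inside the tag
      out = out substr(rest, 1, RSTART - 1) tag  # text before the tag is kept as is
      rest = substr(rest, RSTART + RLENGTH)      # continue after the tag
    }
    print out rest
  }'
}

printf '%s\n' \
  'I want a lot <*tag 1> more <*tag 2>*cheese *cakes. <*better *tag X*>' \
  'second *line <*t>' | strip_tag_stars_awk
# prints:
# I want a lot <tag 1> more <tag 2>*cheese *cakes. <better tag X>
# second *line <t>
```

Because out and rest are assigned at the start of every record, nothing carries over from one line to the next.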
answered 2013-05-30T17:19:42.123