I have this file (devel1.temp):

<?xml version="1.0" encoding="UTF-8"?>
<krpano version="1.0.8.15" showerrors="false">

          <include url="include/sa/index.xml" /> <include url="content/sa.xml" />
          <include url="include/global/index.xml" />
          <include url="include/orientation/index.xml" />
          <include url="include/movecamera/index.xml" /> <include url="content/movecamera.xml" />
          <include url="include/fullscreen/index.xml" />
          <include url="include/instructions/index.xml" />
          <include url="include/coordfinder/index.xml" />
          <include url="include/editor_and_options/index.xml" />
</krpano>

The goal is to extract the value of every url attribute and put them in a temporary file (devel.temp). The output should be:

include/sa/index.xml
content/sa.xml
include/global/index.xml
include/orientation/index.xml
include/movecamera/index.xml
content/movecamera.xml
include/fullscreen/index.xml
include/instructions/index.xml
include/coordfinder/index.xml
include/editor_and_options/index.xml

To do this, I have the following script:

# Make a temp file with all the url attributes
grep -o 'url=['"'"'"][^"'"'"']*['"'"'"]' "$temp_folder/devel1.temp" > "$temp_folder/devel2.temp"
# Strip off everything except the url values
sed -e 's/^url=["'"'"']//' -e 's/["'"'"']$//' "$temp_folder/devel2.temp" > "$temp_folder/devel.temp"

It worked fine yesterday. Today, the contents of devel2.temp and devel.temp look like this:

[01;31m[Kurl="include/sa/index.xml"[m[K
[01;31m[Kurl="content/sa.xml"[m[K
[01;31m[Kurl="include/global/index.xml"[m[K
[01;31m[Kurl="include/orientation/index.xml"[m[K
[01;31m[Kurl="include/movecamera/index.xml"[m[K
[01;31m[Kurl="content/movecamera.xml"[m[K
[01;31m[Kurl="include/fullscreen/index.xml"[m[K
[01;31m[Kurl="include/instructions/index.xml"[m[K
[01;31m[Kurl="include/coordfinder/index.xml"[m[K
[01;31m[Kurl="include/editor_and_options/index.xml"[m[K

Any ideas about what is happening?

4 Answers

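Consider using an XML-aware tool such as xpath. I would suggest:

xpath -e "/krpano/include/@url" -q yourFile.xml | cut -f 2 -d "=" | sed 's/"//g'

If you are sure the XML will only ever have url attributes on include elements under the krpano root, you can also use the shorthand below, although the version above will run faster.

xpath -e "//@url" -q yourFile.xml | cut -f 2 -d "=" | sed 's/"//g'

answered 2012-10-02T10:33:50.420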

It seems grep is colouring its output with ANSI sequences even when the output is not a terminal. Change its --color option from always to auto.
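A minimal sketch of that fix applied to the script from the question, assuming GNU grep. Where the --color=always setting comes from on your machine is not known here (an alias or the GREP_OPTIONS environment variable are common places to look), so this example simply passes --color=never explicitly. The scratch directory created with mktemp and the cut-down input file are only for illustration.

```shell
#!/bin/sh
# Illustrative setup: a scratch directory and a shortened copy of the input file.
temp_folder=$(mktemp -d)
cat > "$temp_folder/devel1.temp" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<krpano version="1.0.8.15" showerrors="false">
  <include url="include/sa/index.xml" /> <include url="content/sa.xml" />
  <include url="include/global/index.xml" />
</krpano>
EOF

# An explicit --color=never is parsed after any --color=always injected by an
# alias or by GREP_OPTIONS, so it takes precedence and no ANSI codes are written.
grep --color=never -o 'url=['"'"'"][^"'"'"']*['"'"'"]' "$temp_folder/devel1.temp" \
  > "$temp_folder/devel2.temp"

# Strip the leading url=" and the trailing quote, leaving just the paths.
sed -e 's/^url=["'"'"']//' -e 's/["'"'"']$//' "$temp_folder/devel2.temp" \
  > "$temp_folder/devel.temp"

cat "$temp_folder/devel.temp"
```

To find the source of the setting, running alias grep and echo "$GREP_OPTIONS" in the shell that launches the script is a reasonable first check.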

Rather than grep, you should use an XML-aware tool to process XML. For example, in xsh you can write

open file.xml ;
perl { use Term::ANSIColor } ;
for /krpano/include
    echo :s { color('bright_yellow') }
            @url
            { color('reset') } ;
answered 2012-10-02T10:17:22.810

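In addition to choroba's comments regarding your ANSI sequences, I would avoid parsing XML with sed and the like wherever possible, and look to use XML-aware scripting tools instead. I use the XMLStarlet toolkit. This means your scripts will be aware of character encodings and entities, and will be more robust when the XML changes.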
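As a rough sketch of what that could look like for the file in the question, assuming the binary is installed under the name xmlstarlet (some distributions ship it as xml instead). The scratch directory and the shortened input file are illustrative only.

```shell
#!/bin/sh
# Illustrative setup: a scratch directory and a shortened copy of the input file.
temp_folder=$(mktemp -d)
cat > "$temp_folder/devel1.temp" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<krpano version="1.0.8.15" showerrors="false">
  <include url="include/sa/index.xml" /> <include url="content/sa.xml" />
  <include url="include/global/index.xml" />
</krpano>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # sel = query mode; -t starts a template; -m iterates over each matching
  # <include>; -v prints the value of its url attribute; -n adds a newline.
  xmlstarlet sel -t -m '/krpano/include' -v '@url' -n "$temp_folder/devel1.temp" \
    > "$temp_folder/devel.temp"
  cat "$temp_folder/devel.temp"
else
  echo "xmlstarlet is not installed; skipping" >&2
fi
```

Because the XML is parsed rather than pattern-matched, attribute quoting style and extra whitespace in the source file should not affect the result.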

answered 2012-10-02T10:19:59.567

A third XML-aware scripting tool is my Xidel:

xidel /tmp/your.xml -e //@url

(Unlike most such tools, it supports XPath 2.0, although that is overkill for this problem.)

answered 2012-10-02T18:56:52.107