I need to remove "tei:" from every tag. My original XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="http://www.teic.org/release/xml/tei/custom/schema/relaxng/tei_all.rn" type="xml"?>
<?xml-stylesheet type="text/xsl" href="jerome-html-proof.xsl"?>
<TEI
  xmlns="http://www.tei-c.org/ns/1.0"
  xmlns:tei="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Chronicles (Latin working edition, based on Helm)</title>
        <author>Jerome</author>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished</p>
      </publicationStmt>
      <sourceDesc>
        <p>PD online text from http://www.tertullian.org/fathers/index.htm#jeromechronicle, entitled
          "Jerome, Chronicle (2005)" and based on pages of Helm's edition indicated in milestone
          elements. </p>
        <p>Source page includes note, "This text was transcribed by JMB. All material on this page
          is in the public domain - copy freely." </p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div
        n="preface"
        type="prefatory"> </div>
<table>    
<row role="header">
            <cell ana="abraham"/>
            <cell ana="assyrians">Regnum Assyriorum</cell>
            <cell ana="sacred-history"/>
            <cell ana="hebrews"> Hebraeorum gentis exordium</cell>
            <cell ana="sicyonians"> Regnum Sicyoniorum</cell>
            <cell ana="gentile-history"/>
            <cell ana="egyptians"> Regnum Aegyptiorum</cell>
            <cell ana="adbc"> BC</cell>
</row>   
<row role="regnal">
            <cell/>
            <cell/>
            <cell/>
            <cell/>
            <cell>Sicyoniorum III, TELCHIN, annis XX.</cell>
</row>
<row>
            <cell/>
            <cell>15</cell>
            <cell/>
            <cell>25</cell>
            <cell>1</cell>
            <cell/>
            <cell>25</cell>
            <cell>1992</cell>
</row>
</table>
</body>
</text>
</TEI>

When I run my script, I get the same output, but with "tei:" in every tag:

<tei:TEI> 
<tei:text> 
<tei:body> 
<tei:div>
<tei:row role="header">...........

I am trying to add a value to every row that is neither a header nor a marker of a change of ruler. My code is:

    import groovy.xml.StreamingMarkupBuilder
    import groovy.xml.XmlUtil

    def TEI = new XmlSlurper().parse(new File('file.xml'))
    def jeromeRow = new File("file-row.xml")
    def x = 0

    for (row in TEI.text.body.div.table.row) {
        if (row.@role != 'regnal' && row.@role != 'header') {
            x = x + 1
            row.@n = 'r' + x
        }
    }

    def outputBuilder = new StreamingMarkupBuilder()
    String result = outputBuilder.bind { mkp.yield TEI }
    jeromeRow << XmlUtil.serialize(result)

How can I prevent my script from making this unwanted change to my XML file?

2 Answers

If you change

    def TEI = new XmlSlurper().parse(new File('file.xml'))

to

    def TEI = new XmlSlurper(false, false).parse(new File('file.xml'))

validation and namespace processing in the slurper are turned off, and you should get the expected result.
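As a minimal sketch of that suggestion, the script below parses a trimmed-down stand-in for the question's file with the two-argument constructor (`validating = false`, `namespaceAware = false`), numbers the plain rows, and serializes the result. The inline XML and the Groovy 3+ import are assumptions made for illustration. The `table` is addressed directly under `body`, as it appears in the posted XML.

```groovy
// Assumes Groovy 3 or later, where XmlSlurper lives in groovy.xml;
// on Groovy 2.x drop this import and rely on the default groovy.util.XmlSlurper.
import groovy.xml.XmlSlurper
import groovy.xml.StreamingMarkupBuilder

// Trimmed-down stand-in for the TEI file in the question (illustrative only).
def xmlText = '''<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <table>
        <row role="header"><cell>Regnum Assyriorum</cell></row>
        <row role="regnal"><cell>Sicyoniorum III, TELCHIN, annis XX.</cell></row>
        <row><cell>15</cell></row>
      </table>
    </body>
  </text>
</TEI>'''

// First argument: validating, second argument: namespaceAware.
def TEI = new XmlSlurper(false, false).parseText(xmlText)

// Number only the rows that are neither headers nor regnal markers.
def x = 0
for (row in TEI.text.body.table.row) {
    if (row.@role != 'regnal' && row.@role != 'header') {
        row.@n = 'r' + (++x)
    }
}

// With namespace handling off, the xmlns declarations are carried along as
// ordinary attributes and element names are written without a prefix.
String result = new StreamingMarkupBuilder().bind { mkp.yield TEI }.toString()
println result
```

The trade-off is that the slurper no longer knows about namespaces at all, so this fits best when the script only needs to copy the document through with small attribute changes.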

answered 2016-01-24T20:40:38.140

Apart from the nonexistent "table", your code looks correct. When I run the following in groovyConsole, it looks fine:

import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlUtil
def xmlText = """<TEI> 
<text> 
<body> 
<div>
<row role="header">
            <cell ana="abraham"/>
            <cell ana="assyrians">Regnum Assyriorum</cell>
            <cell ana="sacred-history"/>
            <cell ana="hebrews"> Hebraeorum gentis exordium</cell>
            <cell ana="sicyonians"> Regnum Sicyoniorum</cell>
            <cell ana="gentile-history"/>
            <cell ana="egyptians"> Regnum Aegyptiorum</cell>
            <cell ana="adbc"> BC</cell>
</row>   
<row role="regnal">
            <cell/>
            <cell/>
            <cell/>
            <cell/>
            <cell>Sicyoniorum III, TELCHIN, annis XX.</cell>
</row>
<row>
            <cell/>
            <cell>15</cell>
            <cell/>
            <cell>25</cell>
            <cell>1</cell>
            <cell/>
            <cell>25</cell>
            <cell>1992</cell>
</row>
</div>
</body>
</text>
</TEI>"""

def TEI = new XmlSlurper().parseText(xmlText)
def x=1
for (row in TEI.text.body.div.row) {
    if (row.@role != 'regnal' && row.@role != 'header'){
      row.@n = 'r' + x++
    }
}
def outputBuilder = new StreamingMarkupBuilder()
String result = outputBuilder.bind{ mkp.yield TEI }

println XmlUtil.serialize(result)
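A quick way to check whether a GPath expression matches anything is to look at its `size()`. The sketch below uses a hypothetical, namespace-free reduction of the question's XML, in which `table` is a sibling of `div` rather than its child. The variable names `viaDiv` and `viaBody` are made up for the example, and the import assumes Groovy 3 or later.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x drop the import.
import groovy.xml.XmlSlurper

// Reduced stand-in for the question's document: <table> follows <div>.
def xmlText = '''<TEI>
  <text>
    <body>
      <div n="preface" type="prefatory"> </div>
      <table>
        <row role="header"><cell>BC</cell></row>
        <row role="regnal"><cell>Sicyoniorum III, TELCHIN, annis XX.</cell></row>
        <row><cell>1992</cell></row>
      </table>
    </body>
  </text>
</TEI>'''

def TEI = new XmlSlurper().parseText(xmlText)

// A path that matches nothing yields an empty result rather than an error,
// so a loop over it simply never executes.
def viaDiv  = TEI.text.body.div.table.row.size()
def viaBody = TEI.text.body.table.row.size()

println "rows via div.table: ${viaDiv}, rows via body.table: ${viaBody}"
```

If the loop in a script like this never adds an `n` attribute, printing the size of the path is a cheap first check.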

Looking at your code again, I see that at the end you append the data to the end of the file.

jeromeRow << XmlUtil.serialize(result)

Could it be that, for some reason (in code you have not posted), you are appending empty data to a file that is already incorrect?
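To illustrate the difference between appending and overwriting, here is a small sketch using a temporary file as a hypothetical stand-in for `file-row.xml`. In Groovy, `<<` on a `File` appends to whatever is already there, while assigning to `text` replaces the content.

```groovy
// Hypothetical temp file standing in for file-row.xml.
def out = File.createTempFile('file-row', '.xml')
out.deleteOnExit()

out << '<TEI/>'      // leftShift appends to the (empty) file
out << '<TEI/>'      // a second run appends again, leaving two documents in one file
def appended = out.text

out.text = '<TEI/>'  // setText replaces the whole content
def overwritten = out.text

println "appended:    ${appended}"      // prints: appended:    <TEI/><TEI/>
println "overwritten: ${overwritten}"   // prints: overwritten: <TEI/>
```

If the script is run repeatedly against the same output file, writing with `text =` (or deleting the file first) keeps stale output from an earlier run out of the result.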

answered 2016-01-24T09:40:17.727