
Suppose I have an XML file like this (bookstore.xml):

<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web" cover="paperback">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>

I want to remove the book element whose author is J K. Rowling. I know I can get all the author elements like this (Jython):

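from javax.xml.parsers import DocumentBuilderFactory

docFactory = DocumentBuilderFactory.newInstance()
docBuilder = docFactory.newDocumentBuilder()
# The file name must be a string; a bare bookstore.xml would raise a NameError.
doc = docBuilder.parse("bookstore.xml")
list = doc.getElementsByTagName("author")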

I want to write the modified XML tree back to bookstore.xml.

Thanks!


3 Answers


I suggest using ElementTree rather than the org.w3c.dom.* and javax.xml.* Java APIs. The library is supported in Jython and simplifies things considerably.

from xml.etree import ElementTree as ET

root = ET.parse("bookstore.xml").getroot()
books = root.findall("book")

for book in books:
    # findtext() returns the text of the first <author> child only.
    if book.findtext("author") == "J K. Rowling":
        print "Found!"
        root.remove(book)

# Write the modified tree to a new file.
ET.ElementTree(root).write("output.xml")

Tested with Jython 2.5.2 (and CPython 2.7.2).
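
One caveat with the loop above: `findtext("author")` only looks at the first `author` child, so a book that lists J K. Rowling as a second author would be missed. A minimal sketch that checks every author and writes the result back to the input file, as the question asks, might look like the following. The function name `remove_books_by_author` is made up for illustration, and the code is written for CPython 3; it has not been tried under Jython.

```python
from xml.etree import ElementTree as ET


def remove_books_by_author(path, author):
    """Remove every <book> that lists `author`; return how many were removed."""
    tree = ET.parse(path)
    root = tree.getroot()
    removed = 0
    # Copy the matches into a list so removing children cannot disturb the loop.
    for book in list(root.findall("book")):
        # Compare against every <author> child, not just the first one.
        if any(a.text == author for a in book.findall("author")):
            root.remove(book)
            removed += 1
    # Overwrite the input file with the modified tree.
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return removed
```

Calling `remove_books_by_author("bookstore.xml", "J K. Rowling")` on the file from the question should drop the single Harry Potter entry and leave the other three books in place.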

Answered 2013-01-09T20:15:04.460

Here are the steps in Python 2.7. I did not write a full script, because it would depend too heavily on your XML structure.

>>> from xml.dom import minidom
>>> xmldoc = minidom.parse('a.xml')
>>> root = xmldoc.documentElement
>>> nodeList = xmldoc.childNodes
>>> bookstore = nodeList[0].childNodes
>>> bookstore
[<DOM Text node "u'\n'">, <DOM Element: book at 0x2544580>, <DOM Text node "u'\n'">, <DOM Element: book at 0x2544a30>, <DOM Text node "u'\n'">, <DOM Element: book at 0x2544e90>, <DOM Text node "u'\n'">, <DOM Element: book at 0x25475d0>, <DOM Text node "u'\n'">]
>>> bookstore[3].getElementsByTagName("author")[0].childNodes[0].data
u'J K. Rowling'
>>> nodeList[0].removeChild(bookstore[3])
>>> with open('output.xml', 'w') as f:
...     f.write(xmldoc.saveXML(nodeList[0]))
...
>>> 

Result:

<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>

<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web" cover="paperback">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>

I find this DOM module quite complicated to use. It is better to try a different one, such as xml.etree.ElementTree in Python.
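
The session above removes the book by its hard-coded position (`bookstore[3]`), which is why it depends on the exact layout of the file. A rough sketch of the same minidom approach that searches by author text instead is given here. `remove_books_by_author_dom` is a hypothetical helper name, and the code targets CPython 3 rather than the Python 2.7 session above.

```python
from xml.dom import minidom


def remove_books_by_author_dom(xml_text, author):
    """Return the XML text with every <book> that lists `author` removed."""
    doc = minidom.parseString(xml_text)
    # Copy the matches into a plain list before removing nodes from the tree.
    for book in list(doc.getElementsByTagName("book")):
        for node in book.getElementsByTagName("author"):
            text = node.firstChild
            # Only compare when the <author> element starts with a text node.
            if (text is not None and text.nodeType == text.TEXT_NODE
                    and text.data == author):
                book.parentNode.removeChild(book)
                break  # this book is gone, so stop checking its authors
    return doc.documentElement.toxml()
```

As in the result shown earlier, the whitespace text node that followed the removed book stays behind, so the output keeps an empty line where the book used to be.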

Answered 2013-01-09T06:52:32.507

The following works:

# Assumes `list` holds the <book> elements, e.g. doc.getElementsByTagName("book").
# The NodeList is live, so walk it backwards to keep the indices valid after a removal.
for i in range(list.getLength() - 1, -1, -1):
    node = list.item(i)
    if node is not None and node.getNodeName() == "book":
        print "Looking for J K. Rowling in book"
        children = node.getChildNodes()
        for j in range(children.getLength()):
            child = children.item(j)
            if child.getNodeName() == "author" and child.getTextContent() == "J K. Rowling":
                print "************"
                print "Found!!!!!"
                print child.getNodeName()
                print node.getTextContent()
                # Detach the whole <book> from its parent, then stop scanning it.
                node1 = node.getParentNode().removeChild(node)
                break

Answered 2013-01-09T17:50:25.333