I have to compare two XML files using Python. Each contains a list of items, and I have to output the items that do not appear in both. Each item has several properties, all of which must match for two items to count as the same.

Which parser would be the most suitable? It has to be included with Python 2.7 already. I looked at etree, but does it let me do what I want easily, or is there something else that would be more suitable? Thanks!

2 Answers

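It depends. If your XML elements have child elements that also need to be compared, use DOM.

If the elements you want to compare only have attributes, SAX is the best approach. Here is some SAX code you can use as a reference: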

import xml.sax
from xml.sax.handler import ContentHandler

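class TableHandler(ContentHandler):
    """Collects the attribute values of every <R> element, grouped by attribute name."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.columns = {}

    def startElement(self, name, attrs):
        # Called by the parser for each opening tag.
        if name == 'R':
            for k, v in attrs.items():
                # setdefault replaces the deprecated dict.has_key() check
                self.columns.setdefault(k, []).append(v)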

def xml_to_table(xml_str):
    handler = TableHandler()
    xml.sax.parseString(xml_str, handler)
    return handler.columns

if __name__ == '__main__':
    txt = """<xml>
    <R CatalogId="8"/><R CatalogId="8"/><R CatalogId="7"/>
    </xml>
    """

    columns = xml_to_table(txt)
    # Parenthesized form works in Python 2.7 and in Python 3
    print(columns)
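
To get from this handler to the comparison the question asks for, one option is to store each element's attributes as a hashable value and take a set difference. This is only a rough sketch: ItemHandler, load_items and unmatched_items are names made up for this example, and it assumes every item is an <R> element whose identity is fully described by its attributes.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ItemHandler(ContentHandler):
    """Hypothetical handler: stores each <R> element's attributes as a frozenset."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.items = set()

    def startElement(self, name, attrs):
        if name == 'R':
            # A frozenset of (name, value) pairs is hashable and ignores attribute order.
            self.items.add(frozenset(attrs.items()))


def load_items(xml_bytes):
    """Parse one document and return the set of items found in it."""
    handler = ItemHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.items


def unmatched_items(xml_a, xml_b):
    """Return the items that appear in only one of the two documents."""
    return load_items(xml_a) ^ load_items(xml_b)
```

Because the items live in a set, duplicates inside one file collapse into a single entry. If the number of occurrences matters, collections.Counter would be a closer fit than set.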
answered 2012-07-04T09:45:21.667

You can use lxml. Iterate over the items in the first file and check whether each one is present in the second file with xml.find(".//itemname").
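
Since the question asks for something that ships with Python 2.7, note that lxml is a separate install, while the standard library's xml.etree.ElementTree offers a similar find()/iter() interface. Below is a minimal sketch of the comparison using it. The tag name 'item' and the helpers item_key, load_keys and compare are assumptions for illustration, and it treats an item's attributes plus the text of its direct children as its identity.

```python
import xml.etree.ElementTree as ET


def item_key(item):
    """Build a hashable key from an item's attributes and its direct children's text."""
    attrs = tuple(sorted(item.attrib.items()))
    children = tuple(sorted(
        (child.tag, (child.text or '').strip()) for child in item
    ))
    return (attrs, children)


def load_keys(xml_text, tag='item'):
    """Parse a document and return the set of keys for every element named `tag`."""
    root = ET.fromstring(xml_text)
    return set(item_key(el) for el in root.iter(tag))


def compare(xml_a, xml_b, tag='item'):
    """Return (keys only in A, keys only in B)."""
    keys_a = load_keys(xml_a, tag)
    keys_b = load_keys(xml_b, tag)
    return keys_a - keys_b, keys_b - keys_a
```

Sorting the attributes and children makes the comparison independent of their order in the file. Deeper nesting would need a recursive key.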

answered 2012-07-04T09:45:29.467