11

我正在尝试制作一个 dict 类来处理 xml,但被卡住了,我真的没有想法了。如果有人可以指导这个主题会很棒。

到目前为止开发的代码:

class XMLResponse(dict):
    def __init__(self, xml):
        self.result = True
        self.message = ''
        pass

    def __setattr__(self, name, val):
        self[name] = val

    def __getattr__(self, name):
        if name in self:
            return self[name]
        return None

message="<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"
XMLResponse(message)
4

4 回答 4

21

您可以使用xmltodict模块:

import xmltodict

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
print xmltodict.parse(message)['note']

产生一个OrderedDict

OrderedDict([(u'to', u'Tove'), (u'from', u'Jani'), (u'heading', u'Reminder'), (u'body', u"Don't forget me this weekend!")])

如果顺序无关紧要,可以将其转换为 dict:

print dict(xmltodict.parse(message)['note'])

印刷:

{u'body': u"Don't forget me this weekend!", u'to': u'Tove', u'from': u'Jani', u'heading': u'Reminder'}
于 2013-06-18T19:40:50.553 回答
7

你会认为到现在我们已经对这个问题有了一个很好的答案,但我们显然没有。在查看了关于 stackoverflow 的半打类似问题之后,这对我有用:

from lxml import etree
# arrow is an awesome lib for dealing with dates in python
import arrow


# converts an etree to dict, useful to convert xml to dict
def etree2dict(tree):
    root, contents = recursive_dict(tree)
    return {root: contents}


def recursive_dict(element):
    if element.attrib and 'type' in element.attrib and element.attrib['type'] == "array":
        return element.tag, [(dict(map(recursive_dict, child)) or getElementValue(child)) for child in element]
    else:
        return element.tag, dict(map(recursive_dict, element)) or getElementValue(element)


def getElementValue(element):
    if element.text:
        if element.attrib and 'type' in element.attrib:
            attr_type = element.attrib.get('type')
            if attr_type == 'integer':
                return int(element.text.strip())
            if attr_type == 'float':
                return float(element.text.strip())
            if attr_type == 'boolean':
                return element.text.lower().strip() == 'true'
            if attr_type == 'datetime':
                return arrow.get(element.text.strip()).timestamp
        else:
            return element.text
    elif element.attrib:
        if 'nil' in element.attrib:
            return None
        else:
            return element.attrib
    else:
        return None

这就是你使用它的方式:

from lxml import etree

message="""<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"''
tree = etree.fromstring(message)
etree2dict(tree)

希望能帮助到你 :-)

于 2016-01-27T01:20:01.880 回答
6

你应该结帐

https://github.com/martinblech/xmltodict

我认为这是我见过的 xml 到 dict 的最佳标准处理程序之一。

但是我应该警告你 xml 和 dict 不是绝对兼容的数据结构

于 2013-06-18T19:32:21.133 回答
3

您可以使用lxml 库objectify.fromstring使用然后查找对象 dir 方法将字符串转换为 xml 对象。例如:

from lxml import objectify

xml_string = """<?xml version="1.0" encoding="UTF-8"?><NewOrderResp><IndustryType></IndustryType><MessageType>R</MessageType><MerchantID>700000005894</MerchantID><TerminalID>0031</TerminalID><CardBrand>AMEX</CardBrand><AccountNum>3456732800000010</AccountNum><OrderID>TESTORDER1</OrderID><TxRefNum>55A69B278025130CD36B3A95435AA84DC45363</TxRefNum><TxRefIdx>10</TxRefIdx><ProcStatus>0</ProcStatus><ApprovalStatus>1</ApprovalStatus><RespCode></RespCode><AVSRespCode></AVSRespCode><CVV2RespCode></CVV2RespCode><AuthCode></AuthCode><RecurringAdviceCd></RecurringAdviceCd><CAVVRespCode></CAVVRespCode><StatusMsg></StatusMsg><RespMsg></RespMsg><HostRespCode></HostRespCode><HostAVSRespCode></HostAVSRespCode><HostCVV2RespCode></HostCVV2RespCode><CustomerRefNum>A51C5B2B1811E5991208</CustomerRefNum><CustomerName>BOB STEVEN</CustomerName><ProfileProcStatus>0</ProfileProcStatus><CustomerProfileMessage>Profile Created</CustomerProfileMessage><RespTime>13055</RespTime><PartialAuthOccurred></PartialAuthOccurred><RequestedAmount></RequestedAmount><RedeemedAmount></RedeemedAmount><RemainingBalance></RemainingBalance><CountryFraudFilterStatus></CountryFraudFilterStatus><IsoCountryCode></IsoCountryCode></NewOrderResp>"""

xml_object = objectify.fromstring(xml_string)

print xml_object.__dict__

将 xml 对象转换为 dict 将返回一个 dict:

{'RemainingBalance': u'', 'AVSRespCode': u'', 'RequestedAmount': u'', 'AccountNum': 3456732800000010, 'IsoCountryCode': u'', 'HostCVV2RespCode': u'', 'TerminalID': 31, 'CVV2RespCode': u'', 'RespMsg': u'', 'CardBrand': 'AMEX', 'MerchantID': 700000005894, 'RespCode': u'', 'ProfileProcStatus': 0, 'CustomerName': 'BOB STEVEN', 'PartialAuthOccurred': u'', 'MessageType': 'R', 'ProcStatus': 0, 'TxRefIdx': 10, 'RecurringAdviceCd': u'', 'IndustryType': u'', 'OrderID': 'TESTORDER1', 'StatusMsg': u'', 'ApprovalStatus': 1, 'RedeemedAmount': u'', 'CountryFraudFilterStatus': u'', 'TxRefNum': '55A69B278025130CD36B3A95435AA84DC45363', 'CustomerRefNum': 'A51C5B2B1811E5991208', 'CustomerProfileMessage': 'Profile Created', 'AuthCode': u'', 'RespTime': 13055, 'HostAVSRespCode': u'', 'CAVVRespCode': u'', 'HostRespCode': u''}

我使用的 xml 字符串是 paymentech 支付网关的响应,只是为了展示一个真实的例子。

另请注意,上面的示例不是递归的,因此如果 dicts 中有 dicts,则必须进行一些递归。请参阅我编写的可以使用的递归函数:

from lxml import objectify

def xml_to_dict_recursion(xml_object):
    dict_object = xml_object.__dict__
    if not dict_object:
        return xml_object
    for key, value in dict_object.items():
        dict_object[key] = xml_to_dict_recursion(value)
    return dict_object

def xml_to_dict(xml_str):
    return xml_to_dict_recursion(objectify.fromstring(xml_str))

xml_string = """<?xml version="1.0" encoding="UTF-8"?><Response><NewOrderResp>
<IndustryType>Test</IndustryType><SomeData><SomeNestedData1>1234</SomeNestedData1>
<SomeNestedData2>3455</SomeNestedData2></SomeData></NewOrderResp></Response>"""

print xml_to_dict(xml_string)

这是一个保留父键/元素的变体:

def xml_to_dict(xml_str):
    """ Convert xml to dict, using lxml v3.4.2 xml processing library, see http://lxml.de/ """
    def xml_to_dict_recursion(xml_object):
        dict_object = xml_object.__dict__
        if not dict_object:  # if empty dict returned
            return xml_object
        for key, value in dict_object.items():
            dict_object[key] = xml_to_dict_recursion(value)
        return dict_object
    xml_obj = objectify.fromstring(xml_str)
    return {xml_obj.tag: xml_to_dict_recursion(xml_obj)}

如果您只想返回一个子树并将其转换为 dict,您可以使用Element.find()

xml_obj.find('.//')  # lxml.objectify.ObjectifiedElement instance

有很多选项可以实现这一点,但如果您已经在使用 lxml,这个选项非常棒。在这个例子中使用了 lxml-3.4.2。干杯!

于 2015-07-15T18:56:30.320 回答