I sent a POST request in Python based on this answer on SO. The website responds with an XML representation that looks like this:

<status>Active</status>
<registeredname>MyTestName</registeredname>
<companyname>TEST</companyname>
<email>mytestemail@gmail.com</email>
<serviceid>8</serviceid>
<productid>1</productid>
<productname>Some Test Product</productname>
<regdate>2013-08-06</regdate>
<nextduedate>0000-00-00</nextduedate>
<billingcycle>One Time</billingcycle>
<validdomain>testing</validdomain>
<validip>XX.XX.XXX.XX</validip>
<validdirectory>/root</validdirectory>
<configoptions></configoptions>
<customfields></customfields>
<addons></addons>
<md5hash>58z9f70a9d738a98b18d0bf4304ac0c6</md5hash>

Now I would like to convert this into a Python dictionary of the form:

{"status": "Active", "registeredname": "MyTestName".......}

The PHP code I am porting from does something like this:

preg_match_all('/<(.*?)>([^<]+)<\/\\1>/i', $data, $matches);

My corresponding Python code is as follows:

matches = {}
matches = re.findall('/<(.*?)>([^<]+)<\/\\1>/i', data)

'data' is the XML representation that I receive from the server. When I run this, my 'matches' dictionary remains empty. Is there something wrong in the regex statement? Or am I wrong in using re.findall in the first place?

Thanks in advance

1 Answer

Remove the leading and trailing / from the regular expression. There is no need to escape /. Specify flags=re.IGNORECASE instead of the trailing i.

matches = re.findall('<(.*?)>([^<]+)</\\1>', data, flags=re.IGNORECASE)
print(dict(matches))
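
To illustrate why the original call returned nothing, here is a small sketch. The two-element `data` string is a shortened stand-in for the real server response, and the variable names are only for illustration. In PHP the outer `/` characters are pattern delimiters and `i` is a modifier; Python's `re` has no delimiters, so they are matched as literal text.

```python
import re

# Shortened stand-in for the server response (assumption: same shape as the real data)
data = "<status>Active</status>\n<serviceid>8</serviceid>\n"

# PHP-style pattern: Python looks for a literal "/" before "<" and a literal "/i" after ">"
php_style = re.findall(r'/<(.*?)>([^<]+)<\/\1>/i', data)

# Delimiters removed, case-insensitivity passed as a flag
fixed = re.findall(r'<(.*?)>([^<]+)</\1>', data, flags=re.IGNORECASE)

print(php_style)  # empty list: the literal slashes never occur in data
print(fixed)      # list of (tag, text) tuples
```

Note that `re.findall` returns a list of tuples here, not a dictionary, so `dict(...)` is still needed to get the mapping.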

With a raw string, there is no need to escape the backslash:

matches = re.findall(r'<(.*?)>([^<]+)</\1>', data, flags=re.IGNORECASE)
print(dict(matches))

Both versions print:

{'status': 'Active', 'companyname': 'TEST', ...}
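
One thing worth being aware of: because `[^<]+` requires at least one character, empty elements such as `<configoptions></configoptions>` are left out of the result. If they should appear with an empty string, `*` can be used instead of `+`. A minimal sketch, again with a shortened stand-in for `data`:

```python
import re

# Shortened stand-in for the server response, including one empty element
data = (
    "<status>Active</status>\n"
    "<configoptions></configoptions>\n"
    "<md5hash>abc</md5hash>\n"
)

# "+" needs at least one character of text, so the empty element is skipped
with_plus = dict(re.findall(r'<(.*?)>([^<]+)</\1>', data, flags=re.IGNORECASE))

# "*" also accepts empty text, so the empty element maps to ''
with_star = dict(re.findall(r'<(.*?)>([^<]*)</\1>', data, flags=re.IGNORECASE))

print(with_plus)
print(with_star)
```

Whether the empty keys are wanted depends on how the dictionary is used afterwards; the PHP pattern being ported also uses `+`, so keeping `+` matches the original behaviour.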

Non-regex alternative: lxml

I used lxml.html instead of lxml.etree because data is not a complete XML document (it has no single root element).

import lxml.html
print({x.tag:x.text for x in lxml.html.fromstring(data)})
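
If installing lxml is not an option, a similar result can probably be had with the standard library by wrapping the fragment in a made-up root element before parsing. This is only a sketch: the `<root>` wrapper name is arbitrary, and it assumes the response contains no XML declaration and is otherwise well-formed.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the server response
data = "<status>Active</status>\n<configoptions></configoptions>\n"

# Wrap the fragment in a hypothetical <root> element so it parses as one document
root = ET.fromstring("<root>" + data + "</root>")

# Empty elements have text None in ElementTree; map those to '' here
result = {child.tag: (child.text or "") for child in root}
print(result)
```

Unlike lxml.html, ElementTree is strict, so it raises `xml.etree.ElementTree.ParseError` on malformed input instead of trying to recover.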
answered 2013-08-06T09:05:15.633