python - Extracting values with Python ElementTree is not working

Question

I am trying to extract the <paste_key> values using ElementTree, but I get the error below. Can anyone help me see what I am doing wrong?

from pastebin import PastebinAPI
from xml.etree import cElementTree as ET
import time

x = self.apiobject.pastes_by_user(api_dev_key=self.DEVKEY, api_user_key=self.userkey)
print x
x = ET.fromstring(x)


for key in list(x):
  self.pastekeys.append(key.find('paste_key').text)
print self.pastekeys

Error output: junk after document element: line 13, column 0

Sample of the data that ends up in x:

<paste>
<paste_key>afafafaf</paste_key>
<paste_date>1508796842</paste_date>
<paste_title>1508796842</paste_title>
<paste_size>36096</paste_size>
<paste_expire_date>0</paste_expire_date>
<paste_private>2</paste_private>
<paste_format_long>None</paste_format_long>
<paste_format_short>text</paste_format_short>
<paste_url>https://pastebin.com/afafafaf</paste_url>
<paste_hits>0</paste_hits>
</paste>
<paste>
<paste_key>asdfasdf</paste_key>
<paste_date>1508796842</paste_date>
<paste_title>1508796842</paste_title>
<paste_size>36096</paste_size>
<paste_expire_date>0</paste_expire_date>
<paste_private>2</paste_private>
<paste_format_long>None</paste_format_long>
<paste_format_short>text</paste_format_short>
<paste_url>https://pastebin.com/asdfasdf</paste_url>
<paste_hits>0</paste_hits>
</paste>
...

score 1 · Accepted Answer

If the problem is the XML structure, try BeautifulSoup.

If your pastes are in a string named pastebin_string, it would look like this:

from bs4 import BeautifulSoup

soup = BeautifulSoup(pastebin_string, "html.parser")
pastes = soup.find_all("paste")
for paste in pastes:
    key = paste.find("paste_key")
    print(key.text)
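
For context on why ElementTree rejects the response: an XML document may have only one top-level element, and the sample data has several sibling <paste> elements, so the parser stops at the second one. The sketch below reproduces that with a shortened, made-up stand-in for the API response (two_roots and try_parse are illustrative names, not part of the original code):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the API response: two top-level <paste> elements.
two_roots = (
    "<paste><paste_key>afafafaf</paste_key></paste>\n"
    "<paste><paste_key>asdfasdf</paste_key></paste>"
)


def try_parse(text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# The second <paste> is reported as junk after the first (and only) root.
error_message = try_parse(two_roots)
print(error_message)
```

BeautifulSoup's "html.parser" is lenient about this, which is why it accepts the same string without complaint.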

score 0 · Accepted Answer

The following worked for me. Thanks to @john-gordon for pointing this out.

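        x = self.apiobject.pastes_by_user(api_dev_key=self.DEVKEY, api_user_key=self.userkey)

        # Split the response into one chunk per paste and restore the
        # closing tag that split() removed.
        x = x.split("</paste>")
        x = [y + "</paste>\r\n" for y in x]

        # The last chunk is whatever follows the final </paste>, so skip it.
        for key in x[:-1]:
            paste = ET.fromstring(key)
            self.pastekeys.append(paste.find('paste_key').text)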
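
An alternative to splitting the string, offered only as a sketch: wrap the whole response in a single made-up root element so that ElementTree sees one well-formed document. The <pastes> wrapper and the extract_paste_keys helper are my own names, not part of the Pastebin API, and this assumes the response is a text string with no XML declaration at the top.

```python
import xml.etree.ElementTree as ET


def extract_paste_keys(response_text):
    """Collect every <paste_key> value from a run of sibling <paste> elements."""
    # Assumption: response_text has no <?xml ...?> declaration, so it can be
    # wrapped in a synthetic root element (the name "pastes" is arbitrary).
    wrapped = "<pastes>" + response_text + "</pastes>"
    root = ET.fromstring(wrapped)
    return [paste.findtext("paste_key") for paste in root.findall("paste")]


# Shortened sample in the shape shown in the question.
sample = """<paste>
<paste_key>afafafaf</paste_key>
<paste_date>1508796842</paste_date>
</paste>
<paste>
<paste_key>asdfasdf</paste_key>
<paste_date>1508796842</paste_date>
</paste>"""

paste_keys = extract_paste_keys(sample)
print(paste_keys)
```

This parses the response once instead of once per paste, and it does not depend on the exact text that follows the last </paste>.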
