I'm having trouble looping over a list that I load with pickle. The end goal of this code is to iterate over each item and return each item's ID number.
## Opening the file, and loading it into a list##
with open('TEMP_ITEMS.txt', 'rb') as openfile:
    items = pickle.load(openfile)
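My attempt to loop through the list and find the ID numbers is based on some old XML scraping techniques, but for some reason that logic doesn't work here.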
for item in enumerate(items):
    pattern0 = re.compile('ID: (.*?) <br>')
    idnumber = float(re.findall(pattern0, items[0])[0])
    print "ID Number: ",idnumber
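One likely reason the loop above seems stuck is that it reads `items[0]` on every pass instead of the loop variable, so the first item's ID would be reported each time. The following is only a minimal sketch of that behaviour; the two shortened strings are hypothetical stand-ins for the unpickled list, not the real file contents.

```python
import re

# Hypothetical stand-in for the unpickled list (two shortened feed items).
items = [
    "Timmy <br>\n ID: 3712 <br>\n Age: 10 <br>",
    "George <br>\n ID: 4124 <br>\n Age: 14 <br>",
]

pattern0 = re.compile('ID: (.*?) <br>')

# Same shape as the loop in the question: items[0] is read on every iteration,
# and the (index, value) tuple produced by enumerate() is never used.
repeated = [float(re.findall(pattern0, items[0])[0]) for item in enumerate(items)]

# Iterating over the list directly and using the loop variable
# visits each string exactly once.
per_item = [float(re.findall(pattern0, item)[0]) for item in items]

print(repeated)
print(per_item)
```

Under these assumptions, `repeated` holds the first ID twice, while `per_item` holds one ID per item.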
Sample contents of TEMP_ITEMS.txt:
(lp0
S'\n <item>\n <title>Timmy</title>\n <link>caturl</link>\n <description><![CDATA[\n Timmy <br>\n ID: 3712 <br>\n Age: 10 <br>\n Weight: 7lbs <br>\n Time: 17:23 <br>\n Cat Name: Timmy <br>\n\n ]]></description>\n <guid isPermaLink="false">04e72b29-065d-4893-a4d2-f16ff30a283e</guid>\n <pubDate>Fri, 21 Jun 2013 01:09:05 GMT</pubDate>\n </item>'
p1
aS'\n <item>\n <title>George</title>\n <link>caturl</link>\n <description><![CDATA[\n George <br>\n ID: 4124 <br>\n Age: 14 <br>\n Weight: 8lbs <br>\n Time: 15:41 <br>\n Cat Name: George <br>\n\n ]]></description>\n <guid isPermaLink="false">212f9fbf-564b-470a-a64a-ef51036ff06a</guid>\n <pubDate>Fri, 21 Jun 2013 01:28:20 GMT</pubDate>\n </item>'
p2
a.
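Any help or advice on this problem would be greatly appreciated. Kind regards, AEA
The code I used following falsetru's suggestion, which returns an error: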
import pickle
import re

with open('TEMP_RSS_ITEMS.txt', 'rb') as temp_rss_items_open4:
    items = pickle.load(temp_rss_items_open4)
print items
for item in enumerate(items):
    pattern0 = re.compile('ID: (.*) <br>')
    for idnumber in re.findall(pattern0, item):
        print idnumber
The error it produces:
Traceback (most recent call last):
  File "C:/Sharing/test1.py", line 9, in <module>
    for idnumber in re.findall(pattern0, item):
  File "C:\Python27\lib\re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
>>>
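The traceback is consistent with `enumerate(items)` yielding `(index, value)` tuples: `item` is then a tuple rather than a string, and `re.findall` rejects it with a `TypeError`. Below is a hedged sketch of one way around it, not a definitive fix. The sample strings, the `extract_ids` helper, and the `ID_PATTERN` name are all hypothetical, and the data is round-tripped through pickle in memory only to mimic the file load.

```python
import pickle
import re

# Hypothetical sample data, shortened from the feed items in the question.
sample = [
    "\n <item>\n <title>Timmy</title>\n ID: 3712 <br>\n Age: 10 <br>\n </item>",
    "\n <item>\n <title>George</title>\n ID: 4124 <br>\n Age: 14 <br>\n </item>",
]

# Round-trip through pickle in memory to stand in for pickle.load(openfile).
items = pickle.loads(pickle.dumps(sample, protocol=0))

# Assumed pattern: digits after "ID:" and before the following <br>.
ID_PATTERN = re.compile(r'ID:\s*(\d+)\s*<br>')


def extract_ids(items):
    """Return the first ID found in each item string, as ints."""
    ids = []
    # Iterate over the strings directly. If the position is also needed,
    # unpack the pair instead: for index, item in enumerate(items).
    for item in items:
        match = ID_PATTERN.search(item)
        if match:
            ids.append(int(match.group(1)))
    return ids


print(extract_ids(items))
```

Compiling the pattern once outside the loop avoids rebuilding it for every item, and `int` is used here on the assumption that the IDs are whole numbers; `float` would work the same way if that is what the rest of the code expects.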