python - Check whether a value exists

Question

I'm new to Python and I'm writing a web scraper that looks up <td> cells in the rows of an HTML table:

# open CSV with URLS to scrape
csv_file = csv.reader(open('urls.csv', 'rb'), delimiter=',')

names = []
for data in csv_file:
    names.append(data[0])

for name in names:
   html = D.get(name);
   html2 = html
   param = '<br />';
   html2 = html2.replace("<br />", " | ")
   print name

   c = csv.writer(open("darkgrey.csv", "a"))
   for row in xpath.search(html2, '//table/tr[@class="bgdarkgrey"]'):
       cols = xpath.search(row, '/td')
       c.writerow([cols[0], cols[1], cols[2], cols[3], cols[4]])

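All it does is grab the '<td>' values from 4 tables.

The problem is that some tables don't have cols[2], cols[3], or cols[4].

Is there a way I can check whether these exist?

Thanks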

score 2 · Accepted Answer

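I'm not entirely familiar with xpath, but you should be able to check the length of cols (as long as it isn't some really weird object that otherwise looks like a sequence):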

if len(cols) >= 5:
    ...

Another common Python idiom is to just try it and see.

try:
    c.writerow([cols[0], cols[1], cols[2], cols[3], cols[4]])
except IndexError:
    # Failed because `cols` isn't long enough. Do something else.
    pass
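
A minimal sketch of the try/except idiom, with plain lists standing in for the xpath results. The helper name `first_five` is hypothetical, and returning None for a short row is just one possible choice of "something else":

```python
def first_five(cols):
    """Return the first five cells, or None if the row is too short."""
    try:
        return [cols[0], cols[1], cols[2], cols[3], cols[4]]
    except IndexError:
        # cols has fewer than five items; let the caller decide what to do.
        return None


complete = first_five(['a', 'b', 'c', 'd', 'e', 'f'])  # long enough
short = first_five(['a', 'b'])                         # too short
```

The caller can then skip rows for which the helper returns None, or handle them separately.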

Finally, assuming cols is a list, you can always make sure it is long enough:

cols.extend(['']*5)

This pads your columns with empty strings so that you have at least 5 columns (usually more).
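
If you want exactly 5 columns rather than "at least 5", one option is to pad and then slice. This is only a sketch: `pad_row` and its `width` and `fill` parameters are hypothetical names, and it assumes cols can be converted to a list.

```python
def pad_row(cols, width=5, fill=''):
    """Return exactly `width` cells, padding short rows with `fill`."""
    # Build a new list so the original `cols` is left untouched.
    padded = list(cols) + [fill] * width
    # Slicing trims both the extra padding and any surplus columns.
    return padded[:width]


row = pad_row(['a', 'b'])
```

The result could then be passed straight to the writer, for example `c.writerow(pad_row(cols))`.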

score 0 · Accepted Answer


c.writerow([cols[x] for x in range(0, len(cols))])

Also, don't forget to close the "darkgrey.csv" file!
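
One way to make sure the file gets closed is a with block, opened once outside the loop. The sketch below is written for Python 3 and uses made-up rows and a temporary path in place of the real "darkgrey.csv", so those details are assumptions for illustration only:

```python
import csv
import os
import tempfile

rows = [['a', 'b'], ['c', 'd', 'e']]  # made-up rows for illustration

# A temporary path stands in for "darkgrey.csv" so the sketch has no side effects.
path = os.path.join(tempfile.mkdtemp(), 'darkgrey.csv')

# The with block closes the file on exit, even if writing raises an exception.
# Python 3 wants newline='' for csv files; Python 2 would use open(path, 'ab').
with open(path, 'a', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)

# Read the file back to confirm what was written.
with open(path, newline='') as check:
    written = list(csv.reader(check))
```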

Answered 2013-02-05T15:50:13.970

score 0 · Accepted Answer

Another possible approach:

c.writerow([cols[0], cols[1], cols[2] if len(cols) > 2 else '', cols[3] if len(cols) > 3 else '', cols[4] if len(cols) > 4 else ''])

score 0 · Accepted Answer


Maybe you meant cols = xpath.search(row, 'td') rather than cols = xpath.search(row, '/td')? (without the slash)

Answered 2013-02-05T16:08:13.750
