python - How to encode multiple items in a list when writing items to a CSV file in Scrapy

Question

I am new to Scrapy. I scraped a site, extracted all the required items, and now need to write them to a CSV file.

My pipeline.py code is

import csv

class example2Pipeline(object):

    def __init__(self):
        self.brandCategoryCsv = csv.writer(open('example.csv', 'wb'))
        self.brandCategoryCsv.writerow(['book_name','dimensions'])

    def process_item(self, item, spider):
        self.brandCategoryCsv.writerow([item['book_name'].encode('utf-8'),
                                    item['dimensions'].encode('utf-8'),
                                    ])
        return item

The results of the XPath code in spider.py for the items above are

book_name = i.select('div[@class="slickwrap full"]/div[@id="bookstore_detail"]/div[@class="book_listing clearfix"]/div[@class="bookstore_right"]/div[@class="title_and_byline"]/p[@class="book_title"]/text()').extract()
Result : [u'Rahul']

dimensions = i.select('div[@class="slickwrap full"]/div[@id="bookstore_detail"]/div[@id="main_tab_group"]/div[@class="panes slickshadow"]/div[@class="pane clearfix"]/div[@class="clearfix"]/div[@class="about_author"]/div[@id="book_stats"]/p/a/text()')[0:2].extract()
Result: [u'Pocket',u'Science Fiction &amp; Fantasy',u' 26 pgs']

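As you can see above, the book_name item has only one element in its list, so if we use book_name[0] the string can be encoded by the code I wrote in pipeline.py.

But the dimensions item has multiple strings in its list, so when I run the pipeline.py code above I get the following error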

exceptions.AttributeError: 'list' object has no attribute 'encode'

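That is, a list cannot be encoded, and I have not been able to encode the individual elements of the list in pipeline.py.

I would also like to write one item per column in each row of the CSV file, for example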

book_name  |   dimensions

Pocket         Science Fiction &amp; Fantasy,  26 pgs

If you need other code from my spider file, I will paste it here.

Any help would be appreciated. Thanks in advance.

score 1 · Accepted Answer

If the list contains only unicode strings, try " ".join(somelist), and then encode the result or call str on it.
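
A minimal sketch of the join approach, written for Python 3 (an assumption; the question's code targets Python 2, where the same join works on unicode strings). The sample list is modeled on the question's output with the entity written as a plain ampersand, and the name join_strings is hypothetical.

```python
def join_strings(values, separator=" "):
    """Collapse a list of strings into one string so it can be encoded once."""
    return separator.join(values)


# Sample data modeled on the dimensions result shown in the question.
dimensions = ["Pocket", "Science Fiction & Fantasy", " 26 pgs"]

joined = join_strings(dimensions)
encoded = joined.encode("utf-8")  # a single bytes object, not a list
```

Once the list is a single string, the encode call from the original pipeline no longer fails, because it is applied to a string rather than to a list.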

score 0 · Accepted Answer

Try the following:

item['book_name'].encode('utf-8')  # make sure item['book_name'] is a str/unicode object: those have an encode method, a list does not

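For newline handling, you can try the line below. It applies to Python 3 only, where the file has to be opened in text mode because binary mode ('wb') does not accept a newline argument; Python 2's built-in open has no newline parameter at all.

self.brandCategoryCsv = csv.writer(open('example.csv', 'w', newline=''))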

To encode every element of a list, use the code below.

[i.encode('utf-8') for i in item['dimensions']]
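
The comprehension above yields a list of encoded strings, which still has to be reduced to one value before it can fill a single CSV cell. A small sketch under Python 3 (an assumption), where the helper encode_all is a hypothetical wrapper around the comprehension and the ", " separator is an arbitrary choice:

```python
def encode_all(values, encoding="utf-8"):
    """Encode each string in a list individually, returning a list of bytes."""
    return [value.encode(encoding) for value in values]


# Sample data modeled on the dimensions result shown in the question.
dimensions = ["Pocket", "Science Fiction & Fantasy", " 26 pgs"]

encoded_parts = encode_all(dimensions)
# Join the encoded pieces so the whole list fits in a single CSV cell.
cell = b", ".join(encoded_parts)
```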

score 0 · Accepted Answer

Try Python's map function

def process_item(self, item, spider):
    # In Python 2, map returns a list, so each cell is written as that list's string form.
    self.brandCategoryCsv.writerow([map(lambda x: x.encode('utf-8'), item['book_name']),
                                    map(lambda x: x.encode('utf-8'), item['dimensions']),
                                    ])
    return item
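
Putting the answers together, here is a hedged sketch of a pipeline that writes one row per item with each list flattened into a single cell. It assumes Python 3, where the csv module works on text and the encoding is set when the file is opened, so no explicit encode calls are needed. The path parameter and the _flatten helper are hypothetical names introduced for illustration; open_spider, close_spider and process_item are the standard Scrapy pipeline hooks.

```python
import csv


class Example2Pipeline:
    """Sketch of an item pipeline that writes one CSV row per scraped item."""

    def __init__(self, path="example.csv"):
        self.path = path  # hypothetical parameter; defaults to the question's file name

    def open_spider(self, spider):
        # Text mode with newline="" is what the csv module expects on Python 3.
        self.file = open(self.path, "w", newline="", encoding="utf-8")
        self.writer = csv.writer(self.file)
        self.writer.writerow(["book_name", "dimensions"])

    def close_spider(self, spider):
        self.file.close()

    @staticmethod
    def _flatten(value):
        # extract() returns a list of strings; join it so it fills one cell.
        if isinstance(value, (list, tuple)):
            return ", ".join(part.strip() for part in value)
        return value

    def process_item(self, item, spider):
        self.writer.writerow([
            self._flatten(item["book_name"]),
            self._flatten(item["dimensions"]),
        ])
        return item
```

Opening the file in open_spider and closing it in close_spider, rather than in __init__, ensures the file handle is released when the crawl ends.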
