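I have built complex items in which a field can be a list of other item types. When I export them with the default XmlItemExporter, the sub-list items are wrapped in <value> tags. I am looking for an example of how to assign sub-item identifiers to these <value> tags.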
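The Item Exporters page of the documentation explains the behavior with this sentence:
Unless overridden in the serialize_field() method, multi-valued fields are exported by serializing each value inside a <value> element. This is for convenience, as multi-valued fields are very common.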
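The documentation page also gives simple examples of declaring a serializer in a field and of overriding the serialize_field() method, but both apply to single-valued fields, and neither suggests how to customize them for multi-valued fields.
I searched online for an example of how this is done, but I did not find one.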
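To make the quoted rule concrete, here is a minimal standalone sketch of the wrapping behavior it describes. It uses only the standard library and is not Scrapy's actual implementation; the function name `to_element` and the plain-dict input are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def to_element(name, value):
    """Build one XML element for a field, mimicking the documented rule."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        # Nested item: one child element per field.
        for key, sub_value in value.items():
            elem.append(to_element(key, sub_value))
    elif isinstance(value, (list, tuple)):
        # Multi-valued field: every entry is wrapped in a <value> element.
        for entry in value:
            elem.append(to_element('value', entry))
    else:
        # Single value: store it as the element text.
        elem.text = str(value)
    return elem


# A plain dict standing in for a nested item (illustrative data only).
lesson = {'session': 'Week 1', 'assignment': [{'pages': '1-20'}]}
default_xml = ET.tostring(to_element('lesson', lesson), encoding='unicode')
print(default_xml)
```

The list entry ends up inside a generic `<value>` element because nothing in the data says what the entry should be called, which is the gap this question is about.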
Here is the sample item tree I used for testing:
import scrapy


class Course(scrapy.Item):
    title = scrapy.Field()
    lessons = scrapy.Field()      # list of Lesson items


class Lesson(scrapy.Item):
    session = scrapy.Field()
    topic = scrapy.Field()
    assignment = scrapy.Field()   # list of ReadingAssignment items


class ReadingAssignment(scrapy.Item):
    textBook = scrapy.Field()
    pages = scrapy.Field()


course = Course()
course['title'] = 'Greatness'
course['lessons'] = []
lesson = Lesson()
lesson['session'] = 'Week 1'
lesson['topic'] = 'Think Great'
lesson['assignment'] = []
reading = ReadingAssignment()
reading['textBook'] = 'Great Book 1'
reading['pages'] = '1-20'
lesson['assignment'].append(reading)
course['lessons'].append(lesson)
lesson = Lesson()
lesson['session'] = 'Week 2'
lesson['topic'] = 'Act Great'
lesson['assignment'] = []
reading = ReadingAssignment()
reading['textBook'] = 'Great Book 2'
reading['pages'] = '21-40'
lesson['assignment'].append(reading)
course['lessons'].append(lesson)
lesson = Lesson()
lesson['session'] = 'Week 3'
lesson['topic'] = 'Look Great'
lesson['assignment'] = []
reading = ReadingAssignment()
reading['textBook'] = 'Great Book 3'
reading['pages'] = '41-60'
lesson['assignment'].append(reading)
course['lessons'].append(lesson)
lesson = Lesson()
lesson['session'] = 'Week 4'
lesson['topic'] = 'Be Great'
lesson['assignment'] = []
reading = ReadingAssignment()
reading['textBook'] = 'Great Book 4'
reading['pages'] = '61-80'
lesson['assignment'].append(reading)
course['lessons'].append(lesson)
Output:
>>> course
{'lessons': [{'assignment': [{'pages': '1-20', 'textBook': 'Great Book 1'}],
'session': 'Week 1',
'topic': 'Think Great'},
{'assignment': [{'pages': '21-40', 'textBook': 'Great Book 2'}],
'session': 'Week 2',
'topic': 'Act Great'},
{'assignment': [{'pages': '41-60', 'textBook': 'Great Book 3'}],
'session': 'Week 3',
'topic': 'Look Great'},
{'assignment': [{'pages': '61-80', 'textBook': 'Great Book 4'}],
'session': 'Week 4',
'topic': 'Be Great'}],
'title': 'Greatness'}
When I run it through XmlItemExporter, I get:
<?xml version="1.0" encoding="utf-8"?>
<items>
<course>
<title>Greatness</title>
<lessons>
<value>
<session>Week 1</session>
<topic>Think Great</topic>
<assignment>
<value>
<textBook>Great Book 1</textBook>
<pages>1-20</pages>
</value>
</assignment>
</value>
<value>
<session>Week 2</session>
<topic>Act Great</topic>
<assignment>
<value>
<textBook>Great Book 2</textBook>
<pages>21-40</pages>
</value>
</assignment>
</value>
<value>
<session>Week 3</session>
<topic>Look Great</topic>
<assignment>
<value>
<textBook>Great Book 3</textBook>
<pages>41-60</pages>
</value>
</assignment>
</value>
<value>
<session>Week 4</session>
<topic>Be Great</topic>
<assignment>
<value>
<textBook>Great Book 4</textBook>
<pages>61-80</pages>
</value>
</assignment>
</value>
</lessons>
</course>
</items>
What I would like to do is change these <value> tags to the name of the item that was appended to the list. Like this:
<items>
<course>
<title>Greatness</title>
<lessons>
<lesson>
<session>Week 1</session>
<topic>Think Great</topic>
<assignment>
<reading>
<textBook>Great Book 1</textBook>
<pages>1-20</pages>
</reading>
</assignment>
</lesson>
<lesson>
<session>Week 2</session>
<topic>Act Great</topic>
<assignment>
<reading>
<textBook>Great Book 2</textBook>
<pages>21-40</pages>
</reading>
</assignment>
</lesson>
<lesson>
<session>Week 3</session>
<topic>Look Great</topic>
<assignment>
<reading>
<textBook>Great Book 3</textBook>
<pages>41-60</pages>
</reading>
</assignment>
</lesson>
<lesson>
<session>Week 4</session>
<topic>Be Great</topic>
<assignment>
<reading>
<textBook>Great Book 4</textBook>
<pages>61-80</pages>
</reading>
</assignment>
</lesson>
</lessons>
</course>
</items>
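For reference, the renaming I am after can be expressed as a post-processing step on the exported XML. The following is only a sketch of that workaround using the standard library, not a feature of XmlItemExporter; the `CHILD_TAGS` mapping and the `rename_value_tags` helper are hypothetical names introduced here, and the input string is a shortened copy of the export shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: tag of the list field -> tag wanted for each entry.
CHILD_TAGS = {'lessons': 'lesson', 'assignment': 'reading'}


def rename_value_tags(xml_text, child_tags=CHILD_TAGS):
    """Rename <value> children of known list fields; leave the rest alone."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        new_tag = child_tags.get(parent.tag)
        if new_tag is None:
            continue  # not a list field we know about
        for child in parent:
            if child.tag == 'value':
                child.tag = new_tag
    return ET.tostring(root, encoding='unicode')


# Shortened version of the default export (one lesson only).
exported = (
    '<items><course><title>Greatness</title><lessons><value>'
    '<session>Week 1</session><topic>Think Great</topic>'
    '<assignment><value><textBook>Great Book 1</textBook>'
    '<pages>1-20</pages></value></assignment>'
    '</value></lessons></course></items>'
)
renamed = rename_value_tags(exported)
print(renamed)
```

This works on the finished file, so it requires a second pass over the output. What I would prefer is a way to get the same result directly from the exporter.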