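We have been using ckanext-dcat to harvest from remote JSON sources. Occasionally a harvest job fails to finish, and we have to delete it along with every dataset from that source. That is inconvenient, although everything returns to normal afterwards. I don't know whether there is a way to delete just a single job.
But now I am getting this in the gather consumer log: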
Traceback (most recent call last):
  File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 129, in command
    gather_callback(consumer, method, header, body)
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 219, in gather_callback
    harvest_object_ids = harvester.gather_stage(job)
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/harvesters.py", line 186, in gather_stage
    content = self._get_content(url, harvest_job, page)
  File "/usr/lib/ckan/default/src/ckanext-dcat/ckanext/dcat/harvesters.py", line 66, in _get_content
    cl = r.headers['content-length']
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/structures.py", line 54, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'
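The job finishes but no datasets are created. If I delete the job and harvest again, it keeps running but never ends, and the other harvest jobs are not updated either.
How can I fix this?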
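
Looking at the traceback, the failing line reads `r.headers['content-length']` directly, so I suspect the remote server no longer sends that header (for instance if it now uses chunked transfer encoding), though I have not confirmed this. Below is a minimal sketch of the kind of tolerant lookup I have in mind. The helper names `declared_length` and `exceeds_limit` and the `max_bytes` parameter are hypothetical and not part of ckanext-dcat, and a plain dict stands in for the response headers.

```python
def declared_length(headers):
    """Return the declared Content-Length as an int, or None if it is
    missing or not a valid integer.

    `headers` is any mapping with a .get() method. The headers object of a
    requests response is case-insensitive; a plain dict (used here for
    illustration) is not, so the key must be lowercase in that case.
    """
    value = headers.get('content-length')
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def exceeds_limit(headers, max_bytes):
    """Return True only when a length is declared and it is above max_bytes.

    A missing header is treated as "unknown", not as an error, so the caller
    can fall back to checking the size while reading the body.
    """
    length = declared_length(headers)
    return length is not None and length > max_bytes


if __name__ == '__main__':
    print(declared_length({'content-length': '120'}))
    print(declared_length({}))
    print(exceeds_limit({}, 100))
```

With a lookup like this, a response without `Content-Length` would no longer raise `KeyError` in the gather stage. Whether the harvester should then accept such a response or limit the size some other way is a separate decision.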