python - Iron worker and scrapy

Question

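I am trying to create an Iron.io worker that uses scrapy.

According to iron.io, all of the code's dependencies have to be bundled inside the worker itself.

I created a folder named module to hold all third-party modules and installed scrapy into it with pip: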

pip install scrapy -t module/

When I try to run scrapy with python module/scrapy/__init__.py, I get:

Traceback (most recent call last):
  File "module/scrapy/__init__.py", line 10, in <module>
    __version__ = pkgutil.get_data(__package__, 'VERSION').decode('ascii').strip()
  File "/usr/lib/python2.7/pkgutil.py", line 578, in get_data
    loader = get_loader(package)
  File "/usr/lib/python2.7/pkgutil.py", line 464, in get_loader
    return find_loader(fullname)
  File "/usr/lib/python2.7/pkgutil.py", line 474, in find_loader
    for importer in iter_importers(fullname):
  File "/usr/lib/python2.7/pkgutil.py", line 424, in iter_importers
    if fullname.startswith('.'):
AttributeError: 'NoneType' object has no attribute 'startswith'

score 1 · Accepted Answer

If you don't have the scrapy executable available, you can run Scrapy through its cmdline module:

python module/scrapy/cmdline.py

You can also run Scrapy from a script. There is a very detailed answer on that.
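As a rough sketch of the "from a script" route: the traceback in the question shows get_loader being called with None, which is what happens when __init__.py is executed directly and __package__ is never set. Importing scrapy as a package avoids that. The helper names add_vendor_dir and run_scrapy below are made up for illustration, and the code assumes the module/ folder from the question sits in the working directory.

```python
import importlib
import os
import sys


def add_vendor_dir(path):
    """Put a directory of bundled packages at the front of sys.path.

    Hypothetical helper; 'path' would be the "module" folder from the question.
    Returns the absolute path that was added.
    """
    vendor = os.path.abspath(path)
    if vendor not in sys.path:
        sys.path.insert(0, vendor)
    return vendor


def run_scrapy(argv, vendor_dir="module"):
    """Run a Scrapy command in-process, e.g. run_scrapy(["version"]).

    Assumes scrapy was installed with: pip install scrapy -t module/
    scrapy.cmdline.execute normally ends the process via sys.exit.
    """
    add_vendor_dir(vendor_dir)
    # Imported as a package, so scrapy's __package__ is set correctly.
    cmdline = importlib.import_module("scrapy.cmdline")
    cmdline.execute(["scrapy"] + list(argv))
```

From a worker script this would be called as run_scrapy(["version"]) or with whatever crawl command the project defines. Because the import happens inside the function, nothing touches scrapy until the bundled folder is on the path.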

score 0 · Accepted Answer

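You are probably better off using Scrapy from within your IronWorker code rather than invoking it from the command line, as shown on the front page at http://scrapy.org/ or in the tutorial: http://doc.scrapy.org/en/0.24/intro/tutorial.html

To use it in IronWorker, after doing the pip install, be sure to add: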

pip 'scrapy'

to your .worker file. Then import it in your worker script:

import scrapy

Then use it as described in the tutorial linked above.
