python - How do I use scrapy shell with a URL that has parameters?

Question

I want to scrape a job-listing website, and I want to run some tests in the scrapy shell first.

So if I type this:

scrapy shell http://www.seek.com.au

and then type

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

it works fine.

But if I do this:

scrapy shell http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000

and then type

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

it says that from is not a valid bash command, drops out of scrapy, and the scrapy job shows up on screen as a stopped job:

>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
-bash: from: command not found

[5]+  Stopped                 scrapy shell http://www.seek.com.au/JobSearch?DateRange=31
[7]   Done                    Keywords=php

score 9 · Accepted Answer

Apparently, you need to wrap your URL in double quotes:

scrapy shell "http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000"
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
>>> lx = SgmlLinkExtractor()

Then everything works (the above is my actual shell output).
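If the command line is assembled from a script rather than typed by hand, the quoting can be delegated to Python's standard library. The following is only a minimal sketch: the helper name scrapy_shell_command is made up for illustration, and shlex.quote produces single quotes rather than the double quotes used above. For this URL either style should be read by bash as a single argument.

```python
import shlex


def scrapy_shell_command(url):
    """Return a `scrapy shell` command line with the URL quoted for a POSIX shell.

    Hypothetical helper for illustration; it is not part of Scrapy.
    """
    # shlex.quote wraps the string in single quotes when it contains
    # characters the shell would otherwise interpret, such as & and ?.
    return "scrapy shell " + shlex.quote(url)


url = ("http://www.seek.com.au/JobSearch"
       "?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000")
command = scrapy_shell_command(url)
print(command)

# Round trip: splitting the command the way a POSIX shell would
# gives the whole URL back as one argument.
assert shlex.split(command) == ["scrapy", "shell", url]
```

The printed line can be pasted into a terminal as-is.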

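I tried it without the double quotes and it does not work: the fetch keeps running in the background, and the first keypress goes to bash rather than to scrapy even though nothing visibly changes on screen, which gives me the same error.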
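The likely reason is that bash treats an unquoted & as the operator that ends a command and runs it in the background. The command line is therefore cut at each &: scrapy shell only receives the URL up to DateRange=31 and runs as a background job, and the remaining pieces such as Keywords=... are executed as separate variable assignments, which matches the job listing in the question. A background job that tries to read from the terminal gets stopped, so the from line typed next is read by bash, not by scrapy. The sketch below approximates that splitting with Python's shlex module. It is not bash itself, the function name shell_like_tokens is made up for illustration, and it assumes Python 3.8 or later, where punctuation_chars and whitespace_split can be combined.

```python
import shlex


def shell_like_tokens(command_line):
    """Split a command line roughly the way a POSIX shell tokenizes it.

    Hypothetical helper for illustration; an approximation, not bash's parser.
    """
    # punctuation_chars=True makes shlex return shell operators such as &
    # as separate tokens instead of leaving them inside words.
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    # whitespace_split=True keeps characters like ':' inside a word
    # (assumes Python 3.8+, where it works together with punctuation_chars).
    lexer.whitespace_split = True
    return list(lexer)


url = ("http://www.seek.com.au/JobSearch"
       "?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000")

unquoted = shell_like_tokens("scrapy shell " + url)
quoted = shell_like_tokens('scrapy shell "' + url + '"')

# Unquoted: the URL is broken apart at every &.
print(unquoted)
# Quoted: the URL survives as a single argument.
print(quoted)
```

In the unquoted case the third token stops at DateRange=31, the same truncated URL that appears in the Stopped job line of the question.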
