The following wget command does a good job of recursively crawling the full domain, saving the downloaded files in a single folder, and then deleting them all:

wget --delete-after -r -nd http://www.example.com/

When run from the command line, this works perfectly. When run via PHP's exec (or system, shell_exec, passthru) as follows, it fetches the index page but seems to go no deeper than that:

exec('wget --delete-after -r -nd http://www.example.com/');

If this were a permissions issue, I'd think it wouldn't download the index page either, but it does (noticeable when I take out '--delete-after').

There's no robots.txt involved, and no output is shown if I pass it through echo. What am I missing?

1 Answer

It seems this was a permissions issue after all, because adding the --directory-prefix parameter fixed it.

wget --delete-after -q -r -nd -P /home/example.com/public_html/tmp/ http://www.example.com

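I set the prefix to a directory that php-fpm can definitely access. Before that, frankly, I had no idea where it was temporarily saving the files ('.' is the default directory, but where would that be?).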
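For anyone who wants to avoid hard-coding the path, here is a rough shell sketch of the same idea: use the preferred directory if the current user can write to it, otherwise fall back to a fresh temporary directory. The function names pick_prefix and crawl are made up for illustration, and the sketch assumes mktemp is available on the system. It also redirects stderr to stdout, since wget writes its log to stderr and PHP's exec only collects stdout, which would explain why nothing was shown.

```shell
#!/bin/sh
# Sketch only: pick_prefix and crawl are hypothetical helper names.

# Print a directory the current user can write to:
# the preferred one if it exists and is writable, otherwise a new temp directory.
pick_prefix() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        printf '%s\n' "$1"
    else
        mktemp -d
    fi
}

# Crawl a URL, saving into a writable directory so the result
# does not depend on whatever '.' happens to be for the calling process.
crawl() {
    url=$1
    prefix=$(pick_prefix "${2:-.}")
    # 2>&1 sends wget's log (written to stderr) to stdout so the caller can see it
    wget --delete-after -r -nd -P "$prefix" "$url" 2>&1
}
```

Usage would look something like `crawl http://www.example.com/ /home/example.com/public_html/tmp/`. Note that a temporary directory created by the fallback is left in place afterwards; --delete-after removes the downloaded files but not the directory itself.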

Answered 2013-02-15T20:17:05.780