google-chrome - 如何在 Windows 10 的 Chrome 60 中使用 Headless Chrome？

Question

我一直在看以下关于无头 Chrome 的文章：
https ://developers.google.com/web/updates/2017/04/headless-chrome

我刚刚将 Windows 10 上的 Chrome 升级到版本 60，但是当我从命令行运行以下任一命令时，似乎什么都没有发生：

chrome --headless --disable-gpu --dump-dom https://www.google.com/
chrome --headless --disable-gpu --print-to-pdf https://www.google.com/

我正在从以下路径（Windows 上 Chrome 的默认安装路径）运行所有这些命令：

C:\Program Files (x86)\Google\Chrome\Application\

当我运行命令时，某些东西似乎会处理一秒钟，但我实际上什么也没看到。我究竟做错了什么？
谢谢。

编辑：

正如 Mark Rajcok 所指出的，如果您添加--enable-logging到--dump-dom命令中，它就会起作用。此外，该--print-to-pdf命令在 Chrome 61.0.3163.79 中也可以正常工作，但您可能必须为输出文件指定不同的路径才能获得保存它的必要权限。

因此，以下两个命令对我有用：

"C:\Program Files (x86)\Google\Chrome\Application\chrome" --headless --disable-gpu --enable-logging --dump-dom https://www.google.com/
"C:\Program Files (x86)\Google\Chrome\Application\chrome" --headless --disable-gpu --print-to-pdf=D:\output.pdf https://www.google.com/

我想下一步是能够通过带有 DOM 选择器之类的 PhantomJS 之类的转储 DOM，但我想这是一个单独的问题。

编辑#2：

值得一提的是，我最近遇到了一个用于 Headless Chrome 的 Node API，名为 Puppeteer ( https://github.com/GoogleChrome/puppeteer )，它非常易于使用并提供了 Headless Chrome 的所有功能。如果您正在寻找一种使用 Headless Chrome 的简单方法，我强烈推荐它。

score 12 · Accepted Answer

这对我有用：

start chrome --enable-logging --headless --disable-gpu --print-to-pdf=c:\misc\output.pdf https://www.google.com/

...但仅使用“start chrome”和“--enable-logging”并指定路径（用于pdf）-并且如果文件夹“misc”存在于c目录中。

补充：... pdf 的路径 - 上面的“c:\misc” - 当然可以替换为任何其他文件夹/目录。

score 9 · Accepted Answer

当前版本（68-70）似乎需要--no-sandbox才能运行，没有它，它们绝对什么都不做，并挂在后台。

我使用的完整命令是：

chrome --headless --user-data-dir=tmp --no-sandbox --enable-logging --dump-dom https://www.google.com/ > file.html
chrome --headless --user-data-dir=tmp --no-sandbox --print-to-pdf=whatever.pdf https://www.google.com/

使用--no-sandbox是一个非常糟糕的主意，你应该只将它用于你信任的网站，但遗憾的是它是使它工作的唯一方法。

--user-data-dir=...使用指定目录而不是默认目录，这可能已被您的常规浏览器使用。

但是，如果您尝试从 HTML 制作 PDF，那么这是相当没用的，因为您无法删除页眉和页脚（包含类似的文本file:///...）并且唯一可行的解决方案是使用Puppeteer。

score 9 · Accepted Answer

使用 Chrome 61.0.3163.79，如果我添加--enable-logging则--dump-dom产生输出：

> "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --enable-logging --headless --disable-gpu --dump-dom https://www.chromestatus.com
<body class="loading" data-path="/features">
<app-drawer-layout fullbleed="">
...
</script>
</body>

如果你想以编程方式控制无头 Chrome，这里有一种使用 Python3 和 Selenium 的方法：

在 Admin cmd 窗口中，安装 Selenium for Python：

C:\Users\Mark> pip install -U selenium

下载ChromeDriver v2.32 并解压。我放了chromedriver.exein C:\Users\Mark，这是我放这个headless.pyPython 脚本的地方：

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("headless")  # remove this line if you want to see the browser popup
driver = webdriver.Chrome(chrome_options = options)
driver.get('https://www.google.com/')
print(driver.page_source)
driver.quit()  # don't miss this, or chromedriver.exe will keep running!

在普通的 cmd 窗口中运行它：

C:\Users\Mark> python headless.py
<!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml" ...
...  lots and lots of stuff here ...
...</body></html>

score 2 · Accepted Answer

你应该很好。查看 Chrome 版本目录下

C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78

对于命令

chrome --headless --disable-gpu --print-to-pdf https://www.google.com/

C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78\output.pdf

编辑：在这种情况下，仍然执行 chrome 可执行文件所在的命令

 C:\Program Files (x86)\Google\Chrome\Application\

score 2 · Accepted Answer

我知道这个问题是针对 Windows 的，但由于 Google 将这篇文章作为第一个搜索结果，所以这在 Mac 上有效：

Mac OS X

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --dump-dom 'http://www.google.com'

请注意，您必须输入http，否则将无法正常工作。

google-chrome - 如何在 Windows 10 的 Chrome 60 中使用 Headless Chrome？

7 回答 7

Mac OS X

更多提示

google-chrome - 如何在 Windows 10 的 Chrome 60 中使用 Headless Chrome？

7 回答 7

Mac OS X

更多提示

Related

Reference