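We are developing a web-crawler-like tool: the user enters a website URL, and our web application generates a screenshot of that site. We use PhantomJS's render to generate the screenshots in PNG format. It works like a charm in most cases, but some websites are not rendered correctly. For example, take http://dorevi.lt/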
This is how the website appears in a browser:
However, the screenshot rendered by PhantomJS looks like this:
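You can see that it stretches the center table and breaks the content in the middle. What I have tried so far:
Put various delays between page load and page render, even as long as 30 seconds, but no luck.
Tried all the solutions in this answer, which wait for the DOM content (internal stylesheets, etc.) to load, but got the same output.
Tried adding every possible argument when executing the PhantomJS script. This is what my final command looks like: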
phantomjs.exe --ignore-ssl-errors=true --load-images=true --ssl-protocol=any --debug=true --local-to-remote-url-access=true --web-security=false --disk-cache=false script.js
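As you can see, I have also used every flag I could find, but the output is still the same. Please help, as we need to make sure the generated web page screenshots are accurate.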
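For reference, the delay attempt described above can be sketched roughly as follows. The actual script.js is not shown in this post, so the function name captureAfterDelay, the output file screenshot.png, and the 30000 ms delay are assumptions for illustration only. It is written in ES5 because it is meant to run under PhantomJS 2.1.

```javascript
// Rough sketch of a delayed render (not the original script.js).
// captureAfterDelay, outputPath and delayMs are hypothetical names.
function captureAfterDelay(page, url, outputPath, delayMs, done) {
  page.open(url, function (status) {
    if (status !== 'success') {
      done(new Error('Failed to load ' + url));
      return;
    }
    // Wait after the page has loaded before taking the screenshot.
    setTimeout(function () {
      page.render(outputPath); // image format is inferred from the file extension
      done(null);
    }, delayMs);
  });
}

// This part only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  captureAfterDelay(page, 'http://dorevi.lt/', 'screenshot.png', 30000, function (err) {
    if (err) {
      console.log(err.message);
    }
    phantom.exit(err ? 1 : 0);
  });
}
```

With a sketch like this, the delay changes only when the screenshot is taken, not the layout, which matches the unchanged output described above.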
Info: PhantomJS version used: 2.1. OS: CentOS in production; also tested on Windows 7 with the same output. Technology: the application is built with PHP.
Edit 1: Added the --debug=true output
2017-12-09T15:31:40 [DEBUG] CookieJar - Created but will not store cookies (use
option '--cookies-file=<filename>' to enable persistent cookie storage)
2017-12-09T15:31:41 [DEBUG] Set "http" proxy to: "" : 1080
2017-12-09T15:31:41 [DEBUG] Phantom - execute: Configuration
2017-12-09T15:31:41 [DEBUG] 0 objectName : ""
2017-12-09T15:31:41 [DEBUG] 1 cookiesFile : ""
2017-12-09T15:31:41 [DEBUG] 2 diskCacheEnabled : "true"
2017-12-09T15:31:41 [DEBUG] 3 maxDiskCacheSize : "-1"
2017-12-09T15:31:41 [DEBUG] 4 diskCachePath : ""
2017-12-09T15:31:41 [DEBUG] 5 ignoreSslErrors : "true"
2017-12-09T15:31:41 [DEBUG] 6 localUrlAccessEnabled : "true"
2017-12-09T15:31:41 [DEBUG] 7 localToRemoteUrlAccessEnabled : "true"
2017-12-09T15:31:41 [DEBUG] 8 outputEncoding : "UTF-8"
2017-12-09T15:31:41 [DEBUG] 9 proxyType : "http"
2017-12-09T15:31:41 [DEBUG] 10 proxy : ":1080"
2017-12-09T15:31:41 [DEBUG] 11 proxyAuth : ":"
2017-12-09T15:31:41 [DEBUG] 12 scriptEncoding : "UTF-8"
2017-12-09T15:31:41 [DEBUG] 13 webSecurityEnabled : "false"
2017-12-09T15:31:41 [DEBUG] 14 offlineStoragePath : ""
2017-12-09T15:31:41 [DEBUG] 15 localStoragePath : ""
2017-12-09T15:31:41 [DEBUG] 16 localStorageDefaultQuota : "-1"
2017-12-09T15:31:41 [DEBUG] 17 offlineStorageDefaultQuota : "-1"
2017-12-09T15:31:41 [DEBUG] 18 printDebugMessages : "true"
2017-12-09T15:31:41 [DEBUG] 19 javascriptCanOpenWindows : "true"
2017-12-09T15:31:41 [DEBUG] 20 javascriptCanCloseWindows : "true"
2017-12-09T15:31:41 [DEBUG] 21 sslProtocol : "any"
2017-12-09T15:31:41 [DEBUG] 22 sslCiphers : "ECDHE-ECDSA-AES128-GCM-SHA256:
ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA:ECD
HE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-RC4-SH
A:ECDHE-RSA-RC4-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA:DHE-RSA-AES256-SHA:AES
128-GCM-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:RC4-SHA:RC4-MD5"
2017-12-09T15:31:41 [DEBUG] 23 sslCertificatesPath : ""
2017-12-09T15:31:41 [DEBUG] 24 sslClientCertificateFile : ""
2017-12-09T15:31:41 [DEBUG] 25 sslClientKeyFile : ""
2017-12-09T15:31:41 [DEBUG] 26 sslClientKeyPassphrase : ""
2017-12-09T15:31:41 [DEBUG] 27 webdriver : ":"
2017-12-09T15:31:41 [DEBUG] 28 webdriverLogFile : ""
2017-12-09T15:31:41 [DEBUG] 29 webdriverLogLevel : "INFO"
2017-12-09T15:31:41 [DEBUG] 30 webdriverSeleniumGridHub : ""
2017-12-09T15:31:41 [DEBUG] Phantom - execute: Script & Arguments
2017-12-09T15:31:41 [DEBUG] script: "script.js"
2017-12-09T15:31:41 [DEBUG] Phantom - execute: Starting normal mode
2017-12-09T15:31:41 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:41 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode",
QVariant(QString, "r")))
2017-12-09T15:31:41 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mod
e", QVariant(QString, "r")))
2017-12-09T15:31:41 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mo
de", QVariant(QString, "r")))
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 10
2017-12-09T15:31:42 [DEBUG] CookieJar - Saved "CMSSESSID8694f4a4=kpca79mq05g4v0f
nh31uvkmu86; domain=dorevi.lt; path=/"
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 30
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 32
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 35
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 37
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 39
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 41
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 43
2017-12-09T15:31:42 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 46
2017-12-09T15:31:42 [DEBUG] WebPage - updateLoadingProgress: 48
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 52
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 55
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 58
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 60
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 63
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 67
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 69
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 71
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 74
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 76
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 78
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 81
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 83
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 85
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 87
2017-12-09T15:31:43 [DEBUG] WebPage - updateLoadingProgress: 100
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "CMSSESSID8694f4a4=kpca79mq05g4v0f
nh31uvkmu86; domain=dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_ga=GA1.2.690650226.1512813703; e
xpires=Mon, 09-Dec-2019 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "CMSSESSID8694f4a4=kpca79mq05g4v0f
nh31uvkmu86; domain=dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_ga=GA1.2.690650226.1512813703; e
xpires=Mon, 09-Dec-2019 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_gid=GA1.2.860165508.1512813703;
expires=Sun, 10-Dec-2017 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "CMSSESSID8694f4a4=kpca79mq05g4v0f
nh31uvkmu86; domain=dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_ga=GA1.2.690650226.1512813703; e
xpires=Mon, 09-Dec-2019 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_gid=GA1.2.860165508.1512813703;
expires=Sun, 10-Dec-2017 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] CookieJar - Saved "_gat=1; expires=Sat, 09-Dec-2017
10:02:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:43 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:53 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:53 [DEBUG] WebPage - updateLoadingProgress: 10
2017-12-09T15:31:53 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:53 [DEBUG] WebPage - updateLoadingProgress: 100
2017-12-09T15:31:53 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode",
QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mod
e", QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mo
de", QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] WebPage - updateLoadingProgress: 10
2017-12-09T15:31:53 [DEBUG] WebPage - setupFrame ""
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode",
QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mod
e", QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mo
de", QVariant(QString, "r")))
2017-12-09T15:31:53 [DEBUG] WebPage - updateLoadingProgress: 100
2017-12-09T15:31:53 [DEBUG] CookieJar - Purged (session) "CMSSESSID8694f4a4=kpca
79mq05g4v0fnh31uvkmu86; domain=dorevi.lt; path=/"
2017-12-09T15:31:53 [DEBUG] CookieJar - Saved "_ga=GA1.2.690650226.1512813703; e
xpires=Mon, 09-Dec-2019 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:53 [DEBUG] CookieJar - Saved "_gid=GA1.2.860165508.1512813703;
expires=Sun, 10-Dec-2017 10:01:43 GMT; domain=.dorevi.lt; path=/"
2017-12-09T15:31:53 [DEBUG] CookieJar - Saved "_gat=1; expires=Sat, 09-Dec-2017
10:02:43 GMT; domain=.dorevi.lt; path=/"