
OK, so I am trying to parse a website with Selenium. I'm doing this in Python, and it works fine when I use the Firefox browser: I can navigate the site and parse it without problems.

page = webdriver.Remote(command_executor='http://127.0.0.1:4444/wd/hub',desired_capabilities=DesiredCapabilities.FIREFOX)

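However, since I have to fetch the information for all 24-something countries, I would like to do this with HTMLUNIT instead, and that gives me an error. The following line runs fine: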

page = webdriver.Remote(command_executor='http://127.0.0.1:4444/wd/hub',desired_capabilities=DesiredCapabilities.HTMLUNITWITHJS)

(Note that I put 'HTMLUNITWITHJS' here; it still gives me the same error.) But as soon as I try this:

page.get("http://www.investmentmap.org")

it gives me the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/selenium-2.25.0-py2.6.egg/selenium/webdriver/remote/webdriver.py", line 168, in get
    self.execute(Command.GET, {'url': url})
  File "/usr/lib/python2.6/site-packages/selenium-2.25.0-py2.6.egg/selenium/webdriver/remote/webdriver.py", line 156, in execute
    self.error_handler.check_response(response)
  File "/usr/lib/python2.6/site-packages/selenium-2.25.0-py2.6.egg/selenium/webdriver/remote/errorhandler.py", line 147, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: u'TypeError: Cannot find function attachEvent in object [object Window]. (http://www.investmentmap.org/WebResource.axd?d=LhblxRSuIsXvwRXVMFTkFY2_MTprvQMTc0m2WmtZ11CqrDE7VxOZIP2JWr5nhbJdQc4rLL2xsB8APnsxJfp2YQ57TEMa7scxhAPpUl0DIfBcLe19toyYfpm3QnO4qTRHvpItdEAI7kl7KciY-atDSENblKs1&t=634763918950917029#1)' ; Screenshot: available via screen ; Stacktrace: Method constructError threw an error in ScriptRuntime.java

Here is my source code:

import time, threading
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

page = webdriver.Remote(command_executor='http://127.0.0.1:4444/wd/hub',desired_capabilities=DesiredCapabilities.HTMLUNITWITHJS)

page.get("http://www.investmentmap.org")

Oh, one more thing. I'm running Python under Cygwin on Windows 7, but I get the same error when I do this on my Mac. I'm also using the selenium-server-standalone Java server. Here is what it dumps:

00:03:34.163 INFO - Executing: [new session: {platform=ANY, javascriptEnabled=tr
ue, browserName=htmlunit, version=firefox}] at URL: /session)
00:03:37.025 INFO - Done: /session
00:03:37.036 INFO - Executing: org.openqa.selenium.remote.server.handler.GetSess
ionCapabilities@3fd54ead at URL: /session/1351482881929)
00:03:37.038 INFO - Done: /session/1351482881929
00:04:38.844 INFO - Executing: [get: http://www.investmentmap.org] at URL: /sess
ion/1351482881929/url)
00:04:40.133 WARN - Recursive src attribute of iframe: url=[#]. Ignored.
00:04:40.526 WARN - Obsolete content type encountered: 'text/javascript'.
00:04:42.640 WARN - CSS error: [36:27] Error in style rule. Invalid token ":". W
as expecting one of: <S>, "}", <COMMA>, ";", "/", <PLUS>, "-", <HASH>, <STRING>,
 <URI>, "!", "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LE
NGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIM
E_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FU
NCTION>, <IDENT>.
00:04:42.755 WARN - CSS warning: [36:27] Ignoring the following declarations in
this rule.
00:04:44.868 WARN - Exception thrown
org.openqa.selenium.WebDriverException: com.gargoylesoftware.htmlunit.ScriptExce
ption: TypeError: Cannot find function attachEvent in object [object Window]. (h
ttp://www.investmentmap.org/WebResource.axd?d=LhblxRSuIsXvwRXVMFTkFY2_MTprvQMTc0
m2WmtZ11CqrDE7VxOZIP2JWr5nhbJdQc4rLL2xsB8APnsxJfp2YQ57TEMa7scxhAPpUl0DIfBcLe19to
yYfpm3QnO4qTRHvpItdEAI7kl7KciY-atDSENblKs1&t=634763918950917029#1)
Build info: version: '2.25.0', revision: '17482', time: '2012-07-18 21:08:56'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.ver
sion: '1.7.0'
Driver info: driver.version: EventFiringWebDriver
        at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:3
52)
        at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:3
33)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.openqa.selenium.support.events.EventFiringWebDriver$2.invoke(Even
tFiringWebDriver.java:101)
        at $Proxy1.get(Unknown Source)
        at org.openqa.selenium.support.events.EventFiringWebDriver.get(EventFiri
ngWebDriver.java:155)
        at org.openqa.selenium.remote.server.handler.ChangeUrl.call(ChangeUrl.ja
va:38)
        at org.openqa.selenium.remote.server.handler.ChangeUrl.call(ChangeUrl.ja
va:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at org.openqa.selenium.remote.server.DefaultSession$1.run(DefaultSession
.java:150)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find
 function attachEvent in object [object Window]. (http://www.investmentmap.org/W
ebResource.axd?d=LhblxRSuIsXvwRXVMFTkFY2_MTprvQMTc0m2WmtZ11CqrDE7VxOZIP2JWr5nhbJ
dQc4rLL2xsB8APnsxJfp2YQ57TEMa7scxhAPpUl0DIfBcLe19toyYfpm3QnO4qTRHvpItdEAI7kl7Kci
Y-atDSENblKs1&t=634763918950917029#1)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitCon
textAction.run(JavaScriptEngine.java:595)
        at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:
537)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(Contex
tFactory.java:538)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(Jav
aScriptEngine.java:499)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(Jav
aScriptEngine.java:474)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossib
le(HtmlPage.java:870)
        at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNe
eded(HtmlScript.java:302)
        at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(H
tmlScript.java:368)
        at com.gargoylesoftware.htmlunit.html.HtmlScript$1.execute(HtmlScript.ja
va:230)
        at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPag
e(HtmlScript.java:240)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endE
lement(HTMLParser.java:598)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source
)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endE
lement(HTMLParser.java:556)
        at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.jav
a:1142)
        at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:10
44)
        at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.jav
a:206)
        at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder
.java:329)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScan
ner.java:3018)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2
005)
        at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:908)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499
)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452
)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.pars
e(HTMLParser.java:789)
        at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:2
25)
        at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.ja
va:179)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(Defau
ltPageCreator.java:221)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPa
geCreator.java:106)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient
.java:433)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:311)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
        at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:3
46)
        ... 16 more
Caused by: net.sourceforge.htmlunit.corejs.javascript.EcmaError: TypeError: Cann
ot find function attachEvent in object [object Window]. (http://www.investmentma
p.org/WebResource.axd?d=LhblxRSuIsXvwRXVMFTkFY2_MTprvQMTc0m2WmtZ11CqrDE7VxOZIP2J
Wr5nhbJdQc4rLL2xsB8APnsxJfp2YQ57TEMa7scxhAPpUl0DIfBcLe19toyYfpm3QnO4qTRHvpItdEAI
7kl7KciY-atDSENblKs1&t=634763918950917029#1)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructErr
or(ScriptRuntime.java:3790)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructErr
or(ScriptRuntime.java:3768)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError(Sc
riptRuntime.java:3796)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError2(S
criptRuntime.java:3815)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.notFunctionE
rror(ScriptRuntime.java:3885)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunct
ionAndThisHelper(ScriptRuntime.java:2363)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunct
ionAndThis(ScriptRuntime.java:2330)
        at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(
Interpreter.java:1514)
        at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Inte
rpreter.java:854)
        at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(I
nterpretedFunction.java:164)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(C
ontextFactory.java:429)
        at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTop
Call(HtmlUnitContextFactory.java:267)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(Sc
riptRuntime.java:3183)
        at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(I
nterpretedFunction.java:175)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(Jav
aScriptEngine.java:490)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitCon
textAction.run(JavaScriptEngine.java:589)
        ... 47 more
00:04:44.918 WARN - Exception: TypeError: Cannot find function attachEvent in ob
ject [object Window]. (http://www.investmentmap.org/WebResource.axd?d=LhblxRSuIs
XvwRXVMFTkFY2_MTprvQMTc0m2WmtZ11CqrDE7VxOZIP2JWr5nhbJdQc4rLL2xsB8APnsxJfp2YQ57TE
Ma7scxhAPpUl0DIfBcLe19toyYfpm3QnO4qTRHvpItdEAI7kl7KciY-atDSENblKs1&t=63476391895
0917029#1)
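
One detail in the log above may be relevant: the session is created with browserName=htmlunit and version=firefox, while the failing script calls attachEvent, which is a legacy Internet Explorer event API. Below is a minimal sketch of how the capabilities could be changed to request a different emulated browser. It assumes, without my having verified it against this server version, that the 'version' capability is what selects HtmlUnit's emulated browser and that "internet explorer" is an accepted value. The helper name htmlunit_capabilities is hypothetical and not part of selenium.

```python
# Capabilities exactly as they appear in the server's "new session" log line.
HTMLUNIT_WITH_JS = {
    "browserName": "htmlunit",
    "version": "firefox",
    "platform": "ANY",
    "javascriptEnabled": True,
}


def htmlunit_capabilities(emulated_browser):
    """Return a copy of the HtmlUnit capabilities with a different 'version'.

    Hypothetical helper for illustration; it only builds a plain dict and
    leaves the original HTMLUNIT_WITH_JS untouched.
    """
    caps = dict(HTMLUNIT_WITH_JS)
    caps["version"] = emulated_browser
    return caps


if __name__ == "__main__":
    # Assumption: the server maps this string to HtmlUnit's IE emulation.
    caps = htmlunit_capabilities("internet explorer")
    print(caps)
    # With the Selenium server running, the dict could be passed like this:
    #   from selenium import webdriver
    #   page = webdriver.Remote(
    #       command_executor='http://127.0.0.1:4444/wd/hub',
    #       desired_capabilities=caps)
    #   page.get("http://www.investmentmap.org")
```

If the assumption holds, the "new session" line in the server log should then show version=internet explorer instead of version=firefox, which would at least confirm whether the setting reaches HtmlUnit.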