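I have a Python script that uses mechanize to submit a form and look up drug information. It runs without any error message, but the response I get back is not what I see in the browser's view-source page. After submitting, I checked the URL.
This is what I get: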
http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm
This is the URL I should get:
http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm?fuseaction=Search.DrugDetails
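To make the difference between the two URLs explicit, here is a small sketch that splits them into path and query string. It uses the Python 3 standard library (my script itself is Python 2), and the names `got`, `expected`, and `split_url` are only for illustration.

```python
from urllib.parse import urlparse, parse_qs

# The URL mechanize reports after submitting, and the one the browser shows.
got = "http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm"
expected = ("http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm"
            "?fuseaction=Search.DrugDetails")


def split_url(url):
    """Return the path and the parsed query parameters of a URL."""
    parts = urlparse(url)
    return parts.path, parse_qs(parts.query)


got_path, got_query = split_url(got)
expected_path, expected_query = split_url(expected)

# Both URLs point at the same script; only the query string differs.
same_path = got_path == expected_path
print(same_path, got_query, expected_query)
```

The only difference is the `fuseaction` parameter. Neither URL carries the `ApplNo` value I typed into the form.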
I notice that the second URL does not contain my query text. Does that mean I need cookies? If so, how do I handle them?
Here is my code snippet:
import cookielib  # Python 2 standard library
import mechanize
br = mechanize.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)  # store cookies across requests
....
br.addheaders = [('User-agent', 'Mozilla/6.0 (X11; U; i686; en-US; rv:1.9.0.1) Gecko/2008071615 OS X 10.2 Firefox/3.0.1')]
fda_url2 = 'http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm?fuseaction=Search.Addlsearch_drug_name'
print br.open(fda_url2).geturl()
for f in br.forms():
    print 'this is a form'
    print f
br.select_form('searchoptionB')
br.form['ApplNo'] = '018780'
html = br.submit(name = 'Search_Button')
print html.geturl()
The printed form output is:
<searchoptionB POST http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm application/x-www-form-urlencoded
<HiddenControl(fuseaction=Search.SearchAction) (readonly)>
<HiddenControl(SearchType=AddlSearch) (readonly)>
<HiddenControl(SearchOption=B) (readonly)>
<TextControl(ApplNo=)>
<SubmitControl(Search_Button=Submit) (readonly)>
<SubmitControl(clearcriteria=Clear) (readonly)>>
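Since the form is a POST with `application/x-www-form-urlencoded`, my understanding is that the query text travels in the request body rather than in the URL. The sketch below shows what that body would presumably look like for the fields printed above. It uses Python 3's `urllib.parse.urlencode` as a stand-in, so it is an assumption about the encoding, not a capture of what mechanize actually sends. I included only the submit control I click (`Search_Button`), as a browser would.

```python
from urllib.parse import urlencode

# Fields taken from the printed form, with ApplNo set to my query value.
# The clearcriteria button is left out because it is not the one clicked.
fields = [
    ("fuseaction", "Search.SearchAction"),
    ("SearchType", "AddlSearch"),
    ("SearchOption", "B"),
    ("ApplNo", "018780"),
    ("Search_Button", "Submit"),
]

# Encode the fields the way a urlencoded POST body is normally built.
body = urlencode(fields)
print(body)
```

If that is right, the missing query text in the URL would be expected for a POST and would not by itself show that cookies are required.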
Sorry for the long post ;p