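For whatever reason, this form uses JavaScript's document.write to draw the radio buttons on the page, and I don't know how to get around it.
After some digging here and on Google, it looks like you can use Mechanize to recreate the form field by hand and then submit the form as usual.
So I set up my code as follows: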
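import mechanize

# url is the address of the page that contains the form
br = mechanize.Browser(factory=mechanize.RobustFactory())
response = br.open(url)
br.select_form(nr=0)
br.form.set_all_readonly(False)
# Recreate the radio button that the page normally draws with document.write
br.form.new_control('radio', 'DATASOURCE', {'value':'FILE', 'checked':'true'})
# Attach the CSV file to the upload field
br.form.add_file(open('weather_info.csv'), 'text/csv', 'weather_info.csv', name='FILENAME')
br.form.fixup()
response = br.submit()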
Now, if I print the form, the new radio control does show up as one of the fields at the bottom.
<HiddenControl(CGIREF=/calludt.cgi/DDFILE1)>
<HiddenControl(USE=MODEL)>
<HiddenControl(MODEL=CM)>
<HiddenControl(CROP=apples)>
<HiddenControl(METHOD=SS)>
<HiddenControl(UNITS=E)>
<HiddenControl(LOWTHRESHOLD=50)>
<HiddenControl(UPTHRESHOLD=88)>
<HiddenControl(CUTOFF=H)>
<SelectControl(COUNTY=[])>
<CheckboxControl(ACTIVE=[*Y])>
<SelectControl(FROMMONTH=[1, 2, *3, 4, 5, 6, 7, 8, 9, 10, 11, 12])>
<SelectControl(FROMDAY=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, *15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31])>
<SelectControl(FROMYEAR=[2014, *2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970, 1969, 1968, 1967, 1966, 1965, 1964, 1963, 1962, 1961, 1960, 1959, 1958, 1957, 1956, 1955, 1954, 1953, 1952, 1951])>
<SelectControl(THRUMONTH=[1, 2, 3, 4, *5, 6, 7, 8, 9, 10, 11, 12])>
<SelectControl(THRUDAY=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, *12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31])>
<SelectControl(THRUYEAR=[2014, *2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970, 1969, 1968, 1967, 1966, 1965, 1964, 1963, 1962, 1961, 1960, 1959, 1958, 1957, 1956, 1955, 1954, 1953, 1952, 1951])>
<FileControl(FILENAME=weather_scrape.csv)>
<IgnoreControl(Submit=<None>)>
<RadioControl(DATASOURCE=[*FILE])>>
However, when I run the code and submit the form, nothing happens; it stays on the same page it started on.
Is there a way to solve this with Python?
Edit:
I realized that the submit button is JavaScript as well; it has an onclick event attached to it, so as far as a Python solution goes, I may be out of luck.
<input type="button" name="Submit" value="Continue" onclick="SetDDinfo()">
I tried manually adding a button to the form, as shown here.
br.form.new_control('submit', 'Button', {})
But still no luck.
The Submit button's onclick handler calls the following function:
function SetDDinfo(){
    var DDstuff = new Array(21);
    var checksok = true;
    // alert("SetDDinfo DDparms: " + DDparms);
    for (var i = 0; i < 21; i++){
        DDstuff[i] = DDparms[i];
    }
    // This section for MODEL
    document.DDCOMPUTE.USE.value = "MODEL";
    DDstuff[4] = "MODEL"
    with(document.DDCOMPUTE) {
        DDstuff[5] = "E"; // english units
        DDstuff[6] = ""; // lower threshold (null)
        DDstuff[7] = ""; // upper threshold (null)
        DDstuff[8] = "SS"; // dd method: (default)
        DDstuff[9] = "H"; // cutoff: (default)
        DDstuff[10] = "CM"; // organism model code: CM, NOW, etc.
        if (COUNTY.selectedIndex < 0) {
            DDstuff[11] = "";
        } else {
            DDstuff[11] = COUNTY.options[COUNTY.selectedIndex].value; // county
        }
        if (DATASOURCE[0].checked == "1") {
            DDstuff[13] = "STATION"; // data from database
            DDstuff[14] = "";
        } else {
            if (DATASOURCE[1].checked == "1") {
                DDstuff[13] = "FILE"; // data from user file
                DDstuff[14] = FILENAME.value;
            } else {
                DDstuff[13] = "ENTRY"; // data from user entry
                DDstuff[14] = "";
            }
        }
    }
    // MODEL error checks. Return false to abort SUBMIT.
    if (DDstuff[11] == "" && DDstuff[13] == "STATION") {
        alert ("ERROR: You must select a County.");
        checksok = false;
        return checksok
    }
    if (DDstuff[10] == "") {
        alert ("ERROR: You must select a Model.");
        checksok = false;
        return checksok
    }
    if (DDstuff[13] == "FILE" && DDstuff[14] == "") {
        alert ("ERROR: You must specify a filename.");
        checksok = false;
        return checksok
    }
    // Wx source error checks. Return false to abort SUBMIT.
    if (DDstuff[11] == "" && DDstuff[13] == "STATION") {
        alert ("ERROR: You must select a County.");
        checksok = false;
        return checksok
    }
    if (DDstuff[13] == "FILE" && DDstuff[14] == "") {
        alert ("ERROR: You must specify a filename.");
        checksok = false;
        return checksok
    }
    DDstuff[14] = ""; // Don't bother saving filename, since not displayed and must be re-specified.
    var DDitems = DDstuff.join(",");
    SetCookie ('DDinfo', DDitems, exp);
    // alert ("DDstuff[13]: "+DDstuff[13]);
    if (DDstuff[13] == "STATION") {
        document.DDCOMPUTE.action = "/calludt.cgi/DDSTATIONLIST";
        document.DDCOMPUTE.method = "GET";
        document.DDCOMPUTE.encoding = "application/x-www-form-urlencoded";
    } else {
        if (DDstuff[13] == "FILE") {
            document.DDCOMPUTE.action = "/WEATHER/textupload.cgi";
            document.DDCOMPUTE.method = "POST";
            document.DDCOMPUTE.encoding = "multipart/form-data";
        } else {
            document.DDCOMPUTE.action = "/calludt.cgi/DDENTRY1";
            document.DDCOMPUTE.method = "GET";
            document.DDCOMPUTE.encoding = "application/x-www-form-urlencoded";
        }
    }
    // alert ("document.DDCOMPUTE.action = " + document.DDCOMPUTE.action);
    document.DDCOMPUTE.submit()
    return checksok
} // ................................................
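For readers more at home in Python, the tail end of SetDDinfo() can be restated roughly as below. This is only a sketch of my reading of the JavaScript: the function names resolve_submission and ddinfo_cookie_value are made up for illustration, and the contents of DDparms are not shown anywhere on the page, so they are passed in as an assumption.

```python
# Hypothetical Python restatement of the end of SetDDinfo(); names are illustrative.

def resolve_submission(datasource):
    """Return (action, method, encoding) the way SetDDinfo() picks them."""
    if datasource == "STATION":
        return ("/calludt.cgi/DDSTATIONLIST", "GET",
                "application/x-www-form-urlencoded")
    if datasource == "FILE":
        return ("/WEATHER/textupload.cgi", "POST", "multipart/form-data")
    # Anything else falls through to manual entry, as in the JavaScript.
    return ("/calludt.cgi/DDENTRY1", "GET",
            "application/x-www-form-urlencoded")


def ddinfo_cookie_value(ddparms, datasource, county=""):
    """Build the comma-joined string that SetDDinfo() stores in the DDinfo cookie.

    ddparms stands in for the page's DDparms array (its real contents are unknown).
    """
    stuff = [str(item) for item in list(ddparms)[:21]]
    stuff += [""] * (21 - len(stuff))  # pad to the 21 slots the script uses
    overrides = {
        4: "MODEL", 5: "E", 6: "", 7: "", 8: "SS", 9: "H", 10: "CM",
        11: county, 13: datasource,
        14: "",  # the script blanks the filename before writing the cookie
    }
    for slot, value in overrides.items():
        stuff[slot] = value
    return ",".join(stuff)
```

Read this way, choosing the FILE radio button is what switches the form to a multipart POST against /WEATHER/textupload.cgi; the other two choices stay with GET requests to /calludt.cgi.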
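I'm not well versed in JavaScript, but the first half of the function seems to deal with parts of the form I'm not interested in, i.e. when the radio button is in position one.
The second part is harder for me to follow, but it is making a POST to /WEATHER/textupload.cgi. I'll have to check the traffic tomorrow. Maybe I can recreate whatever it needs with urllib2?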
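A minimal sketch of what recreating that POST by hand might look like, using only the standard library. It is written for Python 3, where urllib2 became urllib.request. Everything beyond the /WEATHER/textupload.cgi path and the FILENAME field name is an assumption: the helper names are invented, the host is a placeholder, and which other fields (or the DDinfo cookie) the server actually requires would still need to be confirmed from the captured traffic.

```python
import uuid
from urllib.parse import urljoin
from urllib.request import Request


def encode_multipart(fields, file_field, filename, file_bytes,
                     file_type="text/csv"):
    """Return (content_type_header, body_bytes) for a multipart/form-data POST."""
    boundary = uuid.uuid4().hex
    parts = []
    # Ordinary form fields, one part each.
    for name, value in fields.items():
        parts.append(
            ('--%s\r\nContent-Disposition: form-data; name="%s"\r\n\r\n%s\r\n'
             % (boundary, name, value)).encode("utf-8"))
    # The uploaded file, with its own Content-Type header.
    parts.append(
        ('--%s\r\nContent-Disposition: form-data; name="%s"; filename="%s"\r\n'
         'Content-Type: %s\r\n\r\n'
         % (boundary, file_field, filename, file_type)).encode("utf-8"))
    parts.append(file_bytes + b"\r\n")
    # Closing boundary.
    parts.append(("--%s--\r\n" % boundary).encode("utf-8"))
    return "multipart/form-data; boundary=" + boundary, b"".join(parts)


def build_upload_request(page_url, fields, filename, file_bytes):
    """Prepare (but do not send) the POST that the FILE branch would trigger."""
    content_type, body = encode_multipart(fields, "FILENAME", filename, file_bytes)
    upload_url = urljoin(page_url, "/WEATHER/textupload.cgi")
    return Request(upload_url, data=body,
                   headers={"Content-Type": content_type}, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it. The fields argument would presumably carry the hidden controls printed above plus DATASOURCE=FILE, but that is a guess until the real request has been inspected.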