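I need to automate a registration form submission in Python, which includes solving the captcha. The target site collects a list of user details and submits them as a form from JavaScript through a reCAPTCHA data-callback; the snippet is attached below [1]. In a browser, the form is submitted, the page redirects to the successful-registration page with an HTTP 302 status code, and a confirmation email arrives. When I do the same from Python [2] (using cloudscraper/scrapy), I am also redirected to the successful-registration page, but the registration details are not recorded and no confirmation email is received. Can someone guide me on how to invoke the data-callback method, that is, execute this JavaScript, from Python?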
[1] HTML code:
<button class="g-recaptcha" data-callback="submitRuleOptin" data-sitekey="6LfbPnAUAAAAAAACqfb_YCtJi7RY0WkK-1T4b9cUO8">Sign up!</button>
<script>
function submitRuleOptin(token) { document.getElementById("rule-optin-form").submit(); }
</script>
[2] Python code:
# Process raffle entries; returns the number of successful registrations
def raffle_entry(self, url, suburl, proxy, accounts, auth=None):
    try:
        log('Entering raffle URL')
        i = 0
        tk = self.get_sitekey(url, proxy)
        first_form = self.get_all_forms(url, proxy)[0]
        form_details = self.get_form_details(first_form)
        buttonvalue = form_details["button"]
        selectvalue = form_details["select"]
        log("Resolving Captcha")
        gresp = self.twocaptcha(tk, url)  # resolve the captcha via 2Captcha here
        data = {}
        for individual_list in accounts:
            log("Constructing payload for the", individual_list[5])
            # Hidden inputs keep the value from the page; other inputs take the account values
            for input_tag, value_tag in zip(form_details["inputs"], individual_list):
                if input_tag["type"] == "hidden":
                    data[input_tag["name"]] = input_tag["value"]
                elif input_tag["type"] != "submit":
                    data[input_tag["name"]] = value_tag
            for button_tag in buttonvalue:
                if button_tag["class"] == "g-recaptcha":
                    data["g-recaptcha-response"] = gresp
            data['fields[Raffle.Country]'] = "GB"
            log("Sending payload to the requested URL")
            parsedata = json.dumps(data)
            url = urljoin(url, form_details["action"])
            submit = self.scraper.post(url, data=parsedata, headers=self.json_headers, proxies={"http": proxy, "https": proxy})
            if submit.status_code == 200:
                log("Your registration was successful !!")
                i += 1
            else:
                log("Failed to register !!")
            # Write the response to a text file for future reference
            with codecs.open("output.txt", 'w', encoding='utf-8') as outfile:
                outfile.write(submit.text)
            soup = BeautifulSoup(submit.content, "html.parser")
            # Write the response to an HTML file for future reference
            open("submit.html", "w").write(str(soup))
        return i
    except cloudscraper.exceptions.CloudflareChallengeError as err:
        pass
    except requests.exceptions.RequestException as err:
        print(err)