3

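I'm using Mechanize and Nokogiri in my Ruby on Rails application to scrape our local printer's admin panel and retrieve the number of pages printed over the printer's lifetime.

I have the following rake task: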

# Logs into the printer admin page and retrieves counts.
require 'rubygems'
require 'mechanize'
require 'logger'

# Create a new mechanize object
agent = Mechanize.new

# Load the printer admin page
page = agent.get("http://192.168.1.126/index.html?lang=1")

# Select the form with an action of index.cgi
form = agent.page.form_with(:action => "index.cgi")
form.radiobuttons_with(:id => '0x3fdb24153404')[1]

# Submit the form
page = form.submit form.buttons.first

pp page

This returns the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<script type="text/javascript">
<!--
window.onload=function(){setTimeout(function(){document.menu_link.submit();},0);}
//-->
</script>
</head>
<body>
<form name="menu_link" action="index.html" method="post" enctype="application/x-www-form-urlencoded">
<input type="hidden" name="lang" value="1">
</form>
</body>
</html>

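I can't seem to select the form on the page above, and the script appears to stop at this page without following the redirect.

Is there a standard way to handle this kind of redirect? Perhaps pausing the script until the redirect happens? Would that allow the redirect to work?

Any pointers would be greatly appreciated!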


2 Answers

1

You have two options. Either:

  1. Submit the form manually
  2. Use Watir or Selenium

Basically, Mechanize doesn't run JavaScript, so you have to either simulate by hand what the JavaScript does (option 1) or automate a real browser to do it for you (option 2).

Option 1 should be doable if you just do a POST of lang=1 instead of a GET, since that is all the form does.

I'm guessing something like:

page = agent.post('http://192.168.1.126/index.html', {
  "lang" => "1"
})

But I've never actually used Mechanize.
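
To make it concrete what "a POST of lang=1" amounts to, here is a minimal sketch that uses only Ruby's standard library instead of Mechanize. It builds the request the hidden form would send and inspects it without touching the network. build_form_post is a hypothetical helper name, and the URL assumes the form's relative action (index.html) resolves against the printer address from the question.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a URL-encoded form POST,
# which is the kind of request the hidden menu_link form submits.
def build_form_post(url, fields)
  request = Net::HTTP::Post.new(URI(url))
  # Encodes the fields as application/x-www-form-urlencoded in the body.
  request.set_form_data(fields)
  request
end

# Assumed target: the form's action resolved against the printer's address.
request = build_form_post('http://192.168.1.126/index.html', 'lang' => '1')

puts request.method   # POST
puts request.path     # /index.html
puts request.body     # lang=1
```

Sending it for real would go through Net::HTTP.start, but in the rake task agent.post is probably the better fit because the Mechanize agent keeps its cookies and session.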

answered 2012-04-29T20:44:09.057
0

You could try telling the agent to follow meta-refresh redirects, like this:

agent.follow_meta_refresh = true

Also, if this behaviour is controlled by JavaScript, you are in a bad position, because Mechanize can't follow it: it doesn't execute JS. You will have to read the JavaScript to see what it does and emulate the same call in Mechanize.
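
For the page in the question, the script only calls document.menu_link.submit(), so emulating it could be as simple as submitting that form yourself. A rough sketch, assuming the interstitial form is always named menu_link as in the posted HTML; follow_interstitial and max_hops are hypothetical names invented for this example, not part of Mechanize:

```ruby
# Submits the auto-submit form by hand, because Mechanize will not run the
# page's onload script. Stops when no such form is left or after max_hops.
def follow_interstitial(page, form_name: 'menu_link', max_hops: 5)
  max_hops.times do
    # Non-HTML responses (e.g. Mechanize::File) have no form_with method.
    break unless page.respond_to?(:form_with)

    form = page.form_with(name: form_name)
    break if form.nil? # no interstitial form: treat this as the real page

    page = form.submit # same effect as document.menu_link.submit()
  end
  page
end

# Possible use in the rake task (needs the mechanize gem and the printer):
#   page = form.submit(form.buttons.first)
#   page = follow_interstitial(page)
#   pp page
```

The max_hops cap is only there so that a page which keeps returning the same form can't loop forever.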

But I think all you need to do is

agent.post('http://192.168.1.126/index.html', 'lang' => '1')

because the page seems to be expecting a POST request.

There is also a hardcore alternative :) which is to use node-crawler in Node.js (https://github.com/joshfire/node-crawler); it can evaluate the client page's JavaScript on the server side.

answered 2012-04-29T20:53:03.387