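I'm new to this, and this is my first attempt. My task is to create a transaction in C# that navigates through the page flow of a web application using WebRequest/WebResponse. I have the request/response mechanics, cookies and everything else working (I can run the transaction successfully with hard-coded values for the POST URL and POST body). The difficulty is generating a dynamic POST body and POST URL for each WebRequest from the value pairs in the previous WebResponse. Essentially, the flow starts with a first WebRequest that always has the same static URL and "hard-coded" body, and every subsequent request is built from the FORM value pairs of the previous response. For example, here is part of the FORM in a response: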

    <form id="expressform" method="post" action="">
<div>
    <input type="hidden" name="ScreenData.widgets.modified" value=""/>
    <input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.widgets.modified"/>
    <input type="hidden" name="ScreenData.marshalled" value="true"/>
    <input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.marshalled"/>
    <input type="hidden" name="isCreateAccountWizard" value="true"/>
    <input type="hidden" name="ScreenData.header.hidden.name" value="isCreateAccountWizard"/>
    <input type="hidden" name="versionPoint" value="77777"/>

The form then has some text areas that submit values, like this:

<tr>
    <td class="dataOut" style="padding-left:30px">
        <textarea name="ScreenData.sicInfo.natureOfBusiness" rows="5"  cols="60" class="dataOut" onmouseup="textAreaCounter(this,250);;" onkeypress="textAreaCounter(this,250);;" onkeyup="textAreaCounter(this,250);;" onchange="markDataDirty(this);;"></textarea> 
    </td>
</tr>

Then there is the URL on the submit link:

 <a class="detailBtnOn" href="javascript:submitForm('express?displayAction=CreateAccountWizard&amp;saveAction=SaveCreateSICCode&amp;flow=forward&amp;saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&amp;flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3')">Submit</a>

The next WebRequest should then have this as its POST body:

ScreenData.widgets.modified=&ScreenData.header.hidden.name=ScreenData.widgets.modified&ScreenData.marshalled=true&ScreenData.header.hidden.name=ScreenData.marshalled&isCreateAccountWizard=true&ScreenData.header.hidden.name=isCreateAccountWizard&versionPoint=77777&ScreenData.commonHeaderInfo.accountName=SomeAccountName&ScreenData.commonHeaderInfo.effectiveDate=08%2F01%2F2011&ScreenData.sicInfo.natureOfBusiness=business&ScreenData.sicInfo.sic=7777&ScreenData.widgets.modified=ScreenData.sicInfo.natureOfBusiness&ScreenData.widgets.modified=ScreenData.sicInfo.sic

and this as its URL:

express?displayAction=CreateAccountWizard&saveAction=SaveCreateSICCode&flow=forward&saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3 

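But not only do I not know how to build this parsing engine, I can't even get the value pairs out of the FORM. I'm trying to use HtmlAgilityPack, and here is a bit of code that should at least print the "important" contents of the FORMs: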

var page = new HtmlDocument();
page.OptionReadEncoding = false;
var stream = HttpWResponse.GetResponseStream(); 
page.Load(stream);
foreach (var f in page.DocumentNode.Descendants("form"))
{
    foreach (var d in page.DocumentNode.Descendants("div"))
    {
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info((f.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")) + ": ");
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("method", "<no method>") + ' ');
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("action", "<no action>"));

        foreach(var i in f.Descendants("input"))//{

        {
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info('\t' + (i.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(" (");
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(i.GetAttributeValue("type", "<no type>"));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info("): " + i.GetAttributeValue("value", "<no value>"));
        }
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info("");
    }
}

But it only prints this:

INFO  EventsLogger - 
INFO  EventsLogger - expressform: 
INFO  EventsLogger - 
INFO  EventsLogger - post 

(If I get rid of the "div" bit, i.e. the foreach (var d in page.DocumentNode.Descendants("div")) loop, nothing changes.)


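Any help or suggestions on the FORM printout parser, and on how to build a parsing engine that constructs requests from responses, would be greatly appreciated.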

1 Answer

Check out "Parsing HTML page with HtmlAgilityPack", http://refactoringaspnet.blogspot.com/2010/04/using-htmlagilitypack-to-get-and-post_19.html, http://htmlagilitypack.codeplex.com/discussions/247206, and "How would I get the inputs from a certain form with HtmlAgility Pack? Lang: C#.net".

EDIT - some more info:

You loop over the forms in the HTML document with foreach, but in the next foreach you go after the DIVs of the whole document without referencing the current form. In the inner foreach loop(s) you need something similar to

foreach (var d in f.SelectNodes(".//div"))

and

foreach (var i in d.SelectNodes(".//input"))
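
One more thing worth checking, since your log stops right after the form's method: as far as I know, HtmlAgilityPack by default treats form as an element that has no children, so the inputs end up as siblings of the form and f.Descendants("input") finds nothing. Removing "form" from HtmlNode.ElementsFlags before loading the document usually fixes that. Also note that SelectNodes returns null rather than an empty collection when nothing matches, so Descendants is the safer choice in a foreach. Below is a minimal sketch of the whole step under those assumptions. FormPostBuilder, ReadFields, BuildBody and ExtractSubmitUrl are names I made up for illustration, the HTML is a cut-down version of yours, and checkbox/radio state, disabled fields and select elements are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
static class FormPostBuilder
{
    // Collect name/value pairs from the <input> and <textarea> elements of one form.
    public static List<KeyValuePair<string, string>> ReadFields(HtmlNode form)
    {
        var fields = new List<KeyValuePair<string, string>>();
        foreach (var node in form.Descendants())
        {
            if (node.Name != "input" && node.Name != "textarea") continue;

            var name = node.GetAttributeValue("name", null);
            if (string.IsNullOrEmpty(name)) continue;

            // A textarea carries its value as inner text, an input in its value attribute.
            var raw = node.Name == "textarea"
                ? node.InnerText
                : node.GetAttributeValue("value", "");
            fields.Add(new KeyValuePair<string, string>(name, WebUtility.HtmlDecode(raw)));
        }
        return fields;
    }

    // Join the pairs as application/x-www-form-urlencoded; repeated names stay in order.
    public static string BuildBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }

    // Pull the relative URL out of href="javascript:submitForm('...')".
    public static string ExtractSubmitUrl(string href)
    {
        var match = Regex.Match(WebUtility.HtmlDecode(href ?? ""), @"submitForm\('([^']+)'\)");
        return match.Success ? match.Groups[1].Value : null;
    }
}

static class Demo
{
    static void Main()
    {
        // Must run before the document is loaded, otherwise <form> gets no children.
        HtmlNode.ElementsFlags.Remove("form");

        var page = new HtmlDocument();
        page.LoadHtml(
            "<form id=\"expressform\" method=\"post\" action=\"\">" +
            "<input type=\"hidden\" name=\"versionPoint\" value=\"77777\"/>" +
            "<textarea name=\"ScreenData.sicInfo.natureOfBusiness\"></textarea>" +
            "<a href=\"javascript:submitForm('express?flow=forward&amp;saveAction=SaveCreateSICCode')\">Submit</a>" +
            "</form>");

        var form = page.DocumentNode.Descendants("form").First();
        var fields = FormPostBuilder.ReadFields(form);

        // Fill in the textarea the way a user would before submitting.
        var index = fields.FindIndex(f => f.Key == "ScreenData.sicInfo.natureOfBusiness");
        fields[index] = new KeyValuePair<string, string>(fields[index].Key, "business");

        var link = form.Descendants("a").First();
        Console.WriteLine(FormPostBuilder.ExtractSubmitUrl(link.GetAttributeValue("href", "")));
        Console.WriteLine(FormPostBuilder.BuildBody(fields));
    }
}
```

The extra ScreenData.widgets.modified=... entries at the end of your expected body are not in the HTML, so they are presumably added by the page's JavaScript (markDataDirty). If so, you would have to append them to the list yourself for every field you change before calling BuildBody.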
answered 2011-08-05T03:39:45.273