6

我得到了一份应该是最新的员工名单,但它与用 ASP.NET 编写的 Intranet People Finder 不匹配。

由于信息很敏感,我无法访问 People Finder 正在使用的数据库,因此我获取信息的唯一方法是从顶部的顶层黄铜开始刮取结构,然后依次遍历每一层。

每个人都有一个员工编号,然后形成 URL http://intranet/peoplefinder/index.aspx?srn=ABC1234,然后所有向他们报告的人以<a id="gvEmployees_ctl03_lnkFullName" href="index.aspx?srn=ABC4321" target="_self">每个 URL 指示员工编号并提供指向其团队的链接的格式列在下方。

当团队很大时,问题就出现了,因为分页是在 GridView 中使用 URL 实现的,例如<a href="javascript:__doPostBack('gvEmployees','Page$2')">2</a>.

我将如何抓取此页面,捕获 SRN 和其他详细信息以及在 GridView 的所有页面上向该人报告的人,然后遍历每个报告人并执行相同的过程,直到整个列表完成?

结果的示例 HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head><title>
    People Finder: Name Surname
</title><link rel="stylesheet" href="/path/to/style.css" type="text/css" /><link rel="stylesheet" href="/path/to/anotherStyle.css" type="text/css" />
    <script type="text/javascript" src="/path/to/peoplefinder.js"></script>
</head>
<body>
    <form name="form1" method="post" action="/path/to/index.aspx" id="form1">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="### ViewState ###" />
</div>

<script type="text/javascript">
<!--
var theForm = document.forms['form1'];
if (!theForm) {
    theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
// -->
</script>


<script src="/path/to/WebResource.axd?d=AueXWrgAf8xSxMTAt1Q4AA2&amp;t=633311832634916698" type="text/javascript"></script>

        <div class="HP3CHeader">
            <div id="LWHPBanner">
                <h1><span id="lblName">Name Surname</span></h1>
            </div>
        </div>

        <div id='CPMain'>
            <div id="mainBox">

            <div id="pnlEmployeeDetails">

                <div id='basicData'>
                    <img id="imgPhoto" class="photo" src="/path/to/photo.jpg" style="height:69px;width:69px;border-width:0px;" />
                    <span id="lblBusinessUnit">Business Unit</span>
                    <span id="lblCostCentreName">Cost Centre</span>
                    <span id="lblLocation">Location</span>

                    <a href='/path/to/checkcontactdetails.htm' target='_blank' onclick='return OpenCheckContactDetails();' >Find out how to change your details/photo.</a>
                    <div id="manager">
        <strong>Reports to: </strong><a id="hlManager" href="/path/to/index.aspx?srn=ABC1234">Name Surname</a>
    </div>
                </div>

                <div id='contactData'>

                    <div id="pnlSrn">
        <strong>Staff number:</strong> <span id="lblSrn">ABC1234</span>
    </div>


                    <div id="pnlEmailAddress">
        <strong>Email Address:</strong> <span id="lblEmailAddress">Email</span>
    </div>
                    <div style="clear: both"></div>
                </div>

</div>

            <div id="pnlGrid">

                <h3><span id="lblGridTitle">Name's team</span></h3>
            <div>
        <table class="subordinates" cellspacing="0" cellpadding="2" rules="cols" border="1" id="gvEmployees" style="border-style:None;border-collapse:collapse;">
            <tr style="color:Black;background-color:#EFF3FB;border-style:None;font-weight:bold;">
                <th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$SRN')" style="color:Black;">SRN</a></th><th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$FullName')" style="color:Black;">Full name</a></th><th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$RACFID')" style="color:Black;">RACFID</a></th>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl02_lnkFullName" href="index.aspx?srn=1K5932" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl03_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl04_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl05_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl06_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl07_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl08_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl09_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl10_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl11_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="PagerStyle" style="color:#000039;border-style:None;">
                <td colspan="3"><table border="0">
                    <tr>
                        <td><span>1</span></td><td><a href="javascript:__doPostBack('gvEmployees','Page$2')" style="color:#000039;">2</a></td>
                    </tr>
                </table></td>
            </tr>
        </table>
    </div>

</div>
            </div>

            <div id="searchBox">
                <strong>Search People Finder:</strong>
                <br /><br />
                <span>Forename:</span><br/>
                <span><input name="txtFirstname" type="text" id="txtFirstname" /></span><br/>
                <span>Surname:</span><br/>
                <span><input name="txtSurname" type="text" id="txtSurname" /></span><br/>
                <span>RACFID:</span><br/>
                <span><input name="txtRacfid" type="text" id="txtRacfid" /></span><br/>
                <span>Staff number:</span><br/>
                <span><input name="txtSrn" type="text" id="txtSrn" /></span><br/>
                <div class="searchBoxItem" style="text-align:center;width:100%"><input type="submit" name="btnFind" value="Search" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;btnFind&quot;, &quot;&quot;, false, &quot;&quot;, &quot;index.aspx&quot;, false, false))" id="btnFind" title="Search for employees member" class="button" style="border-style:Outset;" /></div><br/> 
                <div>People Finder searches only UK staff.</div> 
               <!-- <div><a class="execBoardLink" href="/path/to/index.aspx?srn=ABC1234">Show Executive Board</a></div> -->
                <div style="margin-top:5px;"><a href="/path/to/phonebook" target="phoneBook" onclick='return OpenPhonebook();' title="Open Phonebook in new window">Open Phonebook</a></div>
            </div>
        </div>

        <div class="contentFooter"  style="text-align:center;">
            <table width="100%" cellpadding="0" cellspacing="0" border="0" summary="Navigation layout table">
                <tr>
                    <td align="left"><span class="linkArrow">&lt;</span> <a href="javascript:history.back();">Back</a></td>
                    <td align="center"></td>
                    <td align="right"><span class="linkArrow">^ </span><a href="#top">Top</a></td>
                </tr>
            </table>
        </div> 

<div>

    <input type="hidden" name="__PREVIOUSPAGE" id="__PREVIOUSPAGE" value="vy066Txz34y1E515UsTSTDabHKEmdBRCsq7xM0lpJls1" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWCgKM3uTTAgLP/83pDwLfwaTTAQKNguzjCAKt98LeCwLZh62pDwKKqdGpBwLd2q7jAwKa+5aMBAL5zb65C42zY4GBEUKujhjtZ/hZ8sLESfiF" />
</div></form>
</body>
</html>
4

2 回答 2

3

您可以将变量发布到 HTML 页面以进行分页。

string lcUrl = "http://www.mysite.com/page.aspx";

HttpWebRequest loHttp =

   (HttpWebRequest) WebRequest.Create(lcUrl);


// *** Send any POST data

string lcPostData =

   "gvEmployees=" + HttpUtility.UrlEncode("Page$2");

loHttp.Method="POST";

byte [] lbPostBuffer = System.Text.           

                       Encoding.GetEncoding(1252).GetBytes(lcPostData);

loHttp.ContentLength = lbPostBuffer.Length;

Stream loPostData = loHttp.GetRequestStream();

loPostData.Write(lbPostBuffer,0,lbPostBuffer.Length);

loPostData.Close();

HttpWebResponse loWebResponse = (HttpWebResponse) loHttp.GetResponse();

Encoding enc = System.Text.Encoding.GetEncoding(1252);

StreamReader loResponseStream =

   new StreamReader(loWebResponse.GetResponseStream(),enc);

string lcHtml = loResponseStream.ReadToEnd();

loWebResponse.Close();

loResponseStream.Close();

然后从字符串中解析出你需要的数据。

- 编辑 -

这是我将尝试(类似的)发送所有帖子数据的方法:

string lcPostData =

       "__EVENTTARGET" + HttpUtility.UrlEncode("gvEmployees"); &
"__EVENTARGUMENT" + HttpUtility.UrlEncode("Page%242"); &
"__VIEWSTATE" + HttpUtility.UrlEncode("<Value of _Viewstate>");
于 2010-03-15T18:30:08.540 回答
1

您打开提琴手并打开 asp.net 网站表的第二页。转到该特定页面会话的提琴手中的 webforms 选项卡,并在正文中检查正在发布的变量是什么。以相同的序列格式合并所有变量并使用发布数据HttpWeb 请求。就我而言,它是:

string PostData = "__EVENTTARGET=" 
    + HttpUtility.UrlEncode("ctl00$ContentPlaceHolder2$grdDirectory") 
    + "&"
    + "__EVENTARGUMENT="+HttpUtility.UrlEncode("Page$2") 
    + "&"
    + "__VIEWSTATE="+ HttpUtility.UrlEncode(view_state)
    + "&"
    + "__VIEWSTATEGENERATOR=" 
    + HttpUtility.UrlEncode(viewstategenerator)
    + "&"
    + "__VIEWSTATEENCRYPTED=" 
    + HttpUtility.UrlEncode(viewstateencrypted) 
    + "&" 
    + "__EVENTVALIDATION=" + HttpUtility.UrlEncode(eventvalidation);

希望它会起作用。

于 2018-02-01T12:15:05.557 回答