I'm using C# WebRequest to screen-scrape a website that uses ASP.NET Forms Authentication.

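First, the application performs a GET on the login page and extracts the __VIEWSTATE and __EVENTVALIDATION values from the hidden input fields, plus the .NET SessionId from the cookie. Next, it performs a POST to the form action with the username, password, the other required form fields, and the three .NET variables mentioned above.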

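Based on a Fiddler session in which I authenticated to the site with Chrome, I expect a 302 with a token stored in a cookie that allows navigating the secure areas of the site. I don't understand why I keep getting a 302 without the token, redirecting me to the site's unauthenticated home page. In Fiddler, my application's requests look exactly the same as those made from Chrome or Firefox.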

        // Create the initial GET request for the login page.
        var request = (HttpWebRequest)WebRequest.Create(LoginUrl);
        // Container used to carry cookies from the GET over to the POST.
        _container = new CookieContainer();

        request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
        request.Headers["Accept-Language"] = "en-US,en;q=0.8";

        var response = (HttpWebResponse)request.GetResponse();
        _container.Add(response.Cookies);

        string responseFromServer;

        using (var decompress = new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))
        {
            using (var reader = new StreamReader(decompress))
            {
                // Read the content.
                responseFromServer = reader.ReadToEnd();
            }
        }

        var doc = new HtmlDocument();
        doc.LoadHtml(responseFromServer);

        var hiddenFields = doc.DocumentNode.SelectNodes("//input[@type='hidden']").ToDictionary(input => input.GetAttributeValue("name", ""), input => input.GetAttributeValue("value", ""));

        request = (HttpWebRequest)WebRequest.Create(LoginUrl);

        request.Method = "POST";
        request.CookieContainer = _container;

        // Create POST data and convert it to a byte array.  Modify this line accordingly
        var postData = String.Format("ddlsubsciribers={0}&memberfname={1}&memberpwd={2}&chkRemberMe=true&Imgbtn=LOGIN&__EVENTTARGET&__EVENTARGUMENT&__LASTFOCUS", Agency, Username, Password);
        postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + field.Value));

        ServicePointManager.ServerCertificateValidationCallback = AcceptAllCertifications;

        var byteArray = Encoding.UTF8.GetBytes(postData);
        //request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
        // Set the ContentType property of the WebRequest.
        request.ContentType = "application/x-www-form-urlencoded";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
        request.Headers["Accept-Language"] = "en-US,en;q=0.8";
        // Set the ContentLength property of the WebRequest.
        request.ContentLength = byteArray.Length;
        // Get the request stream.
        var dataStream = request.GetRequestStream();
        // Write the data to the request stream.
        dataStream.Write(byteArray, 0, byteArray.Length);
        // Close the Stream object.
        dataStream.Close();
        // Get the response.
        response = (HttpWebResponse)request.GetResponse();
        _container.Add(response.Cookies);

        // Clean up the response (the request stream was already closed above).
        response.Close();

1 Answer

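It turned out that some funky characters in the __EVENTVALIDATION variable were being encoded as line breaks, and ASP.NET then assumed the session was corrupted and discarded it. The solution is to escape the ASP.NET variables with Uri.EscapeDataString.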

postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + Uri.EscapeDataString(field.Value)));
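
The same idea can be pulled out into a small helper. This is only a sketch: `FormBody` and `Build` are hypothetical names that do not appear in the question, and it assumes the form fields are available as name/value pairs, as `hiddenFields` is above. It escapes the field names as well as the values, since either could contain characters such as `+`, `/`, `=` or `&` that have a special meaning in an application/x-www-form-urlencoded body.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original code.
public static class FormBody
{
    // Joins name/value pairs into an application/x-www-form-urlencoded body,
    // percent-escaping both the names and the values.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

With the question's variables, this could be used as `postData += "&" + FormBody.Build(hiddenFields);`. A value such as `/wEW+A==` would be sent as `%2FwEW%2BA%3D%3D`. The Agency, Username and Password values passed to String.Format are inserted unescaped in the question's code, so they would presumably need the same treatment if they can contain reserved characters.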
answered 2013-09-16T06:32:22.040