I use this code to log in:

CookieCollection cookies = new CookieCollection();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("example.com");
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(cookies);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
cookies = response.Cookies;

string getUrl = "example.com";
string postData = String.Format("my parameters");
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(cookies);
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";

byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream();
newStream.Write(byteArray, 0, byteArray.Length);
newStream.Close();

HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
{
        doc.LoadHtml(sr.ReadToEnd());
        webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
}

Then I want to use HtmlWeb (HtmlAgilityPack) or WebClient to parse the HTML into an HtmlDocument (HtmlAgilityPack).

My problem is that when I use:

WebClient wc = new WebClient();
webBrowser1.DocumentText = wc.DownloadString(site);

or

doc = web.Load(site);
webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;

the login is lost, so I think I have to pass the cookies along somehow. Any suggestions?

3 Answers

21

Check out: HtmlAgilityPack.HtmlDocument Cookies

Here is an example of what you are looking for (the syntax is not 100% tested; I just modified some classes I usually use):

public class MyWebClient
{
    //The cookies will be here.
    private CookieContainer _cookies = new CookieContainer();

    //In case you need to clear the cookies
    public void ClearCookies() {
        _cookies = new CookieContainer();
    }

    public HtmlDocument GetPage(string url) {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        //Set more parameters here...
        //...

        //This is the important part.
        request.CookieContainer = _cookies;

        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        var stream = response.GetResponseStream();

        //When you get the response from the website, the cookies will be stored
        //automatically in "_cookies".

        using (var reader = new StreamReader(stream)) {
            string html = reader.ReadToEnd();
            var doc = new HtmlDocument();
            doc.LoadHtml(html);
            return doc;
        }
    }
}

Here is how you use it:

var client = new MyWebClient();
HtmlDocument doc = client.GetPage("http://somepage.com");

//This request will be sent with the cookies obtained from the page
doc = client.GetPage("http://somepage.com/another-page");

Note: if you also want to use the POST method, just create a method similar to GetPage with the POST logic, refactor the class, and so on.
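
A minimal sketch of what such a POST counterpart might look like. The class name MyPostingWebClient, the method name PostPage, and the choice of UTF-8 for the form body are assumptions made for illustration, not part of the original answer. The only essential point is that every request reuses the same CookieContainer.

```csharp
using System.IO;
using System.Net;
using System.Text;
using HtmlAgilityPack;

// Hypothetical class: same idea as MyWebClient, but sending a form POST.
public class MyPostingWebClient
{
    // Shared by every request, so cookies set at login are sent back later.
    private readonly CookieContainer _cookies = new CookieContainer();

    public HtmlDocument PostPage(string url, string postData)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // The important part: reuse the shared container.
        request.CookieContainer = _cookies;

        // Assumption: the server accepts a UTF-8 encoded form body.
        byte[] body = Encoding.UTF8.GetBytes(postData);
        request.ContentLength = body.Length;
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(reader.ReadToEnd());
            return doc;
        }
    }
}
```

In a real refactoring, GetPage and PostPage would probably live in the same class so that the login POST and the later GETs share one cookie container.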

answered 2013-03-05T06:47:26.297
3

There are some suggestions here: Using CookieContainer with WebClient class
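
For context, the usual way to do this is to subclass WebClient and attach a CookieContainer to every request it creates. The sketch below shows that pattern. The class name CookieAwareWebClient and the Cookies property are illustrative names, not taken from the linked question.

```csharp
using System;
using System.Net;

// Hypothetical subclass: a WebClient that sends and stores cookies.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer _cookies = new CookieContainer();

    // Exposed so that cookies from an earlier login response can be added.
    public CookieContainer Cookies
    {
        get { return _cookies; }
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests support cookies.
        HttpWebRequest httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.CookieContainer = _cookies;
        }
        return request;
    }
}
```

With this in place, something like client.Cookies.Add(cookies) followed by doc.LoadHtml(client.DownloadString(site)) should carry the login over, assuming cookies is the CookieCollection taken from the login response.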

However, it is probably easier to just keep using HttpWebRequest and set the cookies in its CookieContainer.

The code looks something like this:

// Create a HttpWebRequest
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(getUrl);

// Create the cookie container and add a cookie
request.CookieContainer = new CookieContainer();

// Add all the cookies
foreach (Cookie cookie in response.Cookies)
{
    request.CookieContainer.Add(cookie);
}

The second thing is that you don't need to download the site again, since you already have it from your web response and you are saving it here:

HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
{
        doc.LoadHtml(sr.ReadToEnd());
        webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
}

You should be able to just take the HTML and parse it with the HTML Agility Pack:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(webBrowser1.DocumentText);

That should do it... :)

answered 2013-03-04T17:21:42.197
2

Try caching the cookies from the previous response locally and resending them with every web request, like this:

private CookieCollection cookieCollection;

...

    parserObject = new HtmlWeb
                {
                    AutoDetectEncoding = true,
                    PreRequest = request =>
                    {
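                        // The request may not have a cookie container yet.
                        if (request.CookieContainer == null)
                            request.CookieContainer = new CookieContainer();
                        // Resend the cookies cached from the previous response.
                        if (cookieCollection != null)
                            request.CookieContainer.Add(cookieCollection);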
                        return true;
                    },
                    PostResponse = (request, response) => { cookieCollection = response.Cookies; }
                };
answered 2017-04-05T12:41:51.190