
After reading some examples, I want to implement a crawler that logs in to a site, for example:

https://target.helpshift.com/login/?next=%2Fadmin%2Fissues%2F

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class JsouptTest {

    public static void main(String[] args) throws Exception {
        int x = 1;
        Connection.Response loginForm = Jsoup.connect("https://target.helpshift.com/login/?next=%2Fadmin%2Fissues%2F" + x + "%2F")
                .method(Connection.Method.GET)
                .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0")
                .execute();

        Document document = Jsoup.connect("https://target.helpshift.com/login/")
                .data("cookieexists", "false")
                .data("username", "email@example.com")
                .data("password", "123456")
                .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0")
                .cookies(loginForm.cookies())
                .post();
        System.out.println(document);

    }

}

However, I get this error:

Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=https://target.helpshift.com/login/
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:537)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
    at org.jsoup.helper.HttpConnection.post(HttpConnection.java:200)
    at edu.utfpr.helpcrawler.JsouptTest.main(JsouptTest.java:32)


1 Answer


If you inspect the request headers the browser sends, you will see that it sends the cookies just like you do, but it also includes part of the cookie (the CSRF token) in the form data. Add this to your second request:

.data("_csrf_token", loginForm.cookie("_csrf_token"))