
I am using Spring WebClient to retrieve the Forbes Global 2000 list. It used to work fine, but they have now added a cookie consent modal that appears when accessing the JSON data at https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000

Technically, this is the network flow I can see from Chrome:

  1. GET https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000

    curl 'https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000' \
      -H 'authority: www.forbes.com' \
      -H 'pragma: no-cache' \
      -H 'cache-control: no-cache' \
      -H 'dnt: 1' \
      -H 'upgrade-insecure-requests: 1' \
      -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36' \
      -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
      -H 'sec-fetch-site: none' \
      -H 'sec-fetch-mode: navigate' \
      -H 'sec-fetch-user: ?1' \
      -H 'sec-fetch-dest: document' \
      -H 'accept-language: it-IT,it;q=0.9,en-GB;q=0.8,en;q=0.7,ru-RU;q=0.6,ru;q=0.5,en-US;q=0.4' \
      --compressed

  2. 302 redirect to https://www.forbes.com/consent/?toURL=https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000


  3. The consent modal is displayed.
  4. Clicking Accept (Accetto) sends a POST to https://consent-pref.trustarc.com/defaultpreferencemanager/truste
  5. In the Network tab I see no further redirects, but the initial JSON is served.
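
The first two steps of this flow can be checked outside the browser. The following is a minimal sketch, assuming only the JDK's `java.net.http` client (Java 11+) rather than Spring WebClient; the class name `ConsentProbe` and both helper methods are hypothetical names introduced for illustration. It sends one GET with redirect following disabled and reports whether the response is a 3xx whose `Location` points at a `/consent/` path.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

// Hypothetical helper class, not part of the original question.
class ConsentProbe {

    // True when the status is a redirect (3xx) and the Location header
    // points at a path containing "/consent/", as in step 2 above.
    static boolean isConsentRedirect(int status, Optional<String> location) {
        return status >= 300 && status < 400
                && location.map(l -> l.contains("/consent/")).orElse(false);
    }

    // Sends a single GET without following redirects and reports
    // whether the consent redirect was returned. Requires network access.
    static boolean hitsConsentWall(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return isConsentRedirect(
                response.statusCode(),
                response.headers().firstValue("Location"));
    }
}
```

Calling `ConsentProbe.hitsConsentWall(...)` with the JSON URL above should indicate whether a given set of request headers still triggers the consent redirect; what the server actually returns may of course change over time.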

Is there any way to skip, bypass, or accept such a consent using Spring WebClient? A curl or Postman example would also be enough.

Update

This is the GET that returns the JSON. It carries some cookies, so perhaps that is why it works. However, I need to automate this.

curl 'https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000' \
  -H 'authority: www.forbes.com' \
  -H 'pragma: no-cache' \
  -H 'cache-control: no-cache' \
  -H 'dnt: 1' \
  -H 'upgrade-insecure-requests: 1' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36' \
  -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'sec-fetch-site: none' \
  -H 'sec-fetch-mode: navigate' \
  -H 'sec-fetch-user: ?1' \
  -H 'sec-fetch-dest: document' \
  -H 'accept-language: it-IT,it;q=0.9,en-GB;q=0.8,en;q=0.7,ru-RU;q=0.6,ru;q=0.5,en-US;q=0.4' \
  -H 'cookie: notice_behavior=expressed,eu; notice_gdpr_prefs=0,1,2:1a8b5228dd7ff0717196863a5d28ce6c; notice_preferences=2:1a8b5228dd7ff0717196863a5d28ce6c; client_id=8ec8eb2817c5d8913fe8f17384e17644a7e; global_ad_params=%7B%7D; __aaxsc=0; _ga=GA1.2.1634735456.1605525570; _gid=GA1.2.1103122519.1605525570; _fbp=fb.1.1605525570288.36176089; crdl_forbes-livebID=aa58b21d-aa96-4ee0-8340-fb48af0c1a94; crdl_forbes-liveaID=anonimIcgdCVIkNG7BUNba5etK; rbzid=l/U6Jw1r1yWT74cdJuQgosf28tekVPrGz9xIBrFnVQIm5sDTTvTWn/1+r8KTgAdbwgwm63USRoqfI+JwWc3BgQoGTMvyrKX2VRjibHRVHjUSGWqZKX+XhkPhAl2YVYGXbbgLmximPCVub7cvx/UClbnBmD33n1qHOtWLY9urqrdhtD5OGF4BLBZ0L5m0YGx1EKQb+AkPBgt9hT3pryTWI4S84CzcPLIsJTmCtn7YRLgplAvwGe1EA/W8a90o6dER+bVJ+Sy/8/dWswABzRNVwQ==; rbzsessionid=d40c35ba621223231fd16c42b943f00a; __pnahc=0; __tbc=%7Bjzx%7DIFcj-ZhxuNCMjI4-mDfH1NUCZW7-CmMn43P4peooi3pN7s1U6SLs77TJd0_MaIDk7egQVHeoSjUK2k3sjldowD8Ff5A-5DxOArBhtCY3kQmc3ozP8vUOOShHwL0JdQDA3E9l5ifC0NcYbET3aSZxuA; __pat=-18000000; _cb_ls=1; _cb=BLfE4GBNSwJsLxiQ6; AMP_TOKEN=%24NOT_FOUND; __qca=P0-1753552556-1605535506773; mnet_session_depth=2%7C1605535504670; aasd=2%7C1605535505885; __pvi=%7B%22id%22%3A%22v-2020-11-16-15-05-06-660-0XYRWHdQ2XYGbaoP-46b9447a9edebf759b8530b76e3d77fb%22%2C%22domain%22%3A%22.forbes.com%22%2C%22time%22%3A1605535556929%7D; xbc=%7Bjzx%7DTSec02msUmzkbggY3uSB4x2gGnm1UBtgCEjruTUCsjodOn-ho2bBitYYAE8Wj1i1AsluUTc70wVCtGSzun5713YJ_vNU7uygAa3M3ny-XzzOp_CsCgfU3k8e_J8gocbcboz88mUS39sdnXYxb82rI1XecPrFioPfHcuQebQwPSy3ueG7x_j8wu6BCd9t0FTHUw-9eIvfWF1WWnXIvcJfs3MZ5QDIUBA9qEt2a7BQs_lhesgShrkv-PkBiPzR60tXNA8DkrlHZF_puETTlo-Xiw; _chartbeat2=.1594934728317.1605535561012.0000000000000001.BFC6ndInmcUBhHH5EC_OdRH7Niui.1; 
QSI_HistorySession=https%3A%2F%2Fwww.forbes.com%2Fglobal2000%2F%235197f5e1335d~1605525574002%7Chttps%3A%2F%2Fwww.forbes.com%2Fglobal2000%2F%234f485ec3335d~1605525601420%7Chttps%3A%2F%2Fwww.forbes.com%2Fglobal2000%2F%2367f91b9335d8~1605535512891%7Chttps%3A%2F%2Fwww.forbes.com%2Fglobal2000%2F%234c10589f335d~1605535561525' \
  --compressed

1 Answer


The solution is to supply some extra cookies with the request. The minimal set of cookies that makes it work is:

curl 'https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000' \
  -H 'cookie: notice_behavior=expressed,eu; notice_gdpr_prefs=0,1,2:1a8b5228dd7ff0717196863a5d28ce6c;' \
  --compressed
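
For reference, the same request can be sketched in plain Java. This is only an illustration, assuming the JDK's `java.net.http` API (Java 11+) instead of WebClient; the class name `ForbesRequest` and its constants are hypothetical. It passes the cookie string from the curl command above as one raw `Cookie` header, so the values are sent verbatim. The `main` method only builds and prints the request; sending it is left as a comment.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical class name; the URL and cookie values come from the curl command above.
class ForbesRequest {
    static final String ENDPOINT =
            "https://www.forbes.com/forbesapi/org/global2000/2020/position/true.json?limit=2000";

    // The two consent cookies, written exactly as in the curl example.
    static final String CONSENT_COOKIES =
            "notice_behavior=expressed,eu; "
            + "notice_gdpr_prefs=0,1,2:1a8b5228dd7ff0717196863a5d28ce6c";

    // Builds a GET request that carries the cookies as a single raw header.
    static HttpRequest build() {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Cookie", CONSENT_COOKIES)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build();
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Cookie: " + request.headers().firstValue("Cookie").orElse(""));
        // To send it (needs network access):
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```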

Because WebClient complained about invalid characters (: and ,), I ended up with the following minimal cookie set:

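// BASE_URL and MB5 are constants defined elsewhere in the configuration class.
@Bean
public WebClient forbesWebClient() {
    return WebClient.builder()
            .baseUrl(BASE_URL)
            // Consent cookies, trimmed to values without ':' or ',' so WebClient accepts them.
            .defaultCookies(cookies -> {
                cookies.add("notice_behavior", "expressed");
                cookies.add("notice_gdpr_prefs", "0");
            })
            // Raise the in-memory buffer limit for the JSON response.
            .codecs(configurer -> configurer.defaultCodecs().maxInMemorySize(MB5))
            .build();
}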
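
To see which characters in the original cookie values are likely to be rejected, here is a small sketch that checks a value against the `cookie-octet` grammar of RFC 6265. Whether the HTTP client underneath WebClient applies exactly this rule is an assumption; the class `CookieOctets` and its method are hypothetical names. Under this grammar the comma is not permitted while the colon is, so the validator you hit may be stricter than the RFC or may simply report the comma.

```java
// Hypothetical helper that checks cookie values against RFC 6265 cookie-octet.
class CookieOctets {

    // Returns the index of the first character outside the cookie-octet set
    // (%x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E), or -1 if all are allowed.
    static int firstInvalidIndex(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean allowed = c == 0x21
                    || (c >= 0x23 && c <= 0x2B)
                    || (c >= 0x2D && c <= 0x3A)
                    || (c >= 0x3C && c <= 0x5B)
                    || (c >= 0x5D && c <= 0x7E);
            if (!allowed) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstInvalidIndex("expressed,eu")); // 9 (the comma)
        System.out.println(firstInvalidIndex("expressed"));    // -1
        System.out.println(firstInvalidIndex("0,1,2:1a8b"));   // 1 (the first comma)
        System.out.println(firstInvalidIndex("0"));            // -1
    }
}
```

The trimmed values `expressed` and `0` used in the bean above pass this check, which is consistent with WebClient accepting them.
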
answered 2020-11-16T15:07:52.767