javascript - Why does UrlFetchApp.fetch(url).getContentText() return inconsistent results?

Question

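I am iterating over a list of Amazon products to collect the negative reviews for each product. However, about 90% of the time, UrlFetchApp.fetch("url").getContentText(); returns a short version of the HTML that contains none of the actual page content.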

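Right now I force the loop to retry again and again until the correct (longer) HTML comes back, but this takes a long time. Does anyone know of an alternative, or any trick to make this more consistent?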

For example, here is one of the URLs (does the dynamic nature of the page have anything to do with the inconsistency?):

https://www.amazon.com/product-reviews/B00J2DGTD8/ref=cm_cr_arp_d_viewopt_srt?ie=UTF8&filterByStar=one_star&reviewerType=all_reviews&pageNumber=1&sortBy=recent#reviews-filter-bar

Here is a minimal reproducible example:

function mre() {

  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getActiveSheet();
  var lastRow = sheet.getLastRow();

  var star = ["one", "two", "three"];

  for(var a = 0; a < 3; a++) { //star 

    // Declared with var so these are not implicit globals
    var htmlRaw = UrlFetchApp
      .fetch(`https://www.amazon.com/product-reviews/B00J2DGTD8/ref=cm_cr_arp_d_viewopt_srt?ie=UTF8&filterByStar=${star[a]}_star&reviewerType=all_reviews&pageNumber=1&sortBy=recent#reviews-filter-bar`).getContentText();

    var html = htmlRaw.toString();
  
    //Logger.log(htmlRaw);
    //Logger.log(html);

    //Making sure we have the correct HTML
    //The below if statement acts like a gate. It tries again and again until it receives the correct version of the HTML. Then it will "open the gate" to the else{} below
  
    if(html.substring(100000, 100500) == ``) { //the longer (correct) version of the HTML has 100,000+ characters
      
      Logger.log(`if its empty, then try again: ${html.substring(100000, 100500)}`);
      a--;
    }

    else {

      lastRow = sheet.getLastRow(); //so we actually move down the rows
      sheet.getRange(lastRow+1, 1).setValue(html.substring(100000, 100500)); //html snippet

    } //else (correct html)
  } //for (a - stars)
} //function mre()
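
The retry gate above has no upper bound, so if the long HTML never arrives the loop spins until the script times out. A minimal sketch of a bounded retry with an increasing delay is shown below. The helper name `fetchWithRetry`, its parameters, and the default values are hypothetical and not part of the original script; the fetch and sleep functions are passed in so the helper can be exercised outside Apps Script. This only caps the retrying and is not guaranteed to make the server return the full page more often.

```javascript
// Hypothetical helper (not from the original script): retry a fetch until the
// body looks complete, giving up after a fixed number of attempts.
//   fetchText: function(url) -> string with the response body
//   sleep:     function(ms) that pauses for the given number of milliseconds
//   options:   { minLength, maxAttempts, baseDelayMs } (all assumed defaults)
function fetchWithRetry(url, fetchText, sleep, options) {
  var opts = options || {};
  var minLength = opts.minLength || 100000; // the question's "long HTML" threshold
  var maxAttempts = opts.maxAttempts || 5;
  var baseDelayMs = opts.baseDelayMs || 1000;

  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var html = fetchText(url);
    if (html.length >= minLength) {
      return html; // looks like the full page
    }
    if (attempt < maxAttempts) {
      // Wait longer after each short response: 1x, 2x, 4x, ... the base delay
      sleep(baseDelayMs * Math.pow(2, attempt - 1));
    }
  }
  return null; // every attempt returned the short version
}
```

Inside Apps Script, the two injected functions could be `function (u) { return UrlFetchApp.fetch(u, { muteHttpExceptions: true }).getContentText(); }` and `Utilities.sleep`. The caller would then skip a star rating when the helper returns `null` instead of decrementing `a`.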
