dom - Does Google Apps Script have something like getElementById?

Question

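I am going to use Google Apps Script to fetch the programme list from a radio station's website. How can I select a specific element in the web page by specifying its id, so that I can get the programmes listed on the page?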

score 4 · Accepted Answer

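Edit, December 2013: Google has deprecated the old Xml service and replaced it with XmlService. The scripts in this answer have been updated to use the new service. The new service requires standards-compliant XML and HTML, whereas the old one tolerated problems such as missing closing tags.

See the tutorial: Parsing an XML Document. (As of December 2013, this tutorial is still online even though the Xml service has been deprecated.) Starting from that foundation, you can use the XML parsing in Script Services to navigate the page. Here is a small script that operates on your example: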

function getProgrammeList() {
  var txt = '<html> <body> <div> <div> <div id="here">hello world!!</div> </div> </div> </body> </html>';

  // Put the received XML response into an XmlService document
  var doc = XmlService.parse(txt);

  // Walk down from the root <html> element to the innermost <div>
  var div = doc.getRootElement().getChild('body')
               .getChild('div').getChild('div').getChild('div');
  Logger.log(div.getAttribute('id').getValue() + " = "
            + div.getText());    /// here = hello world!!

  debugger;  // Pause in debugger - examine content of doc
}

To get the real page, start with this:

var url = 'http://blah.blah/whatever?querystring=foobar';
var txt = UrlFetchApp.fetch(url).getContentText();
....
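Because the new service only accepts well-formed markup, a whole real-world page will often fail to parse. One possible workaround is to cut out just the fragment you care about before parsing it. The sketch below is only an illustration: `sliceFragment`, the sample page, and the `shows` id are hypothetical names, and it assumes the fragment between the two markers is itself well-formed.

```javascript
/**
 * Hypothetical helper: return the part of `html` that starts at `startMarker`
 * and ends with the first `endMarker` found after it, or null if either
 * marker is missing.
 */
function sliceFragment(html, startMarker, endMarker) {
  var start = html.indexOf(startMarker);
  if (start === -1) return null;
  // Look for the end marker only after the start marker
  var end = html.indexOf(endMarker, start + startMarker.length);
  if (end === -1) return null;
  return html.substring(start, end + endMarker.length);
}

// Made-up page with an unclosed <br> outside the fragment of interest
var page = '<html><body><br><ul id="shows"><li>News</li></ul></body></html>';
var fragment = sliceFragment(page, '<ul id="shows"', '</ul>');
// In Apps Script, `fragment` could then be handed to XmlService.parse(fragment)
```

The first-match rule for the end marker means this breaks if the fragment contains a nested element with the same closing tag, so pick markers accordingly.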

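If you look at the documentation for getElements, you'll see that it supports retrieving specific tags, such as "div". This finds the direct children of a specific element; it does not explore the whole XML document. You should be able to write a function that traverses the document, checking the id of each div element, until it finds your programme list.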

var programmeList = getDivById(doc,"here");

EDIT - I couldn't resist...

Here is a utility function that does just that.

/**
 * Find a <div> tag with the given id.
 * <pre>
 * Example: getDivById( html, 'tagVal' ) will find
 * 
 *          <div id="tagVal">
 * </pre>
 *
 * @param {Element|Document}
 *                     element     XML document or element to start search at.
 * @param {String}     id      HTML <div> id to find.
 *
 * @return {Element}           First matching element (in doc order) or null.
 */
function getDivById( element, id ) {
  // Call utility function to do the work.
  return getElementByVal( element, 'div', 'id', id );
}

/**
 * !Now updated for XmlService!
 *
 * Traverse the given Xml Document or Element looking for a match.
 * Note: 'class' is stripped during parsing and cannot be used for
 * searching, I don't know why.
 * <pre>
 * Example: getElementByVal( body, 'input', 'value', 'Go' ); will find
 * 
 *          <input type="submit" name="btn" value="Go" id="btn" class="submit buttonGradient" />
 * </pre>
 *
 * @param {Element|Document}
 *                     element     XML document or element to start search at.
 * @param {String}     elementType XML element type, e.g. 'div' for <div>
 * @param {String}     attr        Attribute or Property to compare.
 * @param {String}     val         Search value to locate
 *
 * @return {Element}               First matching element (in doc order) or null.
 */
function getElementByVal( element, elementType, attr, val ) {
  // Get all descendants, in document order
  var descendants = element.getDescendants();
  for (var i = 0; i < descendants.length; i++) {
    var content = descendants[i];
    // We'll only examine ELEMENTs
    if (content.getType() == XmlService.ContentTypes.ELEMENT) {
      var candidate = content.asElement();
      if (candidate.getName() === elementType) {
        // getAttribute() returns null when the attribute is absent
        var attribute = candidate.getAttribute(attr);
        if (attribute !== null && attribute.getValue() === val) {
          return candidate;
        }
      }
    }
  }
  // No matches in document
  return null;
}

Applying this to your example, we get:

function getProgrammeList() {
  var txt = '<html> <body> <div> <div> <div id="here">hello world!!</div> </div> </div> </body> </html>';

  // Parse the received XML response into an XML document
  var doc = XmlService.parse(txt);

  var found = getDivById(doc.getRootElement(), 'here');
  Logger.log(found.getAttribute('id').getValue()
             + " = "
             + found.getValue());    /// here = hello world!!
}

Note: for a real-world example that uses these utilities, see this answer.

score 1 · Accepted Answer

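I assume you are referring to the fetch() method of UrlFetchApp. In that case, and in the sense you have in mind, the answer is no.

If you look at the return type in the fetch() documentation, it returns an HTTPResponse. There are a few ways to work with it, but most of them involve getting the returned data as a string. The good news is that you can still use any (well, most) of the traditional JS String methods documented here, so you can use search(), match(), and so on. Depending on your project, you can use these methods to find what you need in the response.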
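As a rough illustration of the string-method approach, here is a minimal sketch that uses match() to pull the text out of a div with a given id. The helper name `extractDivTextById` is made up for this example, and it assumes the target div has a quoted id attribute and contains no nested div. A regular expression is not a real HTML parser, so treat this as a quick hack for simple pages.

```javascript
/**
 * Hypothetical helper: return the trimmed text inside the first <div> whose
 * id attribute equals `id`, or null if no such div is found.
 * Assumes the div has no nested <div> inside it.
 */
function extractDivTextById(html, id) {
  // Escape regex metacharacters so the id is matched literally
  var safeId = id.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var pattern = new RegExp(
    '<div\\b[^>]*\\sid\\s*=\\s*["\']' + safeId + '["\'][^>]*>([\\s\\S]*?)</div>');
  var match = html.match(pattern);
  return match ? match[1].trim() : null;
}

// Same sample markup as in the question's example
var sample = '<html><body><div><div><div id="here">hello world!!</div></div></div></body></html>';
var text = extractDivTextById(sample, 'here');   // "hello world!!"
```

In Apps Script, the `html` argument would be the string returned by UrlFetchApp.fetch(url).getContentText().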

score 1 · Accepted Answer

Someone made an example here, in which the following custom functions are available to cut and paste:

getElementById()
getElementsByClassName()
getElementsByTagName()

Then you can do something like this:

function doGet() {
  var html = UrlFetchApp.fetch('http://en.wikipedia.org/wiki/Document_Object_Model').getContentText();
  var doc = XmlService.parse(html);
  // Use a separate name for the root element instead of redeclaring `html`
  var root = doc.getRootElement();
  var menu = getElementsByClassName(root, 'menu-classname')[0];
  return menu;
}
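For readers who can't get at the linked code, a class-name search along these lines might look roughly like the sketch below. This is not the linked implementation: `findElementsByClassName` is a hypothetical stand-in that relies only on the XmlService Element methods getAttribute(name) and getChildren(), and it assumes the class attribute survives parsing.

```javascript
/**
 * Hypothetical stand-in for a getElementsByClassName-style helper.
 * Returns every element at or below `element` (in document order) whose
 * class attribute contains `className` as one of its space-separated names.
 */
function findElementsByClassName(element, className) {
  var matches = [];
  // getAttribute() returns null when the element has no class attribute
  var attribute = element.getAttribute('class');
  if (attribute !== null) {
    var names = attribute.getValue().split(/\s+/);
    if (names.indexOf(className) !== -1) {
      matches.push(element);
    }
  }
  // Recurse into child elements, keeping document order
  var children = element.getChildren();
  for (var i = 0; i < children.length; i++) {
    matches = matches.concat(findElementsByClassName(children[i], className));
  }
  return matches;
}
```

Splitting on whitespace rather than using a substring test keeps a search for 'menu' from matching an element whose class is 'menubar'.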
