xpath - Xidel: How to select only 1 of many identical values/classes and remove unwanted elements from the result?

Question

xidel -se '//strong[@class="n-heading"][1]/text()[1]' 'https://www.anekalogam.co.id/id'

This prints 3 identical outputs:

15 June 2020 
                     
15 June 2020 
                     
15 June 2020

So what should I do to select only one of them?

Edit:

Inside the strong element, the value looks like this:

 15 June 2020 
                    &nbsp;

How do I print only "15 June 2020"?

score 1 · Accepted Answer

Let me illustrate why this happens with the following example.

test.htm:

<html>
  <body>
    <div>
      <span>test1</span>
      <span>test2</span>
      <span>test3</span>
    </div>
    <div>
      <span>test4</span>
    </div>
    <div>
      <span>test5</span>
    </div>
    <div>
      <span>test6</span>
    </div>
  </body>
</html>

xidel -s test.htm -e '//div[1]/span[1]'
test1

xidel -s test.htm -e '//span[1]'
test1
test4
test5
test6

xidel -s test.htm -e '(//span)[1]'
test1
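
The same distinction can be reproduced outside xidel. The following is only a rough Python sketch using the standard library's xml.etree.ElementTree, which supports a limited XPath subset: it accepts a positional predicate such as .//span[1] but not a parenthesized expression, so the (//span)[1] form is emulated here by indexing the full result list. The variable names are illustrative and not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Same structure as test.htm above.
HTML = """<html>
  <body>
    <div>
      <span>test1</span>
      <span>test2</span>
      <span>test3</span>
    </div>
    <div><span>test4</span></div>
    <div><span>test5</span></div>
    <div><span>test6</span></div>
  </body>
</html>"""

root = ET.fromstring(HTML)

# './/span[1]': the predicate is evaluated per parent, so every span
# that is the first span child of its own parent is returned.
per_parent = [e.text for e in root.findall(".//span[1]")]

# Emulating '(//span)[1]': collect all spans first, then take the
# first item of the whole result list.
all_spans = root.findall(".//span")
first_overall = all_spans[0].text

print(per_parent)
print(first_overall)
```

The first result holds one entry per div, while the second is a single value, which mirrors the difference between the two xidel commands above.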

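In other words, //span[1] selects every span that is the first span child of its parent, whereas (//span)[1] selects the first span in the whole document. So you have to put the "strong" step between parentheses: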

xidel -s https://www.anekalogam.co.id/id -e '(//strong[@class="n-heading"])[1]/text()[1]'

You don't need to do this if you target the parent node instead:

xidel -s https://www.anekalogam.co.id/id -e '//p[@class="n-smaller ngc-intro"]/strong/text()[1]'

[Bonus]

You may have noticed that the desired text node spans 2 lines and ends with a &nbsp; (no-break space). To have xidel return only "15 June 2020":

xidel -s https://www.anekalogam.co.id/id -e '//p[@class="n-smaller ngc-intro"]/strong/normalize-space(substring-before(text(),x:cps(160)))'

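- x:cps() is a shorthand for codepoints-to-string() (and string-to-codepoints()), and 160 is the codepoint for a "No-Break Space".
- text()[1] isn't needed, because whenever you feed a sequence to a filter that expects a string, only the first item of that sequence is used.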
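
To make the string handling concrete, here is a small Python sketch of what normalize-space(substring-before(text(), x:cps(160))) does to the value shown in the question's edit. The helper names substring_before and normalize_space are hypothetical stand-ins written for this illustration, and the sample string is an approximation of the page's text node, not fetched from the site.

```python
import re

# chr(160) is the no-break space, the same code point passed to x:cps(160).
NBSP = chr(160)

def substring_before(s, sep):
    """Like XPath substring-before(): the part of s before the first sep,
    or an empty string when sep does not occur."""
    head, found, _ = s.partition(sep)
    return head if found else ""

def normalize_space(s):
    """Like XPath normalize-space(): trim the ends and collapse runs of
    space, tab, CR and LF into one space. A plain str.split() is avoided
    because Python also treats U+00A0 as whitespace, which XPath does not."""
    return " ".join(re.split(r"[ \t\r\n]+", s.strip(" \t\r\n")))

# Approximation of the text node: a date, a line break, padding, then &nbsp;.
raw = " 15 June 2020 \n                    " + NBSP

result = normalize_space(substring_before(raw, NBSP))
print(result)
```

Cutting at the no-break space first matters, because normalize-space() on its own would leave that character in place at the end of the string.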
