This XPath 1.0 expression selects exactly the wanted nodes:
/*/span[.='Heading4']
/following-sibling::text()
[count(.|/*/span[.='Heading5']/preceding-sibling::text())
=
count(/*/span[.='Heading5']/preceding-sibling::text())
]
[normalize-space()]
It is derived from the well-known Kayessian method for the intersection of two node-sets $ns1 and $ns2:
$ns1[count(.|$ns2) = count($ns2)]
We obtain the first expression above if, in the Kayessian formula, we substitute $ns1 with:
/*/span[.='Heading4']/following-sibling::text()
and $ns2 with:
/*/span[.='Heading5']/preceding-sibling::text()
The final predicate [normalize-space()] filters out the whitespace-only text nodes from this intersection.
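The count-based membership test at the heart of the Kayessian formula can also be illustrated outside XPath. The following is a minimal Python sketch, not part of the original answer: the function name `kayessian_intersection` is hypothetical, and plain strings stand in for nodes under the assumption that each one is unique, as the nodes of a node-set are.

```python
def kayessian_intersection(ns1, ns2):
    """Mimic $ns1[count(.|$ns2) = count($ns2)] with Python sets.

    A node n belongs to ns2 exactly when adding it to ns2 does not
    make the union any larger, i.e. |{n} U ns2| == |ns2|.
    """
    set2 = set(ns2)
    # Keep the order of ns1, as XPath keeps document order.
    return [n for n in ns1 if len({n} | set2) == len(set2)]


# Hypothetical stand-ins for nodes: ns1 ~ "everything after Heading4",
# ns2 ~ "everything before Heading5".
print(kayessian_intersection(["a", "b", "c", "d"], ["c", "d", "e"]))
# prints ['c', 'd']
```

The union only stays the same size when the tested node is already a member of the second set, which is why comparing the two counts works as a membership test in XPath 1.0, where no intersection operator exists.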
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:copy-of select=
      "/*/span[.='Heading4']
          /following-sibling::text()
             [count(.|/*/span[.='Heading5']/preceding-sibling::text())
             =
              count(/*/span[.='Heading5']/preceding-sibling::text())
             ]
             [normalize-space()]
      "/>
  </xsl:template>
</xsl:stylesheet>
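When this transformation is applied to the provided XML document (with the entities replaced, because we don't have a DTD available in which they are defined, and this isn't essential here):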
<html>
<span>Heading</span>
<br />
<br />
<span>Heading1</span>
<br /> data#1
<br />
<br />
<span>Heading4</span>
<br /> #acirc;#euro;#cent; data#4.1
<br /> #acirc;#euro;#cent; data#4.2
<br /> #acirc;#euro;#cent; data#4.3
<br /> #acirc;#euro;#cent; data#4.4
<br />
<br />
<span>Heading5</span>
<br /> #acirc;#euro;#cent; data#5.1
<br /> #acirc;#euro;#cent; data#5.2
<br /> #acirc;#euro;#cent; data#5.3
<br />
<br />
</html>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
#acirc;#euro;#cent; data#4.1
#acirc;#euro;#cent; data#4.2
#acirc;#euro;#cent; data#4.3
#acirc;#euro;#cent; data#4.4
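
For readers without an XSLT processor at hand, the same selection can be cross-checked with Python's standard `xml.dom.minidom`. This is only a sketch under stated assumptions, not the method of the answer itself: the embedded XML is a trimmed, hypothetical copy of the document, the helper names (`find_span`, `sibling_text_nodes`, `select_between`) are made up for illustration, the string value of a `span` is approximated by its direct text children, and `normalize-space()` is approximated by stripping XML whitespace.

```python
from xml.dom import minidom

# Trimmed, hypothetical copy of the document above.
XML = """<html>
<span>Heading1</span>
<br /> data#1
<br />
<span>Heading4</span>
<br /> data#4.1
<br /> data#4.2
<br />
<br />
<span>Heading5</span>
<br /> data#5.1
<br />
</html>"""


def find_span(root, text):
    """Roughly /*/span[.=text]: a span child of the top element."""
    for child in root.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "span":
            # Assumption: the span holds only direct text children.
            value = "".join(t.data for t in child.childNodes
                            if t.nodeType == t.TEXT_NODE)
            if value == text:
                return child
    raise ValueError("no span with text %r" % text)


def sibling_text_nodes(node, direction):
    """Text-node siblings, walking 'nextSibling' or 'previousSibling'."""
    found = []
    current = getattr(node, direction)
    while current is not None:
        if current.nodeType == current.TEXT_NODE:
            found.append(current)
        current = getattr(current, direction)
    return found


def select_between(xml, start, end):
    root = minidom.parseString(xml).documentElement
    ns1 = sibling_text_nodes(find_span(root, start), "nextSibling")
    ns2 = sibling_text_nodes(find_span(root, end), "previousSibling")
    ids2 = {id(n) for n in ns2}
    # Kayessian test: n is in ns2 when the union is no larger than ns2.
    common = [n for n in ns1 if len({id(n)} | ids2) == len(ids2)]
    # Like [normalize-space()]: drop whitespace-only text nodes.
    stripped = [n.data.strip(" \t\r\n") for n in common]
    return [s for s in stripped if s]


for value in select_between(XML, "Heading4", "Heading5"):
    print(value)
```

On this trimmed input the loop prints `data#4.1` and `data#4.2`, matching the shape of the XSLT result: only the non-blank text nodes lying between the two headings are kept.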