Is it possible to get only the child nodes one level deep with DOMXPath::query?

For example, if I have a document like this:

<div>
    <span>
        <cite>
        </cite>
    </span>
    <span>
        <cite>
        </cite>
    </span>
</div>

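I want the NodeList to contain only the span elements, not the cite elements.

I should also mention that it won't always be the same elements (div, span, etc.). I need it to work with any type of element.

This is what I tried, but it doesn't seem to work: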

//*[not(ancestor::div)]

2 Answers

If you use

/div/*

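then you get a list of all direct child elements of that element, but each of those children still contains its own children. I don't think you can strip the children's children with XPath alone.

This uses the default axis, which is called child::. That axis returns only the nodes one level below the current node.

* matches every element, but neither attributes nor text() nodes.

You have to spell out the path to your node, and be careful with //node: it is short for /descendant-or-self::node()/child::node, so it returns every node with that name anywhere in the tree.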
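
To make the difference between the child axis and // concrete, here is a minimal sketch using the sample document from the question. The variable names are only illustrative, and the relative query is given an explicit context node so that it does not depend on whatever the default context happens to be.

```php
<?php
// Sample document from the question.
$xml = <<<XML
<div>
    <span>
        <cite>
        </cite>
    </span>
    <span>
        <cite>
        </cite>
    </span>
</div>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Child axis: only the elements one level below <div>.
$children = $xpath->query('/div/*');

// Descendant search: every <cite>, at any depth.
$allCites = $xpath->query('//cite');

// The same child step, relative to an explicit context node.
// This form works for any element, whatever its name.
$firstSpan = $children->item(0);
$grandchildren = $xpath->query('child::*', $firstSpan);

foreach ($children as $node) {
    echo $node->nodeName, "\n"; // prints "span" twice
}
```

The returned span nodes are the live nodes of the document, so each one still has its cite child; the query only controls which nodes end up in the list.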

answered 2010-01-01T00:43:40.720

Your question is a bit under-specified, so there are several ways to interpret it. If you want all direct child elements of the document element (each with all of its sub-elements), then use the following, evaluated with the document node as the context node:

*/*

For your example, this gives you

<span>
    <cite>
    </cite>
</span>

and

<span>
    <cite>
    </cite>
</span>

If you want all child nodes, then use node() instead of *:

*/node()

For your example, this gives you both sub-elements as above, along with the newline/indentation text() nodes.
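
As a rough illustration of the difference between * and node(), the sketch below counts both result sets for a compact version of the example. It uses the absolute paths /*/* and /*/node() so that the result does not depend on the context node, and it assumes the default preserveWhiteSpace setting, which keeps the whitespace between tags as text nodes.

```php
<?php
$doc = new DOMDocument();
// Whitespace between the tags is kept as text nodes by default.
$doc->loadXML("<div>\n  <span><cite/></span>\n  <span><cite/></span>\n</div>");
$xpath = new DOMXPath($doc);

$elements = $xpath->query('/*/*');      // element children only
$nodes    = $xpath->query('/*/node()'); // element and text children

// Collect the node names in document order.
$kinds = [];
foreach ($nodes as $node) {
    $kinds[] = $node->nodeName;
}

echo $elements->length, " elements, ", $nodes->length, " nodes\n"; // 2 elements, 5 nodes
echo implode(', ', $kinds), "\n"; // #text, span, #text, span, #text
```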

If, however, you want to have only the child nodes and not their children as well (i.e. only the span elements, but without their child elements), you must use two expressions:

  1. select the direct child elements via */*
  2. process those child elements and select only their text nodes, not the grandchild elements, via text()

My PHP is a bit rusty, but it should work a bit like this:

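$doc = new DOMDocument;
// set up $doc; here it is loaded with the question's sample document
$doc->loadXML("<div>\n<span>\n<cite>\n</cite>\n</span>\n<span>\n<cite>\n</cite>\n</span>\n</div>");
$xpath = new DOMXPath($doc);

// perform step #1: the child elements of the document element
// (absolute form of */*, so the result does not depend on the context node)
$childElements = $xpath->query('/*/*');

$directChildren = array();
foreach ($childElements as $child) {
  // perform step #2: text nodes directly below this child element
  $textChildren = $xpath->query('text()', $child);
  foreach ($textChildren as $text) {
    $directChildren[] = $text;
  }
}
// now, $directChildren contains the text nodes of all child elements,
// but none of the grandchild elements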
answered 2010-01-01T14:52:25.150