xml - Extract nodes from XML based on a keyword

Question

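I have an XML file like the one shown below and am trying to extract a node based on a keyword. I tried using XPath and xmllint, but I am clearly doing something wrong, so I would appreciate some help.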

XML file

<article>
 <body>
  <section>
    <h1>2 Introduction</h1>
    <region>Intro 1</region>
    <region>Background</region>
  </section>

  <section>
    <h1>2 Task objectives</h1>
    <region>2.1 Primary objectives </region>
    <region>2.</region>
  </section>

  <section>
    <h1>Requirements</h1>
    <region>System Requirements </region>
    <region>Technical Requirements</region>
  </section>

  <section>
    <h1>Design</h1>
    <region>Design methodology </region>
    <region>Design patterns</region>
  </section>
  </body>
</article>

Given this XML and a keyword such as "Task objectives" or "objectives" (case-insensitive), I need to extract the entire node and write it to another XML file:

<section>
    <h1>2 Task objectives</h1>
    <region>2.1 Primary objectives </region>
    <region>2.</region>
</section>

This is what I tried with XPath and xmllint:

 $ xmllint --xpath //body//section//h1[.="Task objectives"] Prior.mod.xml
 XPath error : Invalid predicate
//body//section//h1[.=Task objectives]
                  ^
xmlXPathEval: evaluation failed
XPath evaluation failure

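Can anyone tell me what is wrong with the above and how to fix it? Also, I want to do this from the shell for a whole directory of files. Is xmllint the best choice?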

score 2 · Accepted Answer

The shell removes the double-quote (") characters during command-line parsing, so you need to quote the whole expression, like this:

xmllint --xpath '//body//section//h1[.="Task objectives"]' Prior.mod.xml

Example:

$ xmllint --xpath //body//section//h1[.="Task objectives"] -
<body>
<section>
<h1>Task objectives</h1>
<h1>abcd</h1>
</section>
</body>
^D

results in:

XPath error : Invalid predicate
//body//section//h1[.=Task objectives]
                           ^
xmlXPathEval: evaluation failed
XPath evaluation failure

Note the missing quotes in the expression that xmllint echoes back. Then I tried:

$ xmllint --xpath '//body//section//h1[.="Task objectives"]' -
<body>
<section>
<h1>Task objectives</h1>
<h1>abcd</h1>
</section>
</body>
^D

which produced the output:

<h1>Task objectives</h1>
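
The quote stripping can also be observed without xmllint. The following is a minimal sketch, assuming a POSIX-style shell such as sh or bash; show_args is a hypothetical helper introduced only for this illustration. It prints each argument exactly as a command would receive it.

```shell
# Hypothetical helper: print every argument the shell passes, one per line.
show_args() { printf 'arg: %s\n' "$@"; }

# Unquoted expression: the shell consumes the double quotes before the
# command sees them, so the predicate arrives as [.=Task objectives].
# (The unquoted [ ] are also glob characters; they stay literal here only
# because no matching file exists.)
unquoted=$(show_args //body//section//h1[.="Task objectives"])

# Single-quoted expression: everything between the single quotes is passed
# through unchanged, including the inner double quotes.
quoted=$(show_args '//body//section//h1[.="Task objectives"]')

printf '%s\n' "$unquoted" "$quoted"
```

The first line printed shows the same quote-less predicate that xmllint reports in its error message, and the second shows the intact expression.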

score 0 · Answer

This works with XPath 1.0:

//section[contains(
  translate(h1, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),
  'task objectives')
]
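
To apply the expression above to a directory of files from the shell, as the question asks, something along these lines may work. It is only a sketch under stated assumptions: the function names (build_xpath, extract_sections) and the directory arguments are hypothetical, the keyword is assumed to contain no quote characters, and the installed xmllint is assumed to support --xpath and to exit with a non-zero status when nothing matches.

```shell
# Build the case-insensitive XPath 1.0 expression for a keyword.
# Assumption: the keyword contains no single or double quotes.
build_xpath() {
    kw=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf "//section[contains(translate(h1, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '%s')]" "$kw"
}

# For every *.xml file in in_dir, write the matching <section> nodes to a
# file of the same name in out_dir (both directory names are hypothetical).
extract_sections() {
    keyword=$1 in_dir=$2 out_dir=$3
    xpath=$(build_xpath "$keyword")
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.xml; do
        [ -e "$f" ] || continue          # the glob matched no files
        out="$out_dir/$(basename "$f")"
        # Quote "$xpath" so the shell passes the expression unchanged.
        # On failure or an empty result, remove the empty output file.
        if ! xmllint --xpath "$xpath" "$f" > "$out" 2>/dev/null; then
            rm -f "$out"
        fi
    done
}
```

A call such as extract_sections 'objectives' ./in ./out would then process every file in ./in. If several sections in one file match, the output file holds several top-level elements and is not a well-formed XML document by itself, so it may need to be wrapped in a root element.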
