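I am trying to replace a text node with an element when the document contains a matching term. The query I tried is below, but it fails with the error "Target is not an element, text, attribute, comment or pi".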

Input XML:

<book>
<p>Isn't it lovely here? Very smart. We'll be like three queens when you've finished with us,
    Edie. You doing well then?</p>
<p>
    <name type="person">April De Angelis</name>’ plays include <title type="work">Positive
        Hour</title> (Out of Joint) <title type="work">Playhouse Creatures</title> (<name
        type="org">Sphinx Theatre Company</name>), <title type="work">Hush</title> (<name
        type="org">Royal Court</name>), <title type="work">Soft Vengeance</title>, <title
        type="work">The Life and Times of Fanny Hill</title> (adapted from the <name type="org"
        >John Cleland novel</name>) and <title type="work">Ironmistress</title>. Her work for
    radio includes <title>The Outlander</title> (<name type="org">Radio 5</name>), which won the
        <name type="org">Writers’ Guild Award</name> (<date>1992</date>), and, for opera, <title
        type="work">Flight</title> with composer <name type="person">Jonathan Dove</name> (<name
        type="place">Glyndebourne</name>, <date>1998</date>).</p>
 </book>

Expected output:

<book>
<p>Isn't it lovely here? Very smart. We'll be like three <highlight>queens</highlight> when
    you've finished with us, Edie. You doing well then?</p>
<p>
    <name type="person">April De Angelis</name>’ plays <highlight>include</highlight>
    <title type="work">Positive Hour</title> (Out of Joint) <title type="work">Playhouse
        Creatures</title> (<name type="org">Sphinx Theatre Company</name>), <title type="work"
        >Hush</title> (<name type="org">Royal Court</name>), <title type="work">Soft
        Vengeance</title>, <title type="work">The Life and Times of Fanny Hill</title> (adapted
    from the <name type="org">John Cleland novel</name>) and <title type="work"
        >Ironmistress</title>. Her work for radio includes <title>The Outlander</title> (<name
        type="org">Radio 5</name>), which won the <name type="org">Writers’ Guild Award</name>
        (<date>1992</date>), and, for opera, <title type="work">Flight</title> with composer
        <name type="person">Jonathan Dove</name> (<name type="place">Glyndebourne</name>,
        <date>1998</date>).</p>
</book>

I am using BaseX version 9.5.1. The code is below.

let $body := <indexedterms>
        <content>
            <terms>
                <term>include</term>
                <term>Queens</term>
            </terms>
            <uri>/IEEE/IEEE/test.xml</uri>
        </content>
     </indexedterms>

for $contents in $body/content
let $uri := $contents/uri
let $doc := fn:doc($uri)
for $selectedterm in $contents/terms/term/string()
let $Modifieddoc := copy $c := $doc
                    modify
                       (
                          for $nodes in $c//*//text()[fn:matches(.,$selectedterm)]/parent::*
                          return
                          if($nodes/node()[fn:matches(.,$selectedterm)]/parent::*:highlight)
                          then ()
                          else
                          replace node  $nodes/$selectedterm with <highlight>{$selectedterm}</highlight>
                       )
                   return $c
return                       
db:replace('IEEE',substring-after($uri,'/IEEE'),$Modifieddoc)                

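Previously I used "replace node $nodes/node()[fn:contains(.,$selectedterm)] with <highlight>{$selectedterm}</highlight>" instead of "replace node $nodes/$selectedterm with <highlight>{$selectedterm}</highlight>". That worked, but for terms that share a stem (include, includes) it matched both words, which is incorrect. So I changed the code to "replace node $nodes/$selectedterm with <highlight>{$selectedterm}</highlight>".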

1 Answer

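$nodes/$selectedterm is probably the culprit and most likely not what you want: the variable $selectedterm holds a string value (you bind it with for $selectedterm in $contents/terms/term/string()), so that path expression returns a string rather than a node, and a string is not a valid target for replace node. It would help us understand what you want to achieve if you showed us a sample document that you load with the doc function, together with the update you want BaseX to perform on it, for instance for the two example term values shown in your code snippet.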

The task of identifying search terms in text content and wrapping them can be done nicely in XSLT 2 or 3, which you can run from BaseX if you put Saxon 9.9, 10 or 11 on the class path:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:fn="http://www.w3.org/2005/xpath-functions"
  exclude-result-prefixes="#all"
  expand-text="yes">
  
  <xsl:param name="terms" as="xs:string*" select="'include', 'Queens'"/>

  <xsl:output method="xml" indent="no"/>
  
  <xsl:template match="p//text()">
    <xsl:apply-templates select="analyze-string(., string-join($terms, '|'), 'i')/node()"/>
  </xsl:template>
  
  <xsl:template match="fn:match">
    <highlight>{.}</highlight>
  </xsl:template>
  
  <xsl:template match="fn:non-match">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>

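As the analyze-string function used there also exists in BaseX/XQuery, you should also be able to use XQuery Update on the result of calling that function, i.e. replace the fn:match elements with highlight elements.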
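A minimal sketch of that XQuery Update approach might look as follows. It is not tested against the asker's database: the inline $doc stands in for the document loaded with fn:doc, and the variable names $terms, $regex, $parts and $part are chosen here for illustration only.

```xquery
(: Sketch only: $doc is an inline stand-in for fn:doc($uri). :)
let $terms := ('include', 'Queens')
let $doc   := <book><p>We'll be like three queens when you've finished with us.</p></book>
(: Build one alternation pattern from all terms, as the stylesheet does. :)
let $regex := string-join($terms, '|')
return
  copy $c := $doc
  modify (
    for $text in $c//p//text()
    (: 'i' makes the match case-insensitive, so "Queens" finds "queens". :)
    let $parts := analyze-string($text, $regex, 'i')
    (: Only touch text nodes that actually contain a term. :)
    where $parts/fn:match
    return
      replace node $text with (
        for $part in $parts/*
        return
          if ($part instance of element(fn:match))
          then <highlight>{ string($part) }</highlight>
          else text { string($part) }
      )
  )
  return $c
```

The target of replace node is the text node itself, not a string, which avoids the "Target is not an element, text, attribute, comment or pi" error. Like the stylesheet, this sketch matches a term wherever it occurs, so include is also found inside includes; restricting matches to whole words would need a stricter pattern.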

Answered 2022-03-02T07:43:53.060