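I found that the following code works on a small subset of my data, but I had not realized that the subset contained no samples with more than one comment. When I tried to apply the code to the actual database, where entries can have several comments, I received the error shown under "Result" below.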

Current code:

for $doc in doc('test')
let $results :=
(
  let $pKeywords := ('best clients', 'Very', '20')
  return
    for $kw in $pKeywords
    return
    (
      $doc/set/entry[contains(comment, concat('!', $kw))],
      $doc/set/entry[contains(comment, $kw)]
    )
  [not(position() gt 2)]
)
for $i in (1 to count($results))
return
(
  subsequence($results/comment, $i, 1),
  subsequence($results/buyer, $i, 1)
)

Document:

<set>
  <entry>
    <comment>The client is only 20 years old.  Do not be surprised by his youth.</comment>
    <buyer></buyer>
    <id>1282</id>
    <industry>International Trade; Fish and Game</industry>
  </entry>
  <entry>
    <comment>!On leave in October.</comment>
    <comment>!Planning to make a large purchase before Christmas.</comment>
    <buyer></buyer>
    <id>709</id>
    <industry>Real Estate</industry>
  </entry>
  <entry>
    <comment>Is often !out between 1 and 3 p.m.</comment>
    <buyer></buyer>
    <id>127</id>
    <industry>Virus Software Marketting</industry>
  </entry>
  <entry>
    <comment>Very personable.  One of our best clients.</comment>
    <buyer></buyer>
    <id>14851</id>
    <industry>Administrative support.</industry>
  </entry>
  <entry>
    <comment>!Very difficult to reach, but one of our top buyers.</comment>
    <comment>His wife often answers the phone.  That means he is out of the office.</comment>
    <buyer></buyer>
    <id>1458</id>
    <industry>Construction</industry>
  </entry>
  <entry>
    <comment></comment>
    <buyer></buyer>
    <id>276470</id>
    <industry>Bulk Furniture Sales</industry>
  </entry>
  <entry>
    <comment>A bit of an eccentric.  One of our best clients.</comment>
    <buyer></buyer>
    <id>1506</id>
    <industry>Sports Analysis</industry>
  </entry>
  <entry>
    <comment>Very gullible, so please !be sure she needs what you sell her.  She's one of our best clients.</comment>
    <buyer></buyer>
    <id>1523</id>
    <industry>International Trade</industry>
  </entry>
  <entry>
    <comment>He wants to buy everything, but !he has a tight budget.</comment>
    <comment>!His company may be closing soon.</comment>
    <buyer></buyer>
    <id>1524</id>
    <industry>Public Relations</industry>
  </entry>
</set>

Result:

Stopped at line 9, column 22: [XPTY0004] document-node()(...): function(item()*) as item()* expected, document-node() found.

I ran into a similar error before and was able to fix it, but when I tried to apply the same fix here, it did not work. Example:

  $doc('test')/set/entry[contains(., concat('!', $kw))],
  $doc('test')/set/entry[contains(., $kw)]

This returns the same result.

Walking through the desired result:

The first return should return every entry, along with its children, whose comment child contains any of the three keywords in $pKeywords.

concat('!', $kw) should give priority to comments that contain a !.
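
As a minimal sketch of what concat('!', $kw) actually matches: the comment text below is shortened from entry 1523, and the variable names are only for illustration. It suggests that the prioritized test only succeeds when the ! sits directly in front of the keyword, not when it appears elsewhere in the comment.

```xquery
(: Illustrative data: comment text shortened from entry 1523. :)
let $comment :=
  <comment>Very gullible, so please !be sure she needs what you sell her.</comment>
let $kw := 'Very'
return (
  (: concat() simply joins its arguments into one string: "!Very" :)
  concat('!', $kw),
  (: false, because the "!" is not immediately before "Very" :)
  contains($comment, concat('!', $kw)),
  (: true, because the plain keyword is present :)
  contains($comment, $kw)
)
```

Note that any comment matching the prioritized test also matches the plain test, so the two-branch sequence in the query can contain the same node twice.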

The second return extracts the comment and buyer nodes from the results of the first return.

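As long as every entry has exactly one node named comment, the code runs fine. When an entry has two or more nodes named comment, the code fails and the processor returns the error above: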

Stopped at line 9, column 22: [XPTY0004] document-node()(...): function(item()*) as item()* expected, document-node() found.

-Edit-

Desired result:

<comment>The client is only 20 years old.  Do not be surprised by his youth.</comment>
<buyer/>
<comment>Very personable.  One of our best clients.</comment>
<buyer/>
<comment>!Very difficult to reach, but one of our top buyers.</comment>
<buyer/>
<comment>A bit of an eccentric.  One of our best clients.</comment>
<buyer/>

Clarifying the desired result:

//contains ! and the first keyword, "best clients"; so, the first result should come from this entry.
  <entry>
    <comment>Very gullible, so please !be sure she needs what you sell her.  She's one of our best clients.</comment>
    <buyer></buyer>
    <id>1523</id>
    <industry>International Trade</industry>
  </entry>

//Only one entry contains ! and "best clients".  So, the first result containing "best clients" contains nodes for the second result.
  <entry>
    <comment>Very personable.  One of our best clients.</comment>
    <buyer></buyer>
    <id>14851</id>
    <industry>Administrative support.</industry>
  </entry>

//This contains ! and the second keyword, "Very", but it is a duplicate.  So, ideally its children should not be returned.
  <entry>
    <comment>!Very difficult to reach, but one of our top buyers.</comment>
    <comment>His wife often answers the phone.  That means he is out of the office.</comment>
    <buyer></buyer>
    <id>1458</id>
    <industry>Construction</industry>
  </entry>

//This contains ! and a string, "very" (part of everything).  Nodes from this entry should be returned as the third result.
  <entry>
    <comment>He wants to buy everything, but !he has a tight budget.</comment>
    <comment>!His company may be closing soon.</comment>
    <buyer></buyer>
    <id>1524</id>
    <industry>Public Relations</industry>
  </entry>

//The only entry whose comment child contains the keyword '20'.  There is no '!'-containing comment with 20, so this node is the top and only node whose children should be returned.
  <entry>
    <comment>The client is only 20 years old.  Do not be surprised by his youth.</comment>
    <buyer></buyer>
    <id>1282</id>
    <industry>International Trade; Fish and Game</industry>
  </entry>

-Edit 2-

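The next pass gives a better idea of what I am trying to accomplish, but it contains some obvious syntax errors (for example, I am still exploring how to use arrays, as on line 8). I will update it as I resolve the syntax errors: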

<set>
{
    let $kw := ('best clients', 'Very', '20')
    let $entry := doc('test')/set/entry
    let $priority := '!'

    for $i in (1, count($kw))
    let $priority_result[$i] :=
    (
        for $entries in $entry
        where $entry contains(., $priority) and where $entry contains $kw[$i]
        return subsequence($priority_result[$i], 1, 2)
    )

    if $priority_result[$i] < 2
    for $i in (1, count($kw))
    let $secondary_result[$i] :=
    (
        for $entries in $entry
        where $entry contains $kw[$i] and where $entry not($priority_result) and where $entry not($secondary_result[1..($i-1)])
        return $secondary_result[$i]
    )
    else let $secondary_result[$i] := ''

    for $i in (1, count($kw))
    return
    (
        $primary_result[$i],
        $secondary_result[$i]
    )
}
</set>

And the suggested change, which returns an empty result:

for $doc in doc('test')
let $results :=
(
  let $pKeywords := ('best clients', 'Very', '20')
  return
    for $kw in $pKeywords
    return
    (
      $doc/set/entry/comment[contains(., concat('!', $kw))],
      $doc/set/entry/comment[contains(., $kw)]
    )
  [not(position() gt 2)]
)
for $i in (1 to count($results))
return
(
  subsequence($results/comment, $i, 1),
  subsequence($results/buyer, $i, 1)
)

2 Answers

The error message seems to be complaining about an attempt to call a document-node() as a function.

$doc('test') versus $doc
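
A small sketch of the difference, assuming an XQuery 3.0 processor and using a hypothetical inline document in place of doc('test'): once $doc is bound to a document node, the path continues directly from the variable.

```xquery
(: Hypothetical inline document standing in for doc('test'). :)
let $doc := document {
  <set>
    <entry><comment>!On leave in October.</comment></entry>
  </set>
}
(: $doc already holds the document node, so the path starts from it.
   Writing $doc('test')/set/entry/comment instead is read as a dynamic
   function call on $doc, and a document node is not a function item,
   hence the XPTY0004 type error. :)
return $doc/set/entry/comment
```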


Either that, or contains(...) only works on a single node, not on a node set.

contains(comment, $kw) versus comment/contains(.,$kw)
comment[contains(.,$kw)]
comment[contains(text(),$kw)]
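
To illustrate the second possibility, here is a rough sketch with one entry copied from the sample document. The keyword 'leave' is made up for the example. The predicate form tests each comment on its own, so an entry with several comments is not a problem.

```xquery
(: Entry 709 from the sample document, reduced to the relevant children. :)
let $entry :=
  <entry>
    <comment>!On leave in October.</comment>
    <comment>!Planning to make a large purchase before Christmas.</comment>
    <buyer/>
  </entry>
let $kw := 'leave'  (: hypothetical keyword, for illustration only :)
return (
  (: contains($entry/comment, $kw) would raise XPTY0004 here: the first
     argument must be a single string, and this is a sequence of two nodes. :)

  (: The predicate is evaluated once per comment; only the first matches. :)
  $entry/comment[contains(., $kw)],

  (: exists() turns the filtered sequence into a per-entry yes/no test. :)
  exists($entry/comment[contains(., $kw)])
)
```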


This works for me:

<set>{
    for $entry in doc('test')/set/entry
    let $kw := (
        for $prefix in ('!','')
        for $kw in ('best clients', 'Very', '20')
        where exists($entry/comment[contains(., concat($prefix,$kw))])
        return concat($prefix,$kw)
    )[1]
    where exists($kw)
    order by not(starts-with($kw,'!'))
    return <entry keyword="{$kw}">{
      ( $entry/comment,
        $entry/buyer )
    }</entry>
}</set>

Result (multiple comments per <entry>):

<set>
   <entry keyword="!Very">
      <comment>!Very difficult to reach, but one of our top buyers.</comment>
      <comment>His wife often answers the phone.  That means he is out of the office.</comment>
      <buyer/>
   </entry>
   <entry keyword="20">
      <comment>The client is only 20 years old.  Do not be surprised by his youth.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>Very personable.  One of our best clients.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>A bit of an eccentric.  One of our best clients.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>Very gullible, so please !be sure she needs what you sell her.  She's one of our best clients.</comment>
      <buyer/>
   </entry>
</set>

This gives you a separate entry for each comment:

<set>{
    for $entry in doc('test')/set/entry
    for $comment in $entry/comment
    let $kw := (
        for $prefix in ('!','')
        for $kw in ('best clients', 'Very', '20')
        where exists($comment[contains(., concat($prefix,$kw))])
        return concat($prefix,$kw)
    )[1]
    where exists($kw)
    order by not(starts-with($kw,'!'))
    return <entry keyword="{$kw}">{
      ( $comment,
        $entry/buyer )
    }</entry>
}</set>

Output:

<set>
   <entry keyword="!Very">
      <comment>!Very difficult to reach, but one of our top buyers.</comment>
      <buyer/>
   </entry>
   <entry keyword="20">
      <comment>The client is only 20 years old.  Do not be surprised by his youth.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>Very personable.  One of our best clients.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>A bit of an eccentric.  One of our best clients.</comment>
      <buyer/>
   </entry>
   <entry keyword="best clients">
      <comment>Very gullible, so please !be sure she needs what you sell her.  She's one of our best clients.</comment>
      <buyer/>
   </entry>
</set>
Answered 2012-10-01T05:50:09.150

For reference, this is the code we started with (it is a bit scary, and I still do not fully understand it):

for $doc in doc('test')
let $results :=
(
  let $pKeywords := ('best clients', 'Very', '20')
  return
    for $kw in $pKeywords
    return
    (
      $doc/set/entry[contains(comment, concat('!', $kw))],  (: *1 :)
      $doc/set/entry[contains(comment, $kw)]                (: *1 :)
    )
  [not(position() gt 2)]
)
for $i in (1 to count($results))
return
(
  subsequence($results/comment, $i, 1), (: *2 :)
  subsequence($results/buyer, $i, 1)    (: *2 :)
)

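The version that does not throw an error was fixed in the usual way. It took me a while to find the second error, marked *2. Basically, because the search at *1 now goes one level deeper (down to comment), the paths at *2 have to go one level back up to reach the entry.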

for $doc in doc('test')
let $results :=
(
  let $pKeywords := ('best clients', 'Very', '20')
  return
    for $kw in $pKeywords
    return
    (
      $doc/set/entry/comment[contains(., concat('!', $kw))], (: *1, went deeper :)
      $doc/set/entry/comment[contains(., $kw)]               (: *1, went deeper :)
    )
  [not(position() gt 2)]
)
for $i in (1 to count($results))
return
(
  subsequence($results/../comment, $i, 1), (: *2, added .. :)
  subsequence($results/../buyer, $i, 1)    (: *2, added .. :)
)
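
One caveat with the .. step, shown here as a sketch on made-up data (the buyer values A and B are hypothetical): a path expression returns its nodes in document order with duplicates removed, so $results/../buyer may not follow the order in which $results was assembled. Stepping up from each item individually keeps the sequence order.

```xquery
(: Hypothetical two-entry document. :)
let $set :=
  <set>
    <entry><comment>first</comment><buyer>A</buyer></entry>
    <entry><comment>second</comment><buyer>B</buyer></entry>
  </set>
(: The comments are deliberately listed in reverse document order. :)
let $results := ($set/entry[2]/comment, $set/entry[1]/comment)
return (
  (: Path over the whole sequence: document order, A then B. :)
  $results/../buyer,
  (: One step per item: sequence order is kept, B then A. :)
  for $c in $results return $c/../buyer
)
```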

Issues I am still working on:

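1) The use of concat(). My understanding is that it joins two things together, so for $kw[1] the result would be equivalent to "!best clients". However, the results do not reflect that: the exclamation mark is not always directly in front of the prioritized keyword.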

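2) Not returning duplicate results. I want each entry to be unique. I need to add a routine somewhere that either keeps duplicates out of my result set or removes them before [not(position() gt 2)], which is where the number of results is trimmed.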
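
For the second point, one possible approach (a sketch on made-up data, not a tested fix for the full query) is to drop every node that already appears earlier in the sequence, comparing by node identity with the is operator. This keeps the first occurrence and preserves the priority order, and the trimming predicate can then be applied to the de-duplicated sequence.

```xquery
(: Hypothetical data: two entries, one of which is selected twice. :)
let $set :=
  <set>
    <entry><id>1</id></entry>
    <entry><id>2</id></entry>
  </set>
let $a := $set/entry[1]
let $b := $set/entry[2]
(: $b appears twice, as it would if it matched both '!kw' and 'kw'. :)
let $results := ($b, $a, $b)
let $unique :=
  for $e at $i in $results
  (: keep $e only if no earlier item is the very same node :)
  where not(some $p in subsequence($results, 1, $i - 1) satisfies $p is $e)
  return $e
(: Trim after removing duplicates, as in the original query. :)
return $unique[not(position() gt 2)]
```

With this data the query returns entry 2 followed by entry 1, each exactly once.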

Thanks to everyone who has viewed this and for the effort put in! I am still hoping for a better answer!

Answered 2012-10-02T11:31:52.683