marklogic - MarkLogic: search: suggest phrases

Question

I have the following XML structure:

<root>
<text>Hi i am a test user and doing testing here. Copied text Let’s suppose we have a text field where the user needs to enter the number of a person id. If the user types 1, all ids starting with 1 will show up. If the user types 12, all ids starting with 12 will show up.</text>
</root>

I have created a field on the "text" element and enabled a field word lexicon on it. I then run the following query:

xquery version "1.0-ml"; 
import module namespace search ="http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy"; 
let $options := 
<search:options xmlns="http://marklogic.com/appservices/search">
 <default-suggestion-source>
    <word collation="http://marklogic.com/collation//S2">
      <field name="text"/>
    </word>
 </default-suggestion-source>
</search:options>
return
search:suggest("tes", $options, 100)

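As a result I get "test" and "testing" as suggestions, which is fine, but I also want more of the surrounding text. In the case above I am expecting "test user and doing ..." and "testing here ...". Please help.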

score 1 · Accepted Answer

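To search on a partial phrase, use an opening double quotation mark (the grammar value of <search:quotation>) without a closing one. For example:

search:suggest('"and th', $options)

returns suggestions such as "and that" and "and this". A closing double quote tells the parser that the phrase is complete, so no expansion suggestions are generated. The same syntax works with constraints:

search:suggest('constraint:"and th', $options)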

===== From http://docs.marklogic.com/search:suggest

score 1 · Accepted Answer

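A word lexicon stores individual word tokens, which is why you get back single words rather than phrases. To match inside a phrase, you could use a range index on <text> and wrap each suggestion term with concat('*',$term,'*'), so that your API call looks like search:suggest("*tes*", $options, 100).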
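As a rough sketch of what that range-index variant might look like: the options below swap the word lexicon for a string range index as the suggestion source. It assumes an element range index of type xs:string already exists on <text> (no namespace) with the same collation used in the question. Neither that index nor the behaviour of the wildcard pattern has been verified here.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: a string element range index on <text> with this collation
   has been configured on the database. :)
let $options :=
  <search:options xmlns="http://marklogic.com/appservices/search">
    <default-suggestion-source>
      <range type="xs:string" collation="http://marklogic.com/collation//S2">
        <element ns="" name="text"/>
      </range>
    </default-suggestion-source>
  </search:options>
(: $term stands in for whatever the user has typed so far :)
let $term := "tes"
return
  search:suggest(fn:concat("*", $term, "*"), $options, 100)
```

Each suggestion returned this way would be a whole element value, not a fragment of it.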

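However, I expect the leading wildcard to slow the query down considerably. It will also return the entire value of the element rather than the text starting at the search term, i.e. "Hi i am a test user and doing testing here. Copied text ..." and not "test user and doing ...". You could, of course, trim the value programmatically.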
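One way the trimming might be done, using only standard XQuery string functions. The function name local:from-term is hypothetical, and the match is case-sensitive and only considers the first occurrence of the term; treat it as a minimal sketch, not a complete solution.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: returns the part of $value that starts at the
   first occurrence of $term, or the empty sequence if there is none. :)
declare function local:from-term(
  $value as xs:string,
  $term as xs:string
) as xs:string?
{
  if (fn:contains($value, $term))
  then
    (: length of the text before the match gives the match offset :)
    fn:substring($value, fn:string-length(fn:substring-before($value, $term)) + 1)
  else ()
};

local:from-term("Hi i am a test user and doing testing here.", "tes")
(: returns "test user and doing testing here." :)
```

Applied to each value returned by the wildcard suggestion, this would yield text that begins at the typed term, at the cost of extra work per suggestion.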

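For better performance, consider a chunked element range index strategy. It requires preprocessing and potentially a lot of extra data, depending on the size of the text being chunked, but it gives you the result you want and is very fast and scalable. Avalon Consulting has a great blog post that describes how to do this in detail.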
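The preprocessing step could look something like the sketch below, which is one possible reading of "chunked" and is not taken from the blog post. It splits the text on whitespace and emits one short phrase starting at every word. The <chunk> element name, the local:chunks function, and the chunk size of 4 words are all assumptions made for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: for every word position, emit a <chunk> holding
   up to $size consecutive words starting at that position. :)
declare function local:chunks(
  $text as xs:string,
  $size as xs:integer
) as element(chunk)*
{
  let $words := fn:tokenize(fn:normalize-space($text), " ")
  for $i in 1 to fn:count($words)
  return
    <chunk>{ fn:string-join(fn:subsequence($words, $i, $size), " ") }</chunk>
};

local:chunks("Hi i am a test user and doing testing here.", 4)
```

If the resulting <chunk> elements were stored with the document and given a string range index, a suggestion for "tes" could match values that start with "test user and ..." without a leading wildcard. The cost is roughly one stored chunk per word of source text.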
