
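I am running cts:search with the "unfiltered" option and with wildcard searching enabled (that is, passing "wildcarded").

I have inserted the 5 XML documents pasted below into my database.

In the cts:query below, if the value for the journalTitle element contains a wildcard (*), the search returns all 5 documents.

For example: "d*", "di*", "dixi*"

Even if I pass "mohi*t" as the value for the journalTitle element, I get all five documents in the result.

With the "filtered" option, it works correctly.

Why does this happen, and how can I correct it for the "unfiltered" option?

I searched Google extensively for this but did not find a solution.

Please find the cts:search query and the XML files below.

cts:query: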

cts:search(fn:collection(), cts:element-query(
        xs:QName("root"), 
        cts:and-query(
          (
            cts:element-value-query(xs:QName("sourceType"), "JA", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("journalTitle"), "mohi*t", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("title"), "title1", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("volume"), "volume0", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1)
          ), 
          ()
        ), 
        ()
       ),"unfiltered")

XML content (all five XML documents pasted):

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Dinesh</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Dixit</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Prashant</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>GAYARI</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>KEVAL</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

You may need the xdmp:plan result, so I have pasted it below.

xdmp:plan result:

<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
    <qry:info-trace>xdmp:eval("xdmp:plan(cts:search(fn:collection(), cts:element-query(&amp;#10;   ...", (), &lt;options xmlns="xdmp:eval"&gt;&lt;database&gt;12874763000056740838&lt;/database&gt;&lt;root&gt;C:\RSuite\modules...&lt;/options&gt;)</qry:info-trace>
    <qry:info-trace>Analyzing path for search: fn:collection()</qry:info-trace>
    <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
    <qry:info-trace>Path is fully searchable.</qry:info-trace>
    <qry:info-trace>Gathering constraints.</qry:info-trace>
    <qry:info-trace>Search query contributed 1 constraint: cts:element-query(fn:QName("", "root"), cts:and-query((cts:element-value-query(fn:QName("", "sourceType"), "JA", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "journalTitle"), "mohi*t", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "title"), "title1", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "volume"), "volume0", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1)), ()), ())</qry:info-trace>
    <qry:partial-plan>
        <qry:or-two-queries>
            <qry:element-query>
                <qry:key>10866465315185201428</qry:key>
                <qry:annotation>element(root)</qry:annotation>
                <qry:and-query>
                    <qry:term-query weight="1">
                        <qry:key>15329831187071590131</qry:key>
                        <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="0">
                        <qry:key>3029765743981997321</qry:key>
                        <qry:annotation>element(journalTitle)</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>4206353216190327061</qry:key>
                        <qry:annotation>element(title,value("title1"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>7729558342335907080</qry:key>
                        <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                    </qry:term-query>
                </qry:and-query>
            </qry:element-query>
            <qry:and-two-queries>
                <qry:term-query weight="0">
                    <qry:key>837267169796541076</qry:key>
                    <qry:annotation>link-child(descendant(element(root)))</qry:annotation>
                </qry:term-query>
                <qry:and-query>
                    <qry:term-query weight="1">
                        <qry:key>15329831187071590131</qry:key>
                        <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="0">
                        <qry:key>3029765743981997321</qry:key>
                        <qry:annotation>element(journalTitle)</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>4206353216190327061</qry:key>
                        <qry:annotation>element(title,value("title1"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>7729558342335907080</qry:key>
                        <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                    </qry:term-query>
                </qry:and-query>
            </qry:and-two-queries>
        </qry:or-two-queries>
    </qry:partial-plan>
    <qry:info-trace>Executing search.</qry:info-trace>
    <qry:final-plan>
        <qry:and-query>
            <qry:or-two-queries>
                <qry:element-query>
                    <qry:key>10866465315185201428</qry:key>
                    <qry:annotation>element(root)</qry:annotation>
                    <qry:and-query>
                        <qry:term-query weight="1">
                            <qry:key>15329831187071590131</qry:key>
                            <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="0">
                            <qry:key>3029765743981997321</qry:key>
                            <qry:annotation>element(journalTitle)</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>4206353216190327061</qry:key>
                            <qry:annotation>element(title,value("title1"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>7729558342335907080</qry:key>
                            <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                        </qry:term-query>
                    </qry:and-query>
                </qry:element-query>
                <qry:and-two-queries>
                    <qry:term-query weight="0">
                        <qry:key>837267169796541076</qry:key>
                        <qry:annotation>link-child(descendant(element(root)))</qry:annotation>
                    </qry:term-query>
                    <qry:and-query>
                        <qry:term-query weight="1">
                            <qry:key>15329831187071590131</qry:key>
                            <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="0">
                            <qry:key>3029765743981997321</qry:key>
                            <qry:annotation>element(journalTitle)</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>4206353216190327061</qry:key>
                            <qry:annotation>element(title,value("title1"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>7729558342335907080</qry:key>
                            <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                        </qry:term-query>
                    </qry:and-query>
                </qry:and-two-queries>
            </qry:or-two-queries>
        </qry:and-query>
    </qry:final-plan>
    <qry:info-trace>Selected 5 fragments</qry:info-trace>
    <qry:result estimate="5"/>
</qry:query-plan>

Please excuse any grammatical errors.

Let me know if you need more information.

2 Answers

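A better idea is to enable a word lexicon with the codepoint collation together with three character searches. The one-character and two-character indexes are very expensive.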
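
As a minimal sketch of that setup, the Admin API can turn on three character searches and add a codepoint word lexicon. The database name "Documents" is a placeholder, not something from the question, and the script assumes the lexicon has not already been added (adding a duplicate is expected to raise an error). Existing documents are only covered once reindexing has finished.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder; substitute your own database name. :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()

(: Enable the three-character wildcard index. :)
let $config :=
  admin:database-set-three-character-searches($config, $db, fn:true())

(: Add a word lexicon that uses the codepoint collation. :)
let $lexicon :=
  admin:database-word-lexicon("http://marklogic.com/collation/codepoint")
let $config := admin:database-add-word-lexicon($config, $db, $lexicon)

(: Nothing changes on the server until the configuration is saved. :)
return admin:save-configuration($config)
```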

Answered 2017-02-10T15:24:47.237

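Wildcard searches depend on either the proper indexes or filtering. Have you checked whether "fast element trailing wildcard searches", and perhaps also "trailing wildcard searches", are enabled on your database? Those work for patterns with at least 4 leading characters. For three leading characters you also need to enable "fast element character searches", and perhaps "three character searches".

MarkLogic also allows accurate unfiltered wildcard searches for patterns that start with only two characters or one character. One way is to enable the "two character searches" and "one character searches" options, but according to the documentation they are not needed if you enable three character searches in combination with a word lexicon:

two character searches: Specifies whether indexes should be created to enable wildcard searches where the search pattern contains two consecutive non-wildcard characters (for example, ab*). This index is not needed if you have three character searches and a word lexicon.

one character searches: Specifies whether indexes should be created to enable wildcard searches where the search pattern contains a single non-wildcard character (for example, a*). This index is not needed if you have three character searches and a word lexicon.

(Source: Admin UI Help tab)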
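
One way to check whether the indexes resolve a pattern without filtering is to compare the unfiltered estimate with the filtered count. The sketch below reuses the journalTitle pattern from the question but simplifies the query to that single element, which is an assumption made for illustration; the wrapper element names in the result are hypothetical.

```xquery
xquery version "1.0-ml";

(: Simplified version of the journalTitle constraint from the question. :)
let $query :=
  cts:element-value-query(xs:QName("journalTitle"), "mohi*t",
    ("case-insensitive", "wildcarded"))
return
  <check>
    {(: Resolved from the indexes only, so it may include false positives. :)}
    <unfiltered-estimate>{
      xdmp:estimate(cts:search(fn:collection(), $query))
    }</unfiltered-estimate>
    {(: Each candidate document is verified against the query. :)}
    <filtered-count>{
      fn:count(cts:search(fn:collection(), $query, "filtered"))
    }</filtered-count>
  </check>
```

If the two numbers differ, the index resolution is still broader than the pattern, which matches the element(journalTitle) term with weight 0 that appears in the xdmp:plan output in the question.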

Thanks to Dave for pointing to "Understanding the Wildcard Indexes", which explains all of this in detail.

HTH!

Answered 2017-02-09T16:35:14.840