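We have a Solr collection in Retrieve and Rank that contains a field named document_sub_type. The field is indexed in the Solr schema but has no field type value (I understand that fields intended for use by the ranker must have the field type "Watson_text_en"; this field does not). We want to filter results on this document_sub_type metadata field.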

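If I send the query power systems client reference AND (document_sub_type:"Client Reference*" OR document_sub_type:"Case Study*") to R&R's /select endpoint, I get back only documents whose document_sub_type value is "Client Reference Book" or "Client Reference Brief", as expected. However, if I send the same query to the /fcselect endpoint, the returned documents have document_sub_type values that can apparently be anything.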

I admit that our ranker is not fully trained, but this happens even when we omit the ranker from the query.

Why is /fcselect ignoring the metadata part of the query?

Here are the full response bodies for the two queries:

From /select:

{
  "responseHeader": {
    "status": 0,
    "QTime": 2,
    "params": {
      "q": "power systems client reference AND (document_sub_type:\"Client Reference*\" OR document_sub_type:\"Case Study*\")",
      "fl": "document_sub_type",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 89,
    "start": 0,
    "docs": [
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      }
    ]
  }
}

From /fcselect:

{
  "responseHeader": {
    "status": 0,
    "QTime": 65,
    "params": {
      "q": "power systems client reference AND (document_sub_type:\"Client Reference*\" OR document_sub_type:\"Case Study*\")",
      "ranker_id": "c852c8x19-rank-422",
      "fl": "document_sub_type",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 39428,
    "start": 0,
    "maxScore": 10,
    "docs": [
      {
        "document_sub_type": "Sales guidance"
      },
      {
        "document_sub_type": "Other sales tool or Utility"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "At a Glance"
      },
      {
        "document_sub_type": "Brief or Template for Marketing"
      },
      {
        "document_sub_type": "text/plain"
      },
      {
        "document_sub_type": "Brief or Template for Marketing"
      },
      {
        "document_sub_type": "QRG"
      }
    ]
  }
}

1 Answer

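The /fcselect endpoint does not support combining terms with Boolean operators in the query parameter itself. For this type of operation, you should be able to use a filter query to get the results you expect. For details, see the documentation here: https://www.ibm.com/watson/developercloud/doc/retrieve-rank/plugin_query_syntax.shtml#top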
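As a minimal sketch of what that could look like, the snippet below keeps only the free-text question in q and moves the document_sub_type restriction into Solr's standard fq (filter query) parameter. The helper name build_fcselect_params is hypothetical, the filter expression and ranker ID are copied from the question, and the snippet only builds the query string rather than calling the service, so whether the filter matches as intended still depends on how the field is indexed.

```python
from urllib.parse import urlencode


def build_fcselect_params(question, filter_query, ranker_id, fields):
    """Hypothetical helper: plain-text question in q, metadata filter in fq."""
    return {
        "q": question,         # natural-language text only, no Boolean operators
        "fq": filter_query,    # standard Solr filter query parameter
        "ranker_id": ranker_id,
        "fl": fields,
        "wt": "json",
    }


params = build_fcselect_params(
    question="power systems client reference",
    # Filter expression taken from the question as written.
    filter_query='document_sub_type:"Client Reference*" OR document_sub_type:"Case Study*"',
    ranker_id="c852c8x19-rank-422",
    fields="document_sub_type",
)

# URL-encode the parameters and append them to the collection's /fcselect path.
query_string = urlencode(params)
print("/fcselect?" + query_string)
```

The printed path would be appended to the base URL of your own Solr cluster and collection, which is left out here because it is specific to each service instance.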

Answered 2016-09-16T12:24:53.053