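I want to build Solr queries from the entity semantics produced by NLU. How can I benefit from the NLU result even when some entities do not exist as fields in the Solr database? I tried filtering the results, but I get wrong results, because a filter can only be applied to fields that exist in the Solr database. Example of the Solr fields: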
{ "id":"07Ce2HEBB-PtfltbxUlv", "nom":["Microsoft Surface Pro 4 Type Cover with Fingerprint ID"], "categorie":["Accessoires pour ordinateurs"], "image":["No image"], "marque":["Microsoft"], "_version_":1665642569640443904}
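To make the constraint concrete: only entity names that match a field of the indexed documents can be used directly in a filter. The sketch below derives the candidate field names from the example document and checks a list of entity names against them. The helper names `filterable_fields` and `known_entities` are hypothetical, the entity names in the usage lines are examples, and a single document only reveals the fields it happens to contain, so the real schema may define more.

```python
import json

# Example document copied from above (version key written as Solr's `_version_`).
raw_doc = '''{"id": "07Ce2HEBB-PtfltbxUlv",
 "nom": ["Microsoft Surface Pro 4 Type Cover with Fingerprint ID"],
 "categorie": ["Accessoires pour ordinateurs"],
 "image": ["No image"],
 "marque": ["Microsoft"],
 "_version_": 1665642569640443904}'''

# Keys that identify the document rather than describe the product.
NON_FILTER_KEYS = {"id", "_version_"}


def filterable_fields(doc):
    """Return the field names of a Solr document that can serve as filter targets."""
    return set(doc) - NON_FILTER_KEYS


def known_entities(entity_names, fields):
    """Keep only the entity names that match a Solr field, preserving order."""
    return [name for name in entity_names if name in fields]


fields = filterable_fields(json.loads(raw_doc))
print(sorted(fields))  # ['categorie', 'image', 'marque', 'nom']
print(known_entities(["type", "marque", "model", "couleur"], fields))  # ['marque']
```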
This is an example of the NLU output:
{
  "intent": {
    "name": "Haut-parleurs",
    "confidence": 0.9998957514762878
  },
  "entities": [
    {
      "entity": "type",
      "start": 13,
      "end": 21,
      "extractor": "DIETClassifier",
      "value": "sans fil"
    },
    {
      "entity": "marque",
      "start": 22,
      "end": 26,
      "extractor": "DIETClassifier",
      "value": "Sony"
    },
    {
      "entity": "model",
      "start": 27,
      "end": 37,
      "extractor": "DIETClassifier",
      "value": "SRSHG1/BLK"
    },
    {
      "entity": "couleur",
      "start": 47,
      "end": 51,
      "extractor": "DIETClassifier",
      "value": "noir",
      "processors": [
        "EntitySynonymMapper"
      ]
    }
  ],
  "intent_ranking": [
    {
      "name": "Haut-parleurs",
      "confidence": 0.9998957514762878
    },
    {
      "name": "greet",
      "confidence": 9.423414303455502e-05
    },
    {
      "name": "Casques Bluetooth",
      "confidence": 9.48187880567275e-06
    },
    {
      "name": "Boitier lecteur multimedia",
      "confidence": 4.859907676291186e-07
    },
    {
      "name": "Bonbons",
      "confidence": 1.837529062242993e-08
    }
  ],
  "response_selector": {
    "default": {
      "response": {
        "name": null,
        "confidence": 0.0
      },
      "ranking": [],
      "full_retrieval_intent": null
    }
  },
  "text": "Haut-parleur sans fil Sony SRSHG1/BLK Hi-Res - Noir anthracite"
}
"text": "Haut-parleur sans fil Sony SRSHG1/BLK Hi-Res - Noir anthracite" is the customer query: a search for 'haut-parleur', 'sans fil', 'sony', 'SRSHG1/BLK', and the colour 'noir' (black). This is what I am trying:
import glob
import json
import os

import pandas as pd
import pysolr  # `solr` is assumed to be a pysolr.Solr client created beforehand (connection URL not shown)

folder_path = 'D:/nlu/'
for filename in glob.glob(os.path.join(folder_path, '*.json')):
    with open(filename, 'r') as f:
        json_files = json.load(f)
    text = json_files['text']
    # Assumed definitions: the NLU entity list and the names of the entities it contains.
    variable = json_files['entities']
    entities = [e['entity'] for e in variable]
    for i in variable:
        if ("marque" in entities) and ("categorie" in entities) and ("nom" in entities) and ("image" in entities):
            # All four Solr fields were recognised: filter on the current entity.
            if i['entity'] == "marque":
                results = solr.search(q="*:*", fq=["marque:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "nom":
                results = solr.search(q="*:*", fq=["nom:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "categorie":
                results = solr.search(q="*:*", fq=["categorie:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "image":
                results = solr.search(q="*:*", fq=["image:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            else:
                print("*******")
        elif ("marque" in entities) or ("categorie" in entities) or ("nom" in entities):
            # Only some Solr fields were recognised: query on the current entity.
            if i['entity'] == "marque" and i['entity'] != "nom" and i['entity'] != "categorie" and i['entity'] != "image":
                results = solr.search(q=["marque:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "nom" and i['entity'] != "marque" and i['entity'] != "categorie" and i['entity'] != "image":
                results = solr.search(q=["nom:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "categorie" and i['entity'] != "nom" and i['entity'] != "marque" and i['entity'] != "image":
                results = solr.search(q=["categorie:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            elif i['entity'] == "image" and i['entity'] != "nom" and i['entity'] != "categorie" and i['entity'] != "marque":
                results = solr.search(q=["image:" + i['value']])
                docs = pd.DataFrame(results.docs)
                print(docs)
            else:
                # The entity is not a Solr field: fall back to the full query text.
                results = solr.search(q="_text_:" + json_files['text'])
                docs = pd.DataFrame(results.docs)
                print(docs)
        else:
            # No entity matches a Solr field: search the full query text.
            results = solr.search(q="_text_:" + json_files['text'])
            docs = pd.DataFrame(results.docs)
            print(docs)
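The loop above concatenates raw entity values and the raw query text into Solr query strings. In the standard query parser, characters such as `/`, `-` and `:` are reserved, and in `field:word1 word2` only `word1` is searched in `field` while the remaining words go to the default field. Whether this explains the empty results here is not confirmed. The following is a minimal sketch of escaping a value and grouping it under one field; `escape_solr_term` and `field_clause` are hypothetical helper names, not part of pysolr.

```python
import re

# Characters and operators with special meaning in the standard (Lucene) query parser.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_solr_term(text):
    """Prefix every reserved character or operator with a backslash."""
    return _SPECIAL.sub(r'\\\1', text)


def field_clause(field, value):
    """Build field:(...) so that every word of the value is searched in that field."""
    return f"{field}:({escape_solr_term(value)})"


print(escape_solr_term("SRSHG1/BLK"))
print(field_clause("_text_", "Haut-parleur sans fil"))
```

The grouped clause can be passed as the `q` argument of `solr.search` in place of the plain string concatenation. Matching still depends on how the target field is analysed and on whether a catch-all field such as `_text_` is actually populated in the schema.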
I want to use pysolr to filter the search by fields that are not stored or indexed in the Solr database. The output of my code is:
Empty DataFrame
Columns: []
Index: []
id nom \
0 4bCe2HEBB-PtfltbxUlv [Haut-parleur sans fil Sony SRSHG1/BLK Hi-Res ...
1 47Ce2HEBB-PtfltbxUlv [Mini-cassettes vidéo numériques Sony - DVC ...
2 5LCe2HEBB-PtfltbxUlv [Haut-parleur sans fil SRS-ZR7]
3 8LCe2HEBB-PtfltbxUlv [Facade autoradio CD marin Sony MEXM100BT 160W...
4 8bCe2HEBB-PtfltbxUlv [Haut-parleur portable sans fil Sony SRSXB30/B...
5 8rCe2HEBB-PtfltbxUlv [Mini-système LBT-GPX555 de Sony avec Bluet...
6 2rCe2HEBB-Ptfltb3Gnu [Sony Nh-Aa-B4gn Rechargeable Ni-MH Battery]
7 3rCe2HEBB-PtfltbwkXR [Sony HT-GT1 2.1 Home Theatre System]
categorie \
0 [haut-parleurs Bluetooth et sans fil]
1 [Accessoires appareil photo]
2 [haut-parleurs Bluetooth et sans fil]
3 [Accessoires électroniques pour voiture]
4 [haut-parleurs Bluetooth et sans fil]
5 [Accessoires audio et vidéo]
6 [Cameras & Accessories]
7 [Home Entertainment]
image marque \
0 [No image] [Sony]
1 [No image] [Sony]
2 [No image] [Sony]
3 [No image] [Sony]
4 [No image] [Sony]
5 [No image] [Sony]
6 [http://img6a.flixcart.com/image/rechargeable-... [Sony]
7 [http://img6a.flixcart.com/image/home-theatre-... [Sony]
_version_
0 1665642569914122240
1 1665642569924608000
2 1665642569927753728
3 1665642569950822400
4 1665642569950822401
5 1665642569953968128
6 1665642572737937420
7 1665642573811679246
Empty DataFrame
Columns: []
Index: []
Empty DataFrame
Columns: []
Index: []
But this seems wrong, because what the customer actually wants is the product described in the NLU output: "Haut-parleur sans fil Sony SRSHG1/BLK Hi-Res - Noir anthracite".
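One possible direction, sketched under assumptions rather than offered as a confirmed fix: run a single search per NLU result instead of one search per entity. Entities whose names match Solr fields become filter queries (`fq`), and the values of the remaining entities become quoted phrases searched in a text field, so they influence ranking instead of being dropped. The function names, the default field list and the choice of `nom` as the text field are assumptions for illustration.

```python
def quote_phrase(value):
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_search_params(nlu_result,
                        solr_fields=("nom", "categorie", "image", "marque"),
                        text_field="nom"):
    """Return (q, fq): filters for known fields, phrases for all other entities."""
    fq, phrases = [], []
    for ent in nlu_result.get("entities", []):
        phrase = quote_phrase(ent["value"])
        if ent["entity"] in solr_fields:
            fq.append(ent["entity"] + ":" + phrase)   # entity is a Solr field
        else:
            phrases.append(phrase)                    # entity has no Solr field
    if phrases:
        q = text_field + ":(" + " ".join(phrases) + ")"
    else:
        q = "*:*"
    return q, fq


nlu_result = {
    "entities": [
        {"entity": "type", "value": "sans fil"},
        {"entity": "marque", "value": "Sony"},
        {"entity": "model", "value": "SRSHG1/BLK"},
        {"entity": "couleur", "value": "noir"},
    ]
}
q, fq = build_search_params(nlu_result)
print(q)   # nom:("sans fil" "SRSHG1/BLK" "noir")
print(fq)  # ['marque:"Sony"']
# With a pysolr client named `solr` (not created here): results = solr.search(q, fq=fq)
```

With the phrases separated by spaces, the parser's default operator decides whether all of them must match; with OR, documents matching more phrases should score higher. Whether the intended product ranks first depends on how the `nom` field is tokenised and lowercased, which is not shown here.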