I am using Haystack for search queries, and unfortunately I get this error when writing documents to the document store. Here is my code:

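# Import paths below assume Haystack 1.x
import pandas as pd
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import DensePassageRetriever

if __name__ == "__main__":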
    document_store = ElasticsearchDocumentStore(
        host='localhost',
        username='', password='',
        index='aurelius'
    )
    df = pd.read_csv('news.csv')
    print(df.columns)
    data_json = [{
        'text': text,
        'meta': {
            'source': 'news'
        }
    } for text in df['Text'].values]
    document_store.write_documents(data_json)
    retriever_elastic = DensePassageRetriever(
        document_store=document_store,
        query_embedding_model='facebook/dpr-question_encoder-single-nq-base',
        passage_embedding_model='facebook/dpr-ctx_encoder-single-nq-base',
        embed_title=True
    )
    document_store.update_embeddings(retriever=retriever_elastic)
    print(retriever_elastic.retrieve("german business confidence slides german business confidence fell in february knocking hopes of a speedy recovery in europe s largest economy. "))
1 Answer

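Based on @UninformedUser's reply.

I assume the line that raises the exception is document_store.write_documents(data_json), because the format of its argument has changed from {'text': str, 'meta': obj} to {'content': str, 'meta': obj}.

So basically you only need to fix the list comprehension part of the code: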

data_json = [{
    'content': text,
    'meta': {
        'source': 'news'
    }
} for text in df['Text'].values]
document_store.write_documents(data_json)
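
If the dictionaries are built elsewhere and still use the old key, the same rename can be applied just before writing. The following is only a minimal sketch in plain Python: to_content_format is a hypothetical helper name, not part of Haystack, and it assumes each document is a flat dict in the old {'text': str, 'meta': obj} shape.

```python
def to_content_format(docs):
    """Return copies of docs with a legacy 'text' key renamed to 'content'.

    Hypothetical helper for illustration; not a Haystack function.
    """
    converted = []
    for doc in docs:
        doc = dict(doc)  # shallow copy, so the caller's dict is not modified
        if 'text' in doc and 'content' not in doc:
            doc['content'] = doc.pop('text')
        converted.append(doc)
    return converted


# Example data in the old format (made up for illustration)
legacy_docs = [{'text': 'some news article', 'meta': {'source': 'news'}}]
new_docs = to_content_format(legacy_docs)
```

The result can then be passed on with document_store.write_documents(new_docs). Documents that already have a 'content' key are left unchanged.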
Answered 2022-01-10T11:55:21.327