json - ElasticSearch JSON file import (Bulk API)

Question

I saw a few similar posts to this here on StackOverflow, but I still don't have a clear understanding of how to index a large file with JSON documents into ElasticSearch; I'm getting errors like the following:

{"error":"ActionRequestValidationException[Validation Failed: 1: index is missing;2: type is missing;]","status":400}

{"took":231,"errors":false,"items":[{"index":{"_index":"test","_type":"type1","_id":"1","_version":7,"status":200}}]}

I have a JSON file that is about 2 GB in size, which is the file I actually want to import. But first, in order to understand how the Bulk API works, I created a small file with just a single line of actual data:

testfile.json

{"index":{"_id":"someId"}} \n
{"id":"testing"}\n

I got this from another post on SO. I understand that the first line is a header, and I also understand that the "index" in the first line is the command which is going to be sent to ES; however, this still does not work. Can someone please give me a working example and clear explanation of how to import a JSON file into ES?

Thank you!

score 1 · Accepted Answer

The following example is from the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html?q=bulk

{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1"} }
{ "doc" : {"field2" : "value2"} }

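So the first line tells Elasticsearch to index the document on the second line into the index test, with type type1 and _id 1. It will index a document containing field1. If all your documents go to the same index and type, you can put those in the URL instead of repeating them on every action line. Check the link for examples.

On the third line you see an example of a delete action. This action does not need a document on the line after it, so the fourth line is already the next action.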
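
To make the action-line/document-line pairing concrete, here is a minimal Python sketch that builds such a body. The helper name build_bulk_body is made up for illustration and is not part of any Elasticsearch client. The sketch assumes an Elasticsearch version that still accepts _type, as in the example above. Leaving out _index and _type only works when the request URL already names them, which would be consistent with the "index is missing; type is missing" error in the question.

```python
import json

def build_bulk_body(docs, index=None, doc_type=None):
    """Return a bulk body: one action line plus one source line per document.

    `docs` is an iterable of (doc_id, source_dict) pairs. When `index` or
    `doc_type` is None, that field is left out of the action line, which is
    only valid if the request URL already names it (e.g. /test/type1/_bulk).
    """
    lines = []
    for doc_id, source in docs:
        meta = {"_id": doc_id}
        if index is not None:
            meta["_index"] = index
        if doc_type is not None:
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(source))           # document line
    # The last line of a bulk body must also end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([("1", {"field1": "value1"})], index="test", doc_type="type1")
print(body, end="")
```

Each line is terminated by a real newline character, not the two literal characters \n typed into the file.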

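Be careful with very large files; 2 GB is probably too big for a single request. The whole request has to be sent to Elasticsearch first, which then loads it into memory, so there is a limit to the number of records you can send at once.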
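
One way to stay under that limit is to split the input into smaller bulk requests. The sketch below assumes the large file holds one JSON document per line, which the question does not confirm. The function name iter_bulk_bodies and the batch size of 1000 are arbitrary choices for illustration, not recommended values.

```python
import json
from itertools import islice

def iter_bulk_bodies(lines, index, doc_type, batch_size=1000):
    """Yield bulk request bodies, each covering at most `batch_size` documents.

    `lines` is any iterable of strings holding one JSON document each.
    An open file works, so the whole file is never held in memory.
    """
    # One action line reused for every document; with no _id given,
    # Elasticsearch assigns an id itself.
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    docs = (line.strip() for line in lines)
    docs = (d for d in docs if d)  # skip blank lines
    while True:
        batch = list(islice(docs, batch_size))
        if not batch:
            return
        yield "".join(f"{action}\n{doc}\n" for doc in batch)

# In-memory stand-in for a file with one document per line.
sample = ['{"id": "a"}\n', '{"id": "b"}\n', '\n', '{"id": "c"}\n']
bodies = list(iter_bulk_bodies(sample, "test", "type1", batch_size=2))
print(len(bodies))
```

With a real file you would pass the open file object instead of sample and POST each yielded body to the _bulk endpoint as a separate request.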
