Elasticsearch bulk data: bulk importing JSON data into ES

Posted by 後開啟撒打發了 on 2017-11-22

1. Bulk API
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.0/docs-bulk.html

The REST API endpoint is /_bulk, and it expects the following newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

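In other words, each operation consists of an action line that is followed, for most actions, by a source line, and every line, including the last one, must end with a newline. The first line names the action and its metadata (index, type, ID); the second line carries the document source. A delete action takes no source line. For example, the following request inserts two documents and deletes one: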

{ "create" : { "_index" : "blog", "_type" : "article", "_id" : "3" }}
{ "title":"title1","posttime":"2016-07-02","content":"內容一" }

{ "create" : { "_index" : "blog", "_type" : "article", "_id" : "4" }}
{ "title":"title2","posttime":"2016-07-03","content":"內容2" }

{ "delete":{"_index" : "blog", "_type" : "article", "_id" : "1" }}
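The example above can also be generated programmatically instead of being typed by hand. The following is a minimal Python sketch, not part of the original post; the helper name build_bulk_body and the (action, metadata, source) tuple layout are assumptions made for illustration. It uses only the standard json module and emits one JSON object per line with a trailing newline.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body.

    `source` is None for actions such as "delete" that take no source line.
    This helper is hypothetical, written only to illustrate the format.
    """
    lines = []
    for action, metadata, source in operations:
        # Action line: a single-line JSON object, never pretty printed.
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:
            # Source line: the document itself, also on a single line.
            lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


operations = [
    ("create", {"_index": "blog", "_type": "article", "_id": "3"},
     {"title": "title1", "posttime": "2016-07-02", "content": "內容一"}),
    ("create", {"_index": "blog", "_type": "article", "_id": "4"},
     {"title": "title2", "posttime": "2016-07-03", "content": "內容2"}),
    ("delete", {"_index": "blog", "_type": "article", "_id": "1"}, None),
]

body = build_bulk_body(operations)
print(body, end="")
```

The resulting string has five lines: two action/source pairs for the create operations and a single action line for the delete.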

The official explanation and example:
Because this format uses literal \n's as delimiters, please be sure that the JSON actions and sources are not pretty printed. Here is an example of a correct sequence of bulk commands:

POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
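Since a pretty-printed or unterminated body is an easy mistake to make, a quick local check before sending can help. The sketch below is an illustration only and is not part of Elasticsearch; the function name check_bulk_body and its messages are hypothetical, and it checks only the line structure described above (one JSON object per line, a source line after index, create and update, none after delete).

```python
import json

# Actions that are followed by a source line; "delete" has none.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}
KNOWN_ACTIONS = ACTIONS_WITH_SOURCE | {"delete"}


def check_bulk_body(body):
    """Return a list of structural problems in an NDJSON bulk body."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body does not end with a newline")
    lines = body.splitlines()
    i = 0
    while i < len(lines):
        try:
            parsed = json.loads(lines[i])
        except ValueError:
            # A pretty-printed object spans several lines, so each
            # individual line fails to parse on its own.
            problems.append(f"line {i + 1}: not a complete JSON value")
            i += 1
            continue
        action = next(iter(parsed), None) if isinstance(parsed, dict) else None
        if action not in KNOWN_ACTIONS:
            problems.append(f"line {i + 1}: expected an action line")
            i += 1
            continue
        i += 1
        if action in ACTIONS_WITH_SOURCE:
            if i >= len(lines):
                problems.append(f"action '{action}' on line {i} has no source line")
                break
            try:
                json.loads(lines[i])
            except ValueError:
                problems.append(f"line {i + 1}: source is not single-line JSON")
            i += 1
    return problems


good = (
    '{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }\n'
    '{ "field1" : "value1" }\n'
    '{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }\n'
    '{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }\n'
    '{ "field1" : "value3" }\n'
    '{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} }\n'
    '{ "doc" : {"field2" : "value2"} }\n'
)
print(check_bulk_body(good))
```

This is only a structural sanity check; it does not validate metadata fields or anything the server itself would reject.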

2. Submitting data stored in a file
The official documentation explains:
If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn’t preserve newlines. Example:

$ cat requests
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"

A concrete example: save the data below in a file named request, then submit it with the curl command that follows:

vim request
{ "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "2" } }
{ "field1" : "value2" }
{ "index" : { "_index" : "test_index", "_type" : "chen", "_id" : "3" } }
{ "field1" : "value3" }

curl -H "Content-Type: application/x-ndjson" -XPOST '192.168.0.153:9200/_bulk' --data-binary @request
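The same submission can be sketched in Python with the standard urllib module. This is an illustrative equivalent of the curl call, not something from the original post: the function names are hypothetical, the host 192.168.0.153:9200 and the file name request are taken from the example, and the Content-Type header follows the official curl example quoted earlier. The file is read as raw bytes so the newlines survive unchanged, which is the role --data-binary plays for curl.

```python
import json
import urllib.request


def read_bulk_file(path):
    # Read raw bytes so that newlines are preserved exactly.
    with open(path, "rb") as f:
        return f.read()


def build_bulk_request(base_url, data):
    """Build (but do not send) a POST request to the _bulk endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_bulk",
        data=data,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def send_bulk(request):
    # Performs the HTTP call; needs a reachable Elasticsearch node.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming the node from the example is reachable:
# req = build_bulk_request("http://192.168.0.153:9200", read_bulk_file("request"))
# result = send_bulk(req)
# print(result["errors"])  # False when every item succeeded
```

Building the request separately from sending it keeps the payload and headers easy to inspect before anything goes over the network.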

Check whether the submission succeeded:

curl -XGET 'http://192.168.0.153:9200/test_index/chen/1?pretty'
{
  "_index" : "test_index",
  "_type" : "chen",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source" : {
    "field1" : "value1"
  }
}

OK, the submission succeeded.
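The manual check above can be automated. The following Python sketch is an assumption-laden illustration rather than the author's method: document_url, is_found and fetch are hypothetical helper names, and the URL layout and response fields (found, _source) are the ones visible in the curl output above.

```python
import json
import urllib.request


def document_url(base_url, index, doc_type, doc_id):
    # Mirrors the URL used in the curl check: /<index>/<type>/<id>
    return f"{base_url.rstrip('/')}/{index}/{doc_type}/{doc_id}"


def is_found(response_text, expected_source=None):
    """Interpret a GET response body like the one shown above."""
    doc = json.loads(response_text)
    if not doc.get("found", False):
        return False
    return expected_source is None or doc.get("_source") == expected_source


def fetch(url):
    # Needs a reachable node. urlopen raises urllib.error.HTTPError
    # for a missing document, because the server answers with 404.
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Usage, assuming the node from the example is reachable:
# url = document_url("http://192.168.0.153:9200", "test_index", "chen", "1")
# print(is_found(fetch(url), {"field1": "value1"}))
```

Comparing _source against the expected document catches the case where an ID exists but holds different data.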
