ES Notes 4: Basic Document CRUD and Bulk Operations

Posted by CrazyZard on 2019-10-13
| Operation | Request | Notes | Body |
| --- | --- | --- | --- |
| Index | PUT my_index/_doc/1 | Creates the document if the ID does not exist; otherwise replaces the existing document and increments the version | {"user":"mike","comment":"You know, for search"} |
| Create | PUT my_index/_create/1 | Fails if the ID already exists | {"user":"mike","comment":"You know, for search"} |
| Create | POST my_index/_doc | No ID specified; one is generated automatically | {"user":"mike","comment":"You know, for search"} |
| Read | GET my_index/_doc/1 | | (no body) |
| Update | POST my_index/_update/1 | The document must exist; only the supplied fields are changed | { "doc": {"user":"mike","comment":"You know Elasticsearch"} } |
| Delete | DELETE my_index/_doc/1 | | (no body) |

Create

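  • Supports two ways of assigning a document ID: auto-generated or explicitly specified
  • Calling "POST /users/_doc"
    • generates the document ID automatically
  • With HTTP PUT /users/_create/1, _create is specified explicitly in the URI; if the ID already exists, the operation fails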

    POST users/_doc
    {
      "firstName": "sunke",
      "lastName" : "Lee" ,
      "tags" : ["guitar","ball"]
    }
    // Result
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "zX7CxG0BUbmjDJcPW5Fs",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "_seq_no" : 1,
      "_primary_term" : 1
    }
    
    PUT users/_create/1
    {
      "firstName": "sunke",
      "lastName" : "Lee" ,
      "tags" : ["guitar","ball"]
    }
    // Result of the first run
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
            "total" : 2,
            "successful" : 1,
            "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    
    // Result of the second run
    {
      "error": {
        "root_cause": [
          {
            "type": "version_conflict_engine_exception",
            "reason": "[1]: version conflict, document already exists (current version [1])",
            "index_uuid": "vgfH8pmURlmodwq8Zp0elw",
            "shard": "0",
            "index": "users"
          }
        ],
        "type": "version_conflict_engine_exception",
        "reason": "[1]: version conflict, document already exists (current version [1])",
        "index_uuid": "vgfH8pmURlmodwq8Zp0elw",
        "shard": "0",
        "index": "users"
      },
      "status": 409
    }

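GET

  • Document found: returns HTTP 200
    • Document metadata
      • _index / _type /
      • Version information: for documents with the same ID, the version number keeps increasing even if the document has been deleted
      • _source contains the full content of the document by default
  • Document not found: returns HTTP 404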
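    GET users/_doc/1
    // Note: the output captured below has the shape of a write response ("result", "_shards"); a successful GET returns "found" and "_source" instead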
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 3,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "_seq_no" : 3,
      "_primary_term" : 1
    }

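Index a document

  • How Index differs from Create: if the document does not exist, a new document is indexed; otherwise the existing document is deleted and the new one is indexed, and the version is incremented by 1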
    PUT users/_doc/1
    {
      "tags" : ["guitar","ball","reading"]
    }
    // Result
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 6, // version
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "_seq_no" : 6,
      "_primary_term" : 1
    }

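Update a document

  • From the caller's point of view, Update does not delete the original document; it merges the supplied fields into the existing one
  • Uses the POST method; the payload must be wrapped in "doc"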
    POST /users/_update/1
    {
      "doc": {
        "albums" : ["Alnum1","Alumb2","Alumb3"]
      }
    }
    // Result of the update
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 8,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "_seq_no" : 8,
      "_primary_term" : 1
    }
    // Read the document back
    GET users/_doc/1
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 8,
      "_seq_no" : 8,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "tags" : [
          "guitar",
          "ball",
          "reading"
        ],
        "albums" : [
          "Alnum1",
          "Alumb2",
          "Alumb3"
        ]
      }
    }
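
To make the difference between Create, Index and Update easier to see, here is a minimal in-memory sketch in Python. It is only a toy model of the behaviour described in these notes, not how Elasticsearch works internally: the class `ToyIndex` and both exception names are made up for illustration, only top-level fields are merged, and details such as no-op updates are ignored.

```python
class ConflictError(Exception):
    """Stands in for the 409 version_conflict_engine_exception."""


class NotFoundError(Exception):
    """Stands in for the 404 returned when the document is missing."""


class ToyIndex:
    """In-memory stand-in for one index (hypothetical, for illustration)."""

    def __init__(self):
        self.docs = {}      # _id -> _source
        self.versions = {}  # _id -> _version

    def _bump(self, doc_id):
        # Every successful write increases the version by one.
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return self.versions[doc_id]

    def index(self, doc_id, source):
        # PUT <index>/_doc/<id>: the whole document is replaced.
        result = "updated" if doc_id in self.docs else "created"
        self.docs[doc_id] = dict(source)
        return {"_version": self._bump(doc_id), "result": result}

    def create(self, doc_id, source):
        # PUT <index>/_create/<id>: refuse to overwrite an existing ID.
        if doc_id in self.docs:
            raise ConflictError(doc_id)
        return self.index(doc_id, source)

    def update(self, doc_id, partial):
        # POST <index>/_update/<id>: merge top-level fields into the document.
        if doc_id not in self.docs:
            raise NotFoundError(doc_id)
        self.docs[doc_id].update(partial)
        return {"_version": self._bump(doc_id), "result": "updated"}


users = ToyIndex()
users.create("1", {"firstName": "sunke", "lastName": "Lee"})
users.index("1", {"tags": ["guitar", "ball", "reading"]})
users.update("1", {"albums": ["Alnum1", "Alumb2"]})
print(users.docs["1"])      # only "tags" and "albums" remain
print(users.versions["1"])  # three writes, so version 3
```

After the index call the name fields are gone because the whole document was replaced, while the update call keeps "tags" and only adds "albums".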

BULK API

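  • Lets a single API call operate on several different indices
  • Supports four types of operations
    • Index
    • Create
    • Update
    • Delete
  • The index can be specified in the URL or in the request payload
  • A failure of one operation does not affect the other operations
  • The response includes the result of every individual operation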
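
The bulk body is newline-delimited JSON: one action line, followed by a source line for every action except delete. As a rough sketch of how a client could assemble such a body, the Python helper below serializes the four operation types listed above. The function name `build_bulk_body` and the tuple layout are assumptions made for this example, not part of any Elasticsearch client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body.

    `source` is ignored for delete, which has no second line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            lines.append(json.dumps(source))
    # The bulk body has to end with a newline character.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("create", {"_index": "test2", "_id": "3"}, {"field1": "value3"}),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
])
print(body, end="")
```

The resulting string would be sent as the body of POST _bulk; four operations produce seven lines because the delete contributes only its action line.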

    POST _bulk
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    { "delete" : { "_index" : "test", "_id" : "2" } }
    { "create" : { "_index" : "test2", "_id" : "3" } }
    { "field1" : "value3" }
    { "update" : {"_index" : "test" , "_id" : "1" } }
    { "doc" : {"field2" : "value2"} }
    
    // Returns the results of the 4 operations
    {
      "took" : 324,
      "errors" : false,
      "items" : [
        {
          "index" : {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1",
            "_version" : 5,
            "result" : "updated",
            "_shards" : {
              "total" : 2,
              "successful" : 2,
              "failed" : 0
            },
            "_seq_no" : 5,
            "_primary_term" : 1,
            "status" : 200
          }
        },
        {
          "delete" : {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "2",
            "_version" : 2,
            "result" : "not_found",
            "_shards" : {
              "total" : 2,
              "successful" : 2,
              "failed" : 0
            },
            "_seq_no" : 6,
            "_primary_term" : 1,
            "status" : 404
          }
        },
        {
          "create" : {
            "_index" : "test2",
            "_type" : "_doc",
            "_id" : "3",
            "_version" : 1,
            "result" : "created",
            "_shards" : {
              "total" : 2,
              "successful" : 1,
              "failed" : 0
            },
            "_seq_no" : 0,
            "_primary_term" : 1,
            "status" : 201
          }
        },
        {
          "update" : {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1",
            "_version" : 6,
            "result" : "updated",
            "_shards" : {
              "total" : 2,
              "successful" : 2,
              "failed" : 0
            },
            "_seq_no" : 7,
            "_primary_term" : 1,
            "status" : 200
          }
        }
      ]
    }
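
Because one failed operation does not abort the rest, the caller has to look at each entry in "items". A failed item carries an "error" object, while the not_found delete above does not, which is why "errors" stays false in that response. The Python sketch below collects the failed items; `failed_items` is a name chosen for this example, and the 409 entry in the sample data is hypothetical, added only to have something to report.

```python
def failed_items(bulk_response):
    """Return (position, action, error) for every item that carries an "error"."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has exactly one key: index, create, update or delete.
        for action, result in item.items():
            if "error" in result:
                failures.append((position, action, result["error"]))
    return failures


# Trimmed items in the shape shown above, plus one hypothetical failed create.
response = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 200}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"create": {"_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
for position, action, error in failed_items(response):
    print(position, action, error["type"])
```

Checking the top-level "errors" flag first is a cheap way to skip this loop when everything succeeded.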

Batch read - mget

  • Batching reduces the overhead of network connections and improves performance
    GET /_mget
    {
        "docs" : [
            {
                "_index" : "test",
                "_id" : "1"
            },
            {
                "_index" : "test",
                "_id" : "2"
            }
        ]
    }
    // Result
    {
      "docs" : [
        {
          "_index" : "test",
          "_type" : "_doc",
          "_id" : "1",
          "_version" : 6,
          "_seq_no" : 7,
          "_primary_term" : 1,
          "found" : true,
          "_source" : {
            "field1" : "value1",
            "field2" : "value2"
          }
        },
        {
          "_index" : "test",
          "_type" : "_doc",
          "_id" : "2",
          "found" : false
        }
      ]
    }
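
As a small illustration of working with mget from client code, the Python sketch below builds the "docs" array from (index, id) pairs and then splits a response into found and missing documents using the "found" flag. Both function names are made up for this example, and the response is a trimmed copy of the one above.

```python
def build_mget_body(pairs):
    """Turn (index, id) pairs into the "docs" array that _mget expects."""
    return {"docs": [{"_index": index, "_id": doc_id} for index, doc_id in pairs]}


def split_mget_response(response):
    """Separate found sources from missing IDs using each entry's "found" flag."""
    found, missing = {}, []
    for doc in response["docs"]:
        key = (doc["_index"], doc["_id"])
        if doc.get("found"):
            found[key] = doc["_source"]
        else:
            missing.append(key)
    return found, missing


request_body = build_mget_body([("test", "1"), ("test", "2")])

# Trimmed copy of the response shown above.
response = {"docs": [
    {"_index": "test", "_id": "1", "found": True,
     "_source": {"field1": "value1", "field2": "value2"}},
    {"_index": "test", "_id": "2", "found": False},
]}
found, missing = split_mget_response(response)
print(found)
print(missing)
```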

Batch search - msearch

  • Runs different searches against different indices
    // I have no test data here, so this snippet was copied rather than typed and run
    POST kibana_sample_data_ecommerce/_msearch
    {}
    {"query" : {"match_all" : {}},"size":1}
    {"index" : "kibana_sample_data_flights"}
    {"query" : {"match_all" : {}},"size":2}
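
The msearch body alternates a header line and a query line, where an empty header means the search runs against the index named in the URL. A possible way to generate the request above from Python is sketched here; `build_msearch_body` is a hypothetical helper, not a client library function.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, query) pairs into the NDJSON body used by _msearch.

    An empty header ({}) falls back to the index given in the URL.
    """
    lines = []
    for header, query in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(query))
    # Like the bulk body, the msearch body ends with a newline.
    return "\n".join(lines) + "\n"


body = build_msearch_body([
    ({}, {"query": {"match_all": {}}, "size": 1}),
    ({"index": "kibana_sample_data_flights"},
     {"query": {"match_all": {}}, "size": 2}),
])
print(body, end="")
```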

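Common error responses

| Problem | Cause |
| --- | --- |
| Unable to connect | Network failure or the cluster is down |
| Connection cannot be closed | Network failure or a node error |
| 429 | The cluster is too busy (retry, or add nodes to increase throughput) |
| 4xx | The request body is malformed |
| 500 | Internal cluster error |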
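
For the 429 case, retrying usually means waiting a little longer after each busy response. The Python sketch below shows one way to do that with exponential backoff. It assumes a hypothetical `send` callable that performs the request and returns the HTTP status code; the delays and attempt count are arbitrary example values.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    `send` is any zero-argument callable returning an HTTP status code.
    The wait doubles after each 429: base_delay, 2 * base_delay, ...
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Still busy after the last attempt: hand the 429 back to the caller.
    return status


# Fake transport for demonstration: busy twice, then accepted.
statuses = iter([429, 429, 200])
waits = []
result = send_with_retry(lambda: next(statuses), sleep=waits.append)
print(result, waits)  # 200 [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the example testable without real waiting; in real code the default `time.sleep` would be used.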

Common questions

  • Q: Don't send too much data in a single request
  • A: Typically around 1,000-5,000 documents, or 5-15 MB, per request
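
To stay inside limits like these, a client can cut its documents into batches before sending them. The Python generator below is a minimal sketch of that idea: it caps each batch by document count and by an estimated byte size. The name `chunk_documents`, the default limits, and the size estimate (length of the JSON encoding, without the action lines) are all assumptions made for this example.

```python
import json


def chunk_documents(docs, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield lists of documents that stay within both limits.

    Size is estimated from the JSON encoding of each document; a single
    document larger than max_bytes is still emitted, alone in its own batch.
    """
    batch, batch_bytes = [], 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


docs = [{"n": i} for i in range(2500)]
batches = list(chunk_documents(docs, max_docs=1000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```

Each batch would then be turned into one bulk request.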

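Personal summary

The index method deletes the old document and then writes the new one, while the update method modifies the existing data; both increase the version number.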
