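1. Problem description
There is an index named index_a whose documents have a single field, title.
Starting from index_a, keep title as it is;
add a len field whose value is the length of title;
add a split_title field whose value is title split on spaces into an array.
2. Preparation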
PUT /index_a/_doc/1
{
  "title": "Thinking in java 4th"
}
3. Create the ingest pipeline
Create a pipeline named pipeline_a:
PUT _ingest/pipeline/pipeline_a
{
  "processors": [
    {
      "script": {
        "source": "ctx.len=ctx.title.length();"
      }
    },
    {
      "set": {
        "field": "split_title",
        "value": ""
      }
    },
    {
      "split": {
        "field": "title",
        "separator": " ",
        "target_field": "split_title"
      }
    }
  ]
}
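The pipeline uses three processors, which run in the order listed:
3.1 script
The script processor adds a len field to each document, set to the length of the title field: ctx.len=ctx.title.length();
3.2 set
The set processor adds a split_title field to each document, initialized to an empty string. This step is optional, because the split processor creates its target_field if it does not already exist.
3.3 split
The split processor splits the title field on the separator " " and writes the resulting array to split_title, replacing the empty string from the previous step.
Processors run one after another, and each one sees the document as modified by the processors before it, so the pipeline produces the expected result.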
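Before reindexing, the stored pipeline can be tried against a sample document with the simulate API. The request below is a minimal sketch, assuming pipeline_a has already been created; the sample document is chosen here for illustration, and nothing is written to any index.

```console
POST _ingest/pipeline/pipeline_a/_simulate
{
  "docs": [
    {
      "_source": {
        "title": "Thinking in java 4th"
      }
    }
  ]
}
```

The response lists each document as it looks after all processors have run, so len and split_title can be checked there.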
4. Reindex using the pipeline
POST _reindex
{
  "source": {
    "index": "index_a"
  },
  "dest": {
    "index": "index_b",
    "op_type": "create",
    "pipeline": "pipeline_a"
  }
}
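Because op_type is create, running the same request a second time hits a version conflict for every document that already exists in index_b, and by default the reindex aborts. The variant below is a sketch, assuming you want a re-run to skip existing documents and carry on; it sets conflicts to proceed, which is not part of the original exercise.

```console
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "index_a"
  },
  "dest": {
    "index": "index_b",
    "op_type": "create",
    "pipeline": "pipeline_a"
  }
}
```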
5. Verify
GET /index_b/_search
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index_b",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "len" : 20,
          "split_title" : [
            "Thinking",
            "in",
            "java",
            "4th"
          ],
          "title" : "Thinking in java 4th"
        }
      }
    ]
  }
}