66_Index Management_Advanced Hands-On Lab: Zero-Downtime Reindexing with scroll + bulk + Index Aliases

Posted by 5765809 on 2024-10-02

Course Outline

1. Reindexing

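The mapping of a field cannot be modified once it has been set. To change a field, create a new index with the new mapping, read the data out of the old index in batches, and write it into the new index with the bulk API.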

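For the batch reads, the scroll API is recommended, with several threads reindexing concurrently: each scroll covers the data for one date range and is handed to a single thread.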
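One way to divide the work among threads is to cut the overall date range into consecutive slices and build one scroll query per slice. The following is only a minimal sketch under assumptions: the documents are assumed to carry a date field (the name `post_date` is hypothetical and does not appear in this lab), and the helper names `date_slices` and `scroll_query_for` are made up for illustration.

```python
from datetime import date, timedelta


def date_slices(start, end, days_per_slice):
    """Split the half-open range [start, end) into consecutive date slices."""
    if days_per_slice <= 0:
        raise ValueError("days_per_slice must be positive")
    step = timedelta(days=days_per_slice)
    slices = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        slices.append((cursor, upper))
        cursor = upper
    return slices


def scroll_query_for(slice_range, date_field="post_date", batch_size=1000):
    """Build the body of the initial scroll search for one date slice.

    date_field is a hypothetical field name; replace it with the real
    date field of the index being rebuilt.
    """
    lower, upper = slice_range
    return {
        "query": {
            "range": {
                date_field: {"gte": lower.isoformat(), "lt": upper.isoformat()}
            }
        },
        "sort": ["_doc"],
        "size": batch_size,
    }
```

Each slice can then be given to its own worker thread, which runs its scroll independently and writes its batches with the bulk API.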

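(1) At first, data is inserted relying on dynamic mapping. Some of the values happen to look like dates, such as 2017-01-01, so the title field is automatically mapped as the date type, although it should really be a string type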

PUT /my_index/my_type/3
{
  "title": "2017-01-03"
}

Checking the mapping shows that title has been mapped as date:

GET /my_index/_mapping/my_type

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "title": {
            "type": "date"
          }
        }
      }
    }
  }
}

(2) Later, when a title value that is a plain string is added to the index, the request fails

PUT /my_index/my_type/4
{
  "title": "my first article"
}

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [title]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [title]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Invalid format: \"my first article\""
    }
  },
  "status": 400
}

(3) At this point, changing the type of title is not possible

PUT /my_index/_mapping/my_type
{
  "properties": {
    "title": {
      "type": "text"
    }
  }
}

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "mapper [title] of different type, current_type [date], merged_type [text]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "mapper [title] of different type, current_type [date], merged_type [text]"
  },
  "status": 400
}

(4) The only option now is to reindex: create a new index, read the data out of the old index, and import it into the new one

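(5) Suppose the old index is named old_index and the new one new_index, and a client Java application is already operating on old_index. Would we have to stop the Java application, change the index it uses to new_index, and then restart it? Doing so takes the Java application down and reduces availability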

(6) So give the Java application an alias that points to the old index. The application works through the goods_index alias, which for now actually points to the old my_index

PUT /my_index/_alias/goods_index

(7) Create a new index with the type of title set to a string type (text)

PUT /my_index_new
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "text"
        }
      }
    }
  }
}

(8) Use the scroll API to read the data out in batches

GET /my_index/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": ["_doc"],
  "size": 1
}

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAADpAFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAA6QRY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAAOkIWNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAADpDFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAA6RBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3",
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "2017-01-02"
        },
        "sort": [
          0
        ]
      }
    ]
  }
}
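
Every later batch is fetched by sending the `_scroll_id` from the previous response to `GET /_search/scroll`. As a rough sketch of that hand-off, the helper below (the name `next_scroll_request` is made up for illustration) takes a parsed scroll response shaped like the one above and returns the body for the next scroll call, or `None` once a batch comes back empty.

```python
def next_scroll_request(response, keep_alive="1m"):
    """Return the body for the follow-up GET /_search/scroll call.

    Returns None when the current batch has no hits, which is the
    signal that the scroll has been exhausted.
    """
    if not response["hits"]["hits"]:
        return None
    # Always use the scroll id from the most recent response.
    return {"scroll": keep_alive, "scroll_id": response["_scroll_id"]}
```

The `keep_alive` value only needs to be long enough to process one batch, since each scroll call renews it.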

(9) Use the bulk API to write the batch returned by scroll into the new index

POST /_bulk
{ "index": { "_index": "my_index_new", "_type": "my_type", "_id": "2" }}
{ "title": "2017-01-02" }
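
A bulk body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The sketch below shows how the hits of one scroll batch could be turned into such a body. The function name `hits_to_bulk_body` is made up for illustration, and the hits are assumed to have the shape shown in the scroll response above.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Convert scroll hits into an NDJSON body for POST /_bulk."""
    lines = []
    for hit in hits:
        # Action line: keep the original type and id, change only the index.
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        # Source line: the document itself, unchanged.
        lines.append(json.dumps(hit["_source"]))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Reusing the original `_id` makes the copy idempotent: re-sending a batch overwrites the same documents instead of creating duplicates.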

(10) Repeat steps 8 and 9, reading batch after batch and writing each batch into the new index with the bulk API
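
The loop over steps 8 and 9 might look like the sketch below. It is deliberately independent of any HTTP client: `send(method, path, body)` is a hypothetical callable, supplied by the caller, that performs the request and returns the parsed JSON response. The function name `reindex` and its parameters are likewise assumptions made for illustration.

```python
import json


def reindex(send, source_index, target_index, batch_size=1000, keep_alive="1m"):
    """Copy every document from source_index to target_index.

    send(method, path, body) is a hypothetical transport function that
    returns the parsed JSON response. Returns the number of documents copied.
    """
    # Step 8: open the scroll and fetch the first batch.
    response = send(
        "GET",
        f"/{source_index}/_search?scroll={keep_alive}",
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": batch_size},
    )
    copied = 0
    while response["hits"]["hits"]:
        hits = response["hits"]["hits"]
        # Step 9: write the batch to the new index in one bulk request.
        lines = []
        for hit in hits:
            lines.append(json.dumps({"index": {
                "_index": target_index, "_type": hit["_type"], "_id": hit["_id"]}}))
            lines.append(json.dumps(hit["_source"]))
        result = send("POST", "/_bulk", "\n".join(lines) + "\n")
        if result.get("errors"):
            raise RuntimeError("bulk request reported item failures")
        copied += len(hits)
        # Step 10: fetch the next batch with the latest scroll id.
        response = send(
            "GET",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
    return copied
```

The loop ends when a scroll call returns an empty batch. Checking the `errors` flag of the bulk response matters because a bulk request can return HTTP 200 even when individual items failed.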

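(11) Switch the goods_index alias over to my_index_new. The Java application then uses the data in the new index directly through the alias, with no need to stop the application: zero downtime, high availability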

POST /_aliases
{
  "actions": [
    { "remove": { "index": "my_index", "alias": "goods_index" }},
    { "add": { "index": "my_index_new", "alias": "goods_index" }}
  ]
}
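
Because the remove and add actions travel in a single `_aliases` request, they are applied atomically, so the alias never points at no index during the switch. If the switch is driven from a script, the request body could be built with a small helper like this sketch (the name `alias_swap_body` is made up for illustration):

```python
def alias_swap_body(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves alias to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

Calling `alias_swap_body("goods_index", "my_index", "my_index_new")` produces the same body as the request above.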

(12) Query directly through the goods_index alias to check that everything works

GET /goods_index/my_type/_search

2. Switching indices transparently to the client with an alias

PUT /my_index_v1/_alias/my_index

The client operates on my_index

Once the reindex operation has finished, switch from v1 to v2

POST /_aliases
{
  "actions": [
    { "remove": { "index": "my_index_v1", "alias": "my_index" }},
    { "add": { "index": "my_index_v2", "alias": "my_index" }}
  ]
}
