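- By default, a query sorts hits by relevance score and returns the first 10 documents
- From and Size are the easiest pagination scheme to understand
- From: the starting offset
- Size: the number of documents to return
- ES is distributed by design: the data a query needs is stored across multiple shards on multiple machines, yet ES still has to return results in a global sort order (by relevance score)
- Take a query with From = 990, Size = 10
- Each shard first fetches its top 1000 documents. The Coordinating Node then merges the results from all shards, sorts them, takes the top 1000, and returns the last 10 of them
- The deeper the page, the more memory is used. To avoid the memory overhead of deep pagination, ES has a setting (index.max_result_window) that by default limits from + size to 10000 documents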
POST tmdb/_search
{
"from": 10000,
"size": 1,
"query": {
"match_all": {}
}
}
// Rejected: from + size = 10001 exceeds the default limit of 10000
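The cost of deep From/Size paging described above can be sketched with a small in-memory simulation. This is not how ES is implemented internally; it only mimics the "each shard returns from + size hits, the coordinating node merges them" behaviour. The function name `from_size_search`, the three shards, and the random scores are all assumptions made for illustration.

```python
import heapq
import random


def from_size_search(shards, from_, size):
    """Simulate from/size paging over several shards.

    shards: list of lists of relevance scores (one list per shard).
    Returns (page, collected): the requested page and the number of
    hits the coordinating node had to collect to build it.
    """
    window = from_ + size
    # Each shard returns its own top (from + size) hits.
    per_shard = [heapq.nlargest(window, shard) for shard in shards]
    collected = sum(len(hits) for hits in per_shard)
    # The coordinating node merges and re-sorts everything,
    # then throws away the first `from_` hits.
    merged = sorted((h for hits in per_shard for h in hits), reverse=True)
    return merged[from_:window], collected


random.seed(0)
# Hypothetical index: 3 shards with 5000 scored documents each.
shards = [[random.random() for _ in range(5000)] for _ in range(3)]

page, collected = from_size_search(shards, from_=990, size=10)
print(len(page), collected)  # 10 3000
```

Only 10 documents are returned, but 3000 had to be collected and sorted, and that number grows with both the page depth and the shard count.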
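- Search After avoids the performance problems of deep pagination and fetches the next page in real time
- The first search must specify a sort, and the sort values must be unique (adding _id to the sort guarantees uniqueness)
- Each subsequent query passes in the sort values of the last document from the previous response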
POST users/_doc
{"name":"user1","age":10}
POST users/_doc
{"name":"user2","age":11}
POST users/_doc
{"name":"user2","age":12}
POST users/_doc
{"name":"user2","age":13}
POST users/_count
POST users/_search
{
"size": 1,
"query": {
"match_all": {}
},
"sort": [
{"age": "desc"} ,
{"_id": "asc"}
]
}
// Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "users",
"_type" : "_doc",
"_id" : "I5aMPW8Bb23XqE-8Pu1n",
"_score" : null,
"_source" : {
"name" : "user2",
"age" : 13
},
"sort" : [
13,
"I5aMPW8Bb23XqE-8Pu1n"
]
}
]
}
}
POST users/_search
{
"size": 1,
"query": {
"match_all": {}
},
"search_after":
[
10,
"H5aMPW8Bb23XqE-8IO1c"
],
"sort": [
{"age": "desc"} ,
{"_id": "asc"}
]
}
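The paging loop behind the requests above can be sketched in plain Python. This is a toy model under stated assumptions, not the ES implementation: `search_after` here is a hypothetical helper that sorts an in-memory list by age descending and _id ascending, and the short ids "a" to "d" stand in for the auto-generated ids a real cluster would return.

```python
def search_after(docs, size, after=None):
    """Return up to `size` hits sorted by age desc, _id asc.

    after: the "sort" values of the last hit from the previous page,
    or None for the first page.
    """
    def sort_key(doc):
        # Negate age so that ascending tuple order means age descending.
        return (-doc["age"], doc["_id"])

    ordered = sorted(docs, key=sort_key)
    if after is not None:
        after_key = (-after[0], after[1])
        # Keep only documents that sort strictly after the cursor.
        ordered = [d for d in ordered if sort_key(d) > after_key]
    return [{"_source": d, "sort": [d["age"], d["_id"]]} for d in ordered[:size]]


# Hypothetical ids; the ages match the documents indexed above.
docs = [
    {"_id": "a", "name": "user1", "age": 10},
    {"_id": "b", "name": "user2", "age": 11},
    {"_id": "c", "name": "user2", "age": 12},
    {"_id": "d", "name": "user2", "age": 13},
]

ages = []
cursor = None
while True:
    hits = search_after(docs, size=1, after=cursor)
    if not hits:
        break
    ages.append(hits[0]["_source"]["age"])
    cursor = hits[-1]["sort"]  # feed the last sort values into the next request

print(ages)  # [13, 12, 11, 10]
```

Because _id is part of the sort, the cursor identifies exactly one position in the ordering, so no document is skipped or repeated even when several documents share the same age.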
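- Assume Size is 10
- When querying documents 990 to 1000
- The unique sort values locate the starting position, so each shard only needs to return 10 documents per request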
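- Scroll creates a snapshot; data written after the snapshot is created cannot be found by the query
- Each subsequent request passes in the Scroll Id returned by the previous one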
DELETE users
POST users/_doc
{"name":"user1","age":10}
POST users/_doc
{"name":"user2","age":20}
POST users/_doc
{"name":"user3","age":30}
POST users/_doc
{"name":"user4","age":40}
POST /users/_search?scroll=5m
{
"size": 1,
"query": {
"match_all" : {
}
}
}
POST /_search/scroll
{
"scroll" : "1m",
"scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAf3oEWQlQzZnpzdzlRdEdIUDFiRndaQU5BZw=="
}
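The snapshot behaviour of Scroll can be illustrated with a toy model. The `ScrollContext` class below is hypothetical and only mimics the visible semantics (data is frozen when the scroll starts, and each call continues from where the last one stopped); a real scroll keeps a search context alive on the cluster rather than copying documents.

```python
import copy


class ScrollContext:
    """Toy stand-in for a scroll: freezes the data when it is created."""

    def __init__(self, docs, size):
        self._snapshot = copy.deepcopy(docs)  # later writes are not visible
        self._size = size
        self._pos = 0

    def next_batch(self):
        """Return the next `size` documents, or [] when exhausted."""
        batch = self._snapshot[self._pos:self._pos + self._size]
        self._pos += len(batch)
        return batch


index = [{"name": "user1", "age": 10}, {"name": "user2", "age": 20}]

scroll = ScrollContext(index, size=1)
first = scroll.next_batch()

# A document written after the scroll was opened.
index.append({"name": "user3", "age": 30})

seen = first + scroll.next_batch() + scroll.next_batch()
print([d["name"] for d in seen])  # ['user1', 'user2']
```

The document written after the scroll was opened never shows up, which is why Scroll suits exporting a consistent full result set but not serving live pages to users.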
- Regular
- Scroll
- Pagination
- From and Size
- If deep pagination is needed, use Search After