Elasticsearch: beware the performance cost of inner hits

Posted by 無風聽海 on 2022-01-06

Original URL: https://www.cnblogs.com/wufengtinghai/p/15773220.html

Elasticsearch

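1. Introduction to inner hits

Elasticsearch provides the nested data type for root/nested document relationships. It solves the problem of nested document fields being split apart and flattened, which makes the fields lose their association with one another as a unit.

Elasticsearch's inner hits feature is for queries that match on nested documents: it gives convenient control over which matching nested documents are returned.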
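To make the flattening problem concrete, here is a minimal Python sketch that mimics (rather than calls) the two indexing behaviours. The field names `ip` and `port` and the sample values are assumptions made up for illustration; they are not taken from the index used in this article.

```python
# Illustrative data only: the field names "ip" and "port" are assumptions.
doc = {
    "content": [
        {"ip": "192.168.1.1", "port": 80},
        {"ip": "10.0.0.1", "port": 443},
    ]
}


def flatten(document, path):
    """Mimic a plain object array: one multi-valued field per key."""
    flat = {}
    for child in document[path]:
        for key, value in child.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat


def flat_match(flat, conditions):
    """Flattened lookup: each condition is checked against its own value list."""
    return all(value in flat.get(field, []) for field, value in conditions.items())


def nested_match(document, path, conditions):
    """Nested lookup: all conditions must hold inside the same child object."""
    return any(
        all(child.get(key) == value for key, value in conditions.items())
        for child in document[path]
    )


flat = flatten(doc, "content")
# The ip and the port below belong to two different children.
cross_flat = flat_match(flat, {"content.ip": "192.168.1.1", "content.port": 443})
cross_nested = nested_match(doc, "content", {"ip": "192.168.1.1", "port": 443})
print(cross_flat, cross_nested)  # True False
```

The flattened form reports a match even though no single child has that ip and port together, which is the loss of association that the nested type avoids.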

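2. Data description

For the data structure and index setup, see the article "elasticsearch支援大table格式資料的搜尋" (on searching large table-format data with Elasticsearch).

3. The problem

A search for a single IP matches only one root document, returns ten nested documents, and highlights them.

Query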

{
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10,
            "highlight": {
              "fields": {
                "*": {}
              },
              "fragment_size": 1000
            }
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The query took 3111 ms. It matched only one document and returned only 10 highlighted nested documents, so it should not take this long.

{
    "took":3111,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
        ]
    }
}

4. Locating the problem

Run the following request, which uses the profile API to see how long the query takes to execute.


{
  "profile": true,
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10,
            "highlight": {
              "fields": {
                "*": {}
              },
              "fragment_size": 1000
            }
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

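The profile section shows that the search itself takes less than 20 ms in total, while took is still 2859 ms, so the query is clearly not what is slow.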

{
    "took":2859,
    "timed_out":false,
    "profile":{
        "shards":[
            {
                "searches":[
                    {
                        "query":[
                            {
                                "type":"BooleanQuery",
                                "time":"9.9ms",
                                "time_in_nanos":9945310,
                                "breakdown":{
                                    "score":9349172,
                                    "build_scorer_count":6,
                                    "match_count":0,
                                    "create_weight":398951,
                                    "next_doc":1262,
                                    "match":0,
                                    "create_weight_count":1,
                                    "next_doc_count":1,
                                    "score_count":1,
                                    "build_scorer":176010,
                                    "advance":19905,
                                    "advance_count":1
                                }
                            }
                        ],
                        "rewrite_time":41647,
                        "collector":[
                            {
                                "name":"CancellableCollector",
                                "reason":"search_cancelled",
                                "time":"9.3ms",
                                "time_in_nanos":9376796,
                                "children":[
                                    {
                                        "name":"SimpleTopScoreDocCollector",
                                        "reason":"search_top_hits",
                                        "time":"9.3ms",
                                        "time_in_nanos":9355874
                                    }
                                ]
                            }
                        ]
                    }
                ],
                "aggregations":[

                ]
            }
        ]
    }
}
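The comparison between the profiled time and `took` can be sketched in a few lines of Python. This is only a rough helper: `query_phase_millis` is a hypothetical name, the response is a trimmed copy of the one above, and the sketch assumes the remaining time is spent somewhere this profile output does not report (for example while fetching documents).

```python
def query_phase_millis(response):
    """Add up the query, rewrite and collector times reported by the profile."""
    total_nanos = 0
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            total_nanos += sum(q["time_in_nanos"] for q in search["query"])
            total_nanos += search.get("rewrite_time", 0)
            total_nanos += sum(c["time_in_nanos"] for c in search["collector"])
    return total_nanos / 1_000_000


# Trimmed copy of the profiled response shown above.
response = {
    "took": 2859,
    "profile": {"shards": [{"searches": [{
        "query": [{"type": "BooleanQuery", "time_in_nanos": 9945310}],
        "rewrite_time": 41647,
        "collector": [{"name": "CancellableCollector", "time_in_nanos": 9376796}],
    }]}]},
}

profiled = query_phase_millis(response)
unaccounted = response["took"] - profiled
print(f"profiled: {profiled:.1f} ms, unaccounted: {unaccounted:.1f} ms")
# profiled: 19.4 ms, unaccounted: 2839.6 ms
```

Almost all of the elapsed time is missing from the profile, which is why the investigation moves on to what is being returned.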

Could highlighting be the problem?

Remove the highlight part from the query and run the following request.

{
  "_source": {
    "excludes": [
      "content"
    ]
  },
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The execution time does not change much.

{
    "took":3117,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                 "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0.001722915,
                            "hits":[
                             ]
                        }
                    }
                }
            }
        ]
    }
}

The only remaining suspect is the documents being returned.

Disable returning the root document's source and run the following request.

{
  "_source": false,
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 10
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

The time still does not change much.

{
    "took":2915,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                 "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0.001722915,
                            "hits":[
                             ]
                        }
                    }
                }
            }
        ]
    }
}

Change the query so that no nested documents are returned, and run the following request.

{
  "_source": false,
  "query": {
    "bool": {
      "should": {
        "nested": {
          "path": "content",
          "query": {
            "query_string": {
              "query": "192.168.1.1*",
              "fields": [
                "content.*"
              ]
            }
          },
          "inner_hits": {
            "from": 0,
            "size": 0
          },
          "score_mode": "avg",
          "ignore_unmapped": true
        }
      }
    }
  },
  "size": 20,
  "timeout": "20s"
}

This time the request completes in 10 ms.

{
    "took":10,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "_type":"_doc",
                "_score":0.001722915,
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":400000,
                            "max_score":0,
                            "hits":[

                            ]
                        }
                    }
                }
            }
        ]
    }
}
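The elimination steps above differ only in a few parameters, so they can be generated from one template. The sketch below is a hypothetical helper (`build_query` is not part of any client library); it only builds the request bodies and pairs them with the `took` values reported in this article, without sending anything to a cluster.

```python
def build_query(inner_size=10, highlight=True, source=None):
    """Build the nested query from this article with a few knobs exposed."""
    inner_hits = {"from": 0, "size": inner_size}
    if highlight:
        inner_hits["highlight"] = {"fields": {"*": {}}, "fragment_size": 1000}
    return {
        # Default matches the original request; pass source=False to disable it.
        "_source": {"excludes": ["content"]} if source is None else source,
        "query": {"bool": {"should": {"nested": {
            "path": "content",
            "query": {"query_string": {
                "query": "192.168.1.1*",
                "fields": ["content.*"],
            }},
            "inner_hits": inner_hits,
            "score_mode": "avg",
            "ignore_unmapped": True,
        }}}},
        "size": 20,
        "timeout": "20s",
    }


variants = {
    "baseline": build_query(),
    "no_highlight": build_query(highlight=False),
    "no_root_source": build_query(highlight=False, source=False),
    "no_inner_hits": build_query(inner_size=0, highlight=False, source=False),
}

# "took" values in milliseconds as reported in the responses above.
observed_took = {
    "baseline": 3111,
    "no_highlight": 3117,
    "no_root_source": 2915,
    "no_inner_hits": 10,
}

for name, body in variants.items():
    inner = body["query"]["bool"]["should"]["nested"]["inner_hits"]
    print(f"{name:<15} inner size={inner['size']:>2} took={observed_took[name]} ms")
```

Only the variant with an inner hits size of 0 is fast, which isolates the cost to returning the nested documents.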

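5. Root cause analysis

The analysis above shows that returning the 10 nested documents is what drives the execution time up. Intuitively, simply returning 10 small documents should not take this long.

Inner hits offer from and size to control how many nested documents are returned, and it is tempting to use them like the same parameters of a normal query. Here, however, size defaults to 3, and from + size must be less than or equal to 100 by default; exceeding that limit produces the following error.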

{
    "type":"illegal_argument_exception",
    "reason":"Inner result window is too large, the inner hit definition's [null]'s from + size must be less than or equal to: [100] but was [101]. This limit can be set by changing the [index.max_inner_result_window] index level setting."
}
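The window rule from the error message is simple enough to check on the client before a request is sent. The following is a minimal sketch; `check_inner_window` is a hypothetical helper, and the limit of 100 is the default quoted in the error message (an index may override it through `index.max_inner_result_window`).

```python
DEFAULT_MAX_INNER_RESULT_WINDOW = 100  # default quoted in the error message


def check_inner_window(from_, size, limit=DEFAULT_MAX_INNER_RESULT_WINDOW):
    """Return from + size, or raise ValueError if it exceeds the window."""
    window = from_ + size
    if window > limit:
        raise ValueError(f"from + size must be <= {limit} but was {window}")
    return window


print(check_inner_window(0, 10))   # 10
print(check_inner_window(90, 10))  # 100, exactly at the limit, still allowed
try:
    check_inner_window(0, 101)
except ValueError as err:
    print(err)  # from + size must be <= 100 but was 101
```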

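The existence of this limit suggests that inner hits are expensive, and that the cost is tied to how the nested type is stored and how inner hits are implemented. The root document and all of its nested documents are stored together in the root document's _source field, so to return a nested document Elasticsearch has to load and parse the root document's _source and then locate the nested document inside it. The responses above show that although only one root document matched, it holds 400,000 nested documents. That many nested documents make the _source very large, which is what makes the execution time balloon.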

The Elasticsearch documentation describes this as follows: Nested documents don’t have a _source field, because the entire source of document is stored with the root document under its _source field. To include the source of just the nested document, the source of the root document is parsed and just the relevant bit for the nested document is included as source in the inner hit. Doing this for each matching nested document has an impact on the time it takes to execute the entire search request, especially when size and the inner hits' size are set higher than the default. To avoid the relatively expensive source extraction for nested inner hits, one can disable including the source and solely rely on doc values fields.
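The cost model in the quoted passage can be imitated with a toy Python script. This is not Elasticsearch's fetch code: it simply assumes that every returned inner hit triggers one full parse of the root source, and the field names, the payload and the 20,000 nested objects are made up to keep the run short.

```python
import json
import time


def make_root_source(child_count):
    """Build a root _source string holding child_count nested objects."""
    children = [
        {"ip": f"192.168.{i % 256}.{i % 250}", "note": "x" * 50}
        for i in range(child_count)
    ]
    return json.dumps({"title": "root", "content": children})


def extract_inner_sources(root_source, offsets):
    """For every requested offset, parse the whole root source and keep one child."""
    extracted = []
    for offset in offsets:
        parsed = json.loads(root_source)  # assumed: one full parse per inner hit
        extracted.append(parsed["content"][offset])
    return extracted


root_source = make_root_source(20_000)
for size in (0, 3, 10):
    started = time.perf_counter()
    hits = extract_inner_sources(root_source, range(size))
    elapsed_ms = (time.perf_counter() - started) * 1000
    print(f"size={size:>2} returned={len(hits):>2} elapsed={elapsed_ms:.1f} ms")
```

Under this assumption the elapsed time grows with the number of inner hits returned and with the size of the root source, not with the size of the few objects actually kept. That is consistent with the numbers in this article: 10 ms for size 0, 967 ms for size 3 and about 3100 ms for size 10.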

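6. Solutions

A single document is stored on exactly one shard, so adding shards cannot speed up this query;
The documentation suggests disabling the source and relying on doc values fields, but in testing the query time showed essentially no improvement;
Returning fewer nested documents reduces the query time significantly; for example, the response below is for a request that returns 3: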

{
    "took":967,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":0.001722915,
        "hits":[
            {
                "_type":"_doc",
                "_score":0.001722915,
                "inner_hits":{
                    "content":{
                        "hits":{
                            "total":100008,
                            "max_score":0.001722915
                        }
                    }
                }
            }
        ]
    }
}
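For reference, a request that combines the last two options might look like the sketch below. It only builds and prints the request body. The field name `content.ip` is a placeholder, since the real mapping is not shown in this article, and the field would need doc values enabled. Keep in mind that disabling the source made essentially no difference in the test described above, so the smaller inner hits size is the part that is known to help.

```python
import json


def build_docvalue_query(docvalue_fields, inner_size=3):
    """Nested query whose inner hits skip _source and return doc values instead."""
    return {
        "_source": False,
        "query": {"bool": {"should": {"nested": {
            "path": "content",
            "query": {"query_string": {
                "query": "192.168.1.1*",
                "fields": ["content.*"],
            }},
            "inner_hits": {
                "from": 0,
                "size": inner_size,          # 3 is also the default
                "_source": False,            # skip source extraction
                "docvalue_fields": list(docvalue_fields),
            },
            "score_mode": "avg",
            "ignore_unmapped": True,
        }}}},
        "size": 20,
        "timeout": "20s",
    }


# "content.ip" is a hypothetical field name used only for illustration.
body = build_docvalue_query(["content.ip"])
print(json.dumps(body, indent=2))
```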
