Elasticsearch Queries

Posted by 不要亂摸 on 2018-12-01

Original article: https://www.cnblogs.com/cjsblog/p/9910788.html

Elasticsearch

Query DSL

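Elasticsearch provides a full query DSL (domain-specific language) based on JSON. The query language it defines is made up of two types of clauses: leaf query clauses and compound query clauses.

Leaf query clauses

Leaf query clauses look for a particular value in a particular field, for example the match, term, or range queries.

Compound query clauses

Compound query clauses wrap other leaf or compound queries. They are used to combine multiple queries logically (such as the bool or dis_max query) or to alter their behaviour (such as the constant_score query).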

1. Query and filter context

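The behaviour of a query clause depends on whether it is used in query context or in filter context:

1.1. Query context

A query clause used in query context answers the question "How well does this document match this query clause?" Besides deciding whether the document matches, the clause also calculates a _score that represents how well the document matches relative to other documents.

1.2. Filter context

A query clause used in filter context answers the question "Does this document match this query clause?" The answer is a simple yes or no, and no score is calculated. Filter context is mostly used for filtering structured data, for example:

Does this timestamp fall within the range 2015 to 2016?
Is the status field set to "published"?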
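(

PS: Query vs. filter

A query reflects how well a document matches the query clause, while a filter reflects whether a document matches it at all.
A filter checks whether a condition is met, so there are only two outcomes: yes or no. A query measures how well the records that meet the condition match it.
Deciding which records meet the condition is filtering; deciding how well those records match the condition is querying.
Filters do not compute a score; queries do.

)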

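Frequently used filters are cached automatically by Elasticsearch to speed up performance.

Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters of the bool query.

Below is an example of query clauses. This query matches documents that satisfy all of the following conditions:

The title field contains the word "search"
The content field contains the word "elasticsearch"
The status field contains the exact word "published"
The publish_date field contains a date on or after 2015-01-01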

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": { 
        "bool": { 
            "must": [
                { "match": { "title":   "Search"        }}, 
                { "match": { "content": "Elasticsearch" }}  
            ],
            "filter": [ 
                { "term":  { "status": "published" }}, 
                { "range": { "publish_date": { "gte": "2015-01-01" }}} 
            ]
        }
    }
}
'

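Notes on the query above:

The query parameter indicates query context.
The bool clause and the two match clauses are used in query context, which means they contribute to the score of each document.
The filter parameter indicates filter context.
The term and range clauses are used in filter context. They filter out documents that do not match, but they do not affect the score of the documents that do match.

(PS: As a rough analogy to SQL, match is like a fuzzy search, term is like an exact-match lookup, and range is like a range condition.)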
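The same request body can be assembled programmatically. The following is a minimal Python sketch, not part of the original article; the helper name build_search_body and its parameters are hypothetical, and the resulting dictionary would still have to be sent to the _search endpoint by an HTTP client of your choice.

```python
import json


def build_search_body(title, content, status, published_since):
    """Build a bool query: scored clauses in `must`, yes/no clauses in `filter`."""
    return {
        "query": {
            "bool": {
                # Query context: these clauses contribute to _score.
                "must": [
                    {"match": {"title": title}},
                    {"match": {"content": content}},
                ],
                # Filter context: these clauses only include or exclude documents.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"publish_date": {"gte": published_since}}},
                ],
            }
        }
    }


body = build_search_body("Search", "Elasticsearch", "published", "2015-01-01")
print(json.dumps(body, indent=2))
```

Keeping the scored and unscored clauses in separate lists makes it harder to put a yes/no condition into query context by accident.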

2. Match All Query

The simplest query. It matches all documents and gives each of them a _score of 1.0.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_all": {}
    }
}
'

The _score can be changed with the boost parameter:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_all": { "boost" : 1.2 }
    }
}
'

The opposite of match_all is match_none, which matches no documents:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_none": {}
    }
}
'

3. Full text queries

3.1. Match Query

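The match query accepts text, numeric, or date values, analyzes them, and constructs a query.

The match query is of type boolean. This means the provided text is analyzed, and the analysis process builds a boolean query from it. The operator option can be set to or or and to control the boolean clauses (the default is or). For example: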

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match" : {
            "message" : "this is a test"
        }
    }
}
'

Note that every search body starts with "query"; here "message" is the field name.

You can also add parameters, for example:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match" : {
            "message" : {
                "query" : "this is a test",
                "operator" : "and"
            }
        }
    }
}
'

(PS: match is a loose, analyzed full-text match, not an exact lookup.)
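To make the effect of operator concrete, here is a toy Python model. It is only a sketch: the functions toy_analyze and toy_match are hypothetical, the "analyzer" just lowercases and splits on whitespace, and no scoring is done, so it illustrates the boolean logic rather than Elasticsearch's real behaviour.

```python
def toy_analyze(text):
    """Toy stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def toy_match(field_text, query_text, operator="or"):
    """Return True if the field text matches the query under the given operator."""
    doc_terms = set(toy_analyze(field_text))
    query_terms = toy_analyze(query_text)
    hits = [term in doc_terms for term in query_terms]
    # "and" requires every query term, "or" requires at least one.
    return all(hits) if operator == "and" else any(hits)


print(toy_match("this is only a drill", "this is a test"))         # True
print(toy_match("this is only a drill", "this is a test", "and"))  # False
```

With or, a single shared term is enough; with and, the missing term "test" makes the document fail.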

3.2. Match Phrase Query

The match_phrase query is similar to match, but it is used for exact phrase matching or word proximity matching. For example:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_phrase" : {
            "message" : "this is a test"
        }
    }
}
'

Naturally, you can add parameters here too:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_phrase" : {
            "message" : {
                "query" : "this is a test",
                "analyzer" : "my_analyzer"
            }
        }
    }
}
'

Here analyzer sets which analyzer is used to analyze the text.
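The difference from match is that the terms must appear next to each other and in order. A toy Python sketch of that idea follows; toy_match_phrase is a hypothetical helper that uses whitespace tokenization and ignores the slop parameter and any real analyzer.

```python
def toy_match_phrase(field_text, phrase):
    """Return True if the phrase tokens occur consecutively, in order, in the field."""
    doc = field_text.lower().split()
    want = phrase.lower().split()
    if not want:
        return False
    # Slide a window of len(want) tokens across the document tokens.
    last_start = len(doc) - len(want)
    return any(doc[i:i + len(want)] == want for i in range(last_start + 1))


print(toy_match_phrase("note that this is a test message", "this is a test"))  # True
print(toy_match_phrase("a test this is", "this is a test"))                    # False
```

The second document contains every word of the phrase, so a match query would find it, but the word order differs and the phrase check fails.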

3.3. Match Phrase Prefix Query

Similar to the match_phrase query, except that the last word is matched as a prefix.

match_phrase_prefix allows prefix matching on the last word of the text:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "message" : "quick brown f"
        }
    }
}
'

In addition to the parameters accepted by match_phrase, match_phrase_prefix also accepts a max_expansions parameter, which controls how many terms the last word may be expanded to (default 50).

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_phrase_prefix" : {
            "message" : {
                "query" : "quick brown f",
                "max_expansions" : 10
            }
        }
    }
}
'
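A small max_expansions can cause expected matches to be missed. The Python sketch below illustrates why, under the assumption that candidate terms are taken in sorted order until the limit is reached; the function expand_prefix and the sample term dictionary are hypothetical.

```python
def expand_prefix(term_dictionary, prefix, max_expansions=50):
    """Return at most max_expansions terms that start with the prefix, in sorted order."""
    matches = [term for term in sorted(term_dictionary) if term.startswith(prefix)]
    return matches[:max_expansions]


terms = {"fox", "ferret", "frog", "fish", "dog"}

print(expand_prefix(terms, "f"))                    # ['ferret', 'fish', 'fox', 'frog']
print(expand_prefix(terms, "f", max_expansions=2))  # ['ferret', 'fish']
```

With the limit set to 2, "fox" is never considered, so a document containing "quick brown fox" would not be found for the input "quick brown f" in this toy model.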

3.4. Multi Match Query

multi_match is the multi-field version of match.

As the name suggests, multi_match can target several fields, whereas match works on a single field.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "multi_match" : {
      "query":    "this is a test", 
      "fields": [ "subject", "message" ] 
    }
  }
}
'

Fields can also be specified with wildcards. The following example can query fields such as title, first_name, and last_name:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "multi_match" : {
      "query":    "Will Smith",
      "fields": [ "title", "*_name" ] 
    }
  }
}
'

Individual fields can be boosted, for example:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "multi_match" : {
      "query" : "this is a test",
      "fields" : [ "subject^3", "message" ] 
    }
  }
}
'

In the example above, the subject field is three times as important as the message field.
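The field specifications used by multi_match can be read as "pattern plus optional boost". The following Python sketch shows one way to interpret them; parse_field_spec and resolve_fields are hypothetical helpers, and the list of known fields is made up for illustration.

```python
from fnmatch import fnmatchcase


def parse_field_spec(spec):
    """Split 'subject^3' into ('subject', 3.0); a missing boost means 1.0."""
    name, sep, boost = spec.partition("^")
    return name, float(boost) if sep else 1.0


def resolve_fields(patterns, known_fields):
    """Return the known fields that match any of the (possibly wildcard) patterns."""
    return [f for f in known_fields if any(fnmatchcase(f, p) for p in patterns)]


print(parse_field_spec("subject^3"))  # ('subject', 3.0)
print(parse_field_spec("message"))    # ('message', 1.0)
print(resolve_fields(["title", "*_name"],
                     ["title", "first_name", "last_name", "body"]))
# ['title', 'first_name', 'last_name']
```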

3.5. Query String Query

Supports the Lucene query string syntax. It lets you specify AND | OR | NOT and search multiple fields within a single query string.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "this AND that OR thus"
        }
    }
}
'

The query_string query parses the input and splits the text around operators. Each textual part is analyzed independently. For example:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "query_string" : {
            "default_field" : "content",
            "query" : "(new york city) OR (big apple)"
        }
    }
}
'

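In the example above, the input is split into two parts, "new york city" and "big apple", and each part is analyzed independently by the analyzer.

Note that the split happens at operators, not at whitespace.

The parameters of query_string include:

query: The actual query text to be parsed.

default_field: The default field to search when no field prefix is specified. (By default, all fields are searched.)

default_operator: The default operator used when no explicit operator is specified. For example, with a default operator of OR, "my name is jack" is interpreted as "my OR name OR is OR jack"; likewise, with AND it is interpreted as "my AND name AND is AND jack".

analyzer: The name of the analyzer used to analyze the query string.

allow_leading_wildcard: When set, * or ? is allowed as the first character. Defaults to true.

lenient: If set to true, format-based failures (such as supplying text for a numeric field) are ignored.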
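The default_operator rule can be pictured as inserting the operator between bare words. This Python sketch is only an illustration of that rewriting for plain whitespace-separated words; apply_default_operator is a hypothetical helper and does not handle quotes, parentheses, or operators already present in the text.

```python
def apply_default_operator(text, default_operator="OR"):
    """Join whitespace-separated words with the default operator."""
    return f" {default_operator} ".join(text.split())


print(apply_default_operator("my name is jack"))         # my OR name OR is OR jack
print(apply_default_operator("my name is jack", "AND"))  # my AND name AND is AND jack
```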

In query_string, a multi-field query is written like this:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "query_string" : {
            "fields" : ["content", "name"],
            "query" : "this AND that"
        }
    }
}
'

which is equivalent to

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "query_string": {
            "query": "(content:this OR name:this) AND (content:that OR name:that)"
        }
    }
}
'

The two requests above are equivalent.
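The equivalence can be reproduced mechanically: each term is looked up in every listed field, and the per-term groups are then joined by the operator. The Python sketch below does this for simple single-word terms; expand_multi_field is a hypothetical helper written for illustration.

```python
def expand_multi_field(fields, terms, operator="AND"):
    """Expand terms over several fields into an explicit field:term query string."""
    groups = []
    for term in terms:
        # A term may match in any of the fields, hence OR inside the group.
        alternatives = " OR ".join(f"{field}:{term}" for field in fields)
        groups.append(f"({alternatives})")
    return f" {operator} ".join(groups)


print(expand_multi_field(["content", "name"], ["this", "that"]))
# (content:this OR name:this) AND (content:that OR name:that)
```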

3.6. Simple Query String Query

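simple_query_string is a simpler, more robust version of query_string that is better suited to user-facing search.

It parses the query with the SimpleQueryParser. Unlike the regular query_string query, simple_query_string never throws an exception; it simply discards the invalid parts of the query.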

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "simple_query_string" : {
        "query": "\"fried eggs\" +(eggplant | potato) -frittata",
        "fields": ["title^5", "body"],
        "default_operator": "and"
    }
  }
}
'

3.7. Hands-on practice

Prepare the data

#    Delete the index
curl -X DELETE "192.168.1.134:9200/book"

#    Create the index
curl -X PUT "192.168.1.134:9200/book" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "_doc" : {
            "properties" : {
                "title":        { "type": "text"  }, 
                "author":         { "type": "text"  }, 
                "introduction": { "type": "text"  },
                "publish_date": { 
                    "type": "date",
                    "format": "yyyy-MM-dd"
                }
            }
        }
    }
}
'

#    View the index
curl -X GET "192.168.1.134:9200/book?pretty"

#    Insert a document
curl -X PUT "192.168.1.134:9200/book/_doc/1" -H 'Content-Type: application/json' -d'
{
    "title" : "Hello Java",
    "author": "zhangsan",
    "publish_date" : "2008-11-15",
    "introduction" : "This is a book for novice."
}
'

#    View the documents
curl -X GET "192.168.1.134:9200/book/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_all": {}
    }
}
'

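match query (note that a match query targets a single field only)

In this example, searching for "Java" returns 2 hits, while searching for "Java入門" returns 5.

This is because the analyzer splits "Java入門" into the two words "Java" and "入門", and the default operator is or.

In other words, the result contains the records whose title includes "Java" or "入門".

Now the query asks for records whose title contains both "Java" and "入門", so there is only 1 hit.

multi_match multi-field query

match_phrase query

Comparing the two, the same keyword "Java從" returns 5 hits with match but only 1 with match_phrase.

query_string query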

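4. Term level queries

Full-text queries analyze the query string before executing, whereas term-level queries operate on the exact terms stored in the inverted index.

These queries are usually used for structured data such as numbers, dates, and enums, rather than for full-text fields.

(PS: In other words, a full-text query first tokenizes the query text, whereas a term-level query looks up the exact term directly in the field's inverted index. Term-level queries are generally used on numeric, date, and similar fields.)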
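The practical consequence is that a term query on an analyzed text field can miss documents that a match query finds. The Python sketch below models this with a toy inverted index; the lowercase-and-split analyzer, the sample documents, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def term_lookup(index, term):
    """Term-level lookup: the term is used exactly as given, with no analysis."""
    return index.get(term, set())


def match_lookup(index, text):
    """Full-text lookup: the query text is analyzed first, then each token is looked up."""
    result = set()
    for token in text.lower().split():
        result |= index.get(token, set())
    return result


index = build_inverted_index({1: "Kimchy wrote this", 2: "Another post"})

print(term_lookup(index, "Kimchy"))   # set()
print(term_lookup(index, "kimchy"))   # {1}
print(match_lookup(index, "Kimchy"))  # {1}
```

The index only holds the lowercased token "kimchy", so the unanalyzed term "Kimchy" finds nothing, while the analyzed match lookup succeeds.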

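4.1. Term Query

Finds documents that contain the exact term specified in the specified field.

The term query looks in the inverted index for documents that contain the exact term. For example: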

curl -X POST "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "term" : { "user" : "Kimchy" } 
  }
}
'

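The example above looks in the inverted index of the user field for documents that contain the exact term Kimchy.

A boost parameter can also be specified to give this term query a higher relevance score than another query. For example: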

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool": {
            "should": [
            {
                "term": {
                    "status": {
                        "value": "urgent",
                        "boost": 2.0 
                    }
                }
            },
            {
                "term": {
                    "status": "normal" 
                }
            }
          ]
        }
    }
}
'

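In this example, the urgent clause has a boost of 2.0, which means it is twice as important as the normal clause that follows; the normal clause has the default boost of 1.0.

4.2. Terms Query

Finds documents that contain any of the exact terms specified in the specified field.

It filters for documents that match any one of the provided terms: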

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "terms" : { "user" : ["kimchy", "elasticsearch"]}
    }
}
'

4.3. Range Query

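Finds documents in which the specified field contains values (dates, numbers, or strings) within the specified range.

The following example returns documents whose age field is between 10 and 20: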

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "range" : {
            "age" : {
                "gte" : 10,
                "lte" : 20,
                "boost" : 2.0
            }
        }
    }
}
'

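The range query accepts the following parameters:

gte: greater than or equal to

gt: greater than

lte: less than or equal to

lt: less than

boost: sets the boost value of the query; defaults to 1.0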
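The four bounds differ only in whether the endpoint itself is included. A minimal Python sketch of that logic follows; in_range is a hypothetical helper, and it only models the comparison, not scoring or boost.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Return True if value satisfies every bound that was supplied."""
    checks = [
        gte is None or value >= gte,  # inclusive lower bound
        gt is None or value > gt,     # exclusive lower bound
        lte is None or value <= lte,  # inclusive upper bound
        lt is None or value < lt,     # exclusive upper bound
    ]
    return all(checks)


print(in_range(10, gte=10, lte=20))  # True
print(in_range(10, gt=10, lte=20))   # False
print(in_range(20, gte=10, lt=20))   # False
```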

4.3.1. Range on date fields

When a range query runs on a date field, the range can be expressed with Date Math:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "range" : {
            "date" : {
                "gte" : "now-1d/d",
                "lt" :  "now/d"
            }
        }
    }
}
'

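When Date Math is used to round a date to the nearest day, month, hour, and so on, the rounded date depends on whether that end of the range is inclusive or exclusive.

Rounding up moves to the last millisecond of the rounding scope, and rounding down moves to the first millisecond. For example:

gt: greater than 2014-11-18||/M rounds up and becomes 2014-11-30T23:59:59.999

gte: greater than or equal to 2014-11-18||/M rounds down and becomes 2014-11-01

lt: less than 2014-11-18||/M rounds down and becomes 2014-11-01

lte: less than or equal to 2014-11-18||/M rounds up and becomes 2014-11-30T23:59:59.999

This is easy to understand once 2014-11-18||/M is read as "the month of November 2014".

Greater than 2014-11-18||/M means greater than November 2014, that is, later than the last moment of the month, so it is equivalent to greater than 2014-11-30 23:59:59.

Likewise, greater than or equal to 2014-11-18||/M means greater than or equal to November, which naturally starts at the first day of the month, 2014-11-01.

Likewise, less than 2014-11-18||/M means less than November, which is earlier than November 1, so it is equivalent to less than 2014-11-01.

Likewise, less than or equal to 2014-11-18||/M includes November itself, which means anything up to the end of November 30, so it is equivalent to less than or equal to 2014-11-30 23:59:59.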
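The four month-rounding cases above can be reproduced with a few lines of Python. This is a sketch of the rule as described in this section, not Elasticsearch's own date math parser; round_to_month is a hypothetical helper that ignores time zones and only handles the /M unit.

```python
import calendar
from datetime import datetime


def round_to_month(day, bound):
    """Round a date to its month for a range bound: gte/lt round down, gt/lte round up."""
    if bound in ("gte", "lt"):
        # Round down to the first moment of the month.
        return datetime(day.year, day.month, 1)
    if bound in ("gt", "lte"):
        # Round up to the last millisecond of the month.
        last_day = calendar.monthrange(day.year, day.month)[1]
        return datetime(day.year, day.month, last_day, 23, 59, 59, 999000)
    raise ValueError(f"unknown bound: {bound}")


day = datetime(2014, 11, 18)
for bound in ("gt", "gte", "lt", "lte"):
    print(bound, round_to_month(day, bound).isoformat(timespec="milliseconds"))
# gt 2014-11-30T23:59:59.999
# gte 2014-11-01T00:00:00.000
# lt 2014-11-01T00:00:00.000
# lte 2014-11-30T23:59:59.999
```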

4.3.2. Date format in range query

A date format can be specified in a date range query. For example:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "range" : {
            "born" : {
                "gte": "01/01/2012",
                "lte": "2013",
                "format": "dd/MM/yyyy||yyyy"
            }
        }
    }
}
'

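This example is intended to find people born between 2012-01-01 and 2013-12-31. How the partial value 2013 is expanded depends on Elasticsearch's date parsing, so verify the upper bound on the version you use.

The next example is a range query on timestamps with a time zone: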

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "range" : {
            "timestamp" : {
                "gte": "2015-01-01 00:00:00", 
                "lte": "now", 
                "time_zone": "+01:00"
            }
        }
    }
}
'

4.4. Exists Query

Finds documents that contain a non-null value in the specified field.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "exists" : { "field" : "user" }
    }
}
'

4.5. Prefix Query

Finds documents that contain terms with the specified prefix.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{ "query": {
    "prefix" : { "user" : "ki" }
  }
}
'

A boost can be attached:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{ "query": {
    "prefix" : { "user" :  { "value" : "ki", "boost" : 2.0 } }
  }
}
'

4.6. Wildcard Query

Supports wildcard searches: * matches any sequence of characters (including none), and ? matches any single character.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "wildcard" : { "user" : "ki*y" }
    }
}
'

A boost parameter can be added:

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "wildcard" : { "user" : { "value" : "ki*y", "boost" : 2.0 } }
    }
}
'
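The meaning of the two wildcard characters can be checked locally by translating a pattern into a regular expression. The Python sketch below does this for whole-term matching; wildcard_to_regex and wildcard_matches are hypothetical helpers, and the sketch says nothing about how Elasticsearch executes wildcard queries internally.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern: * is any sequence of characters, ? is any one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern, term):
    """Return True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_matches("ki*y", "kimchy"))  # True
print(wildcard_matches("ki*y", "kiy"))     # True
print(wildcard_matches("ki?y", "kimchy"))  # False
print(wildcard_matches("ki?y", "kiwy"))    # True
```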

4.7. Regexp Query

Regular expression query.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "regexp":{
            "name.first": "s.*y"
        }
    }
}
'

4.8. Ids Query

Finds documents by ID, using the _uid field.

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "ids" : {
            "type" : "_doc",
            "values" : ["1", "4", "100"]
        }
    }
}
'

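4.9. Hands-on practice

5. Compound queries

Compound queries wrap other compound or leaf queries, either to combine their results and scores, to change their behaviour, or to switch from query context to filter context.

5.1. Constant score query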

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "constant_score" : {
            "filter" : {
                "term" : { "user" : "kimchy"}
            },
            "boost" : 1.2
        }
    }
}
'

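5.2. Bool query

Pay special attention to the should clause:

If the bool query is in query context and has a must or filter clause, a document matches even if none of the should clauses match.
If the bool query is in filter context, or has neither must nor filter, then at least one of the should clauses must match the document.
This behaviour can be controlled explicitly with the minimum_should_match parameter.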
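The two rules for should clauses amount to a default value for minimum_should_match. The following Python sketch encodes the rules exactly as stated in this section; default_minimum_should_match is a hypothetical helper, and newer Elasticsearch versions may apply different defaults.

```python
def default_minimum_should_match(in_filter_context, has_must, has_filter, should_count):
    """Return how many should clauses must match by default (0 or 1)."""
    if should_count == 0:
        return 0  # nothing to match
    if not in_filter_context and (has_must or has_filter):
        return 0  # should clauses are optional and only influence the score
    return 1      # at least one should clause has to match


# Query context with a must clause: should clauses are optional.
print(default_minimum_should_match(False, True, False, 2))   # 0
# Query context with only should clauses: one of them must match.
print(default_minimum_should_match(False, False, False, 2))  # 1
# Filter context: one of them must match.
print(default_minimum_should_match(True, True, False, 2))    # 1
```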

For example:

curl -X POST "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool" : {
            "must" : {
                "term" : { "user" : "kimchy" }
            },
            "filter": {
                "term" : { "tag" : "tech" }
            },
            "must_not" : {
                "range" : {
                    "age" : { "gte" : 10, "lte" : 20 }
                }
            },
            "should" : [
                { "term" : { "tag" : "wow" } },
                { "term" : { "tag" : "elasticsearch" } }
            ],
            "minimum_should_match" : 1,
            "boost" : 1.0
        }
    }
}
'

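This finds documents where user is "kimchy", tag is "tech", age is not between 10 and 20, and tag is wow or elasticsearch.

A bool query that contains only filter clauses gives every matching document a score of 0 by default: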

curl -X GET "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}
'

5.3. Hands-on practice

References

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
