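- Structured search means searching over structured data.
- Text can be structured too:
- Colored pens come in a discrete set of colors: red, green, blue.
- A blog post may carry tags, for example distributed and search.
- Products on an e-commerce site have UPCs (Universal Product Codes) or other unique identifiers, all of which follow a strictly defined, structured format.
- Booleans, times, dates, and numbers are structured data: they have a precise format, and we can apply logical operations to them, such as comparing ranges of numbers or times, or deciding which of two values is larger.
- Structured text can be matched exactly or partially.
- The outcome of a structured search is binary, "yes" or "no": a document either matches or it does not.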
DELETE products
POST /products/_bulk
{"index":{"_id":1}}
{"price":10,"avaliable":true,"date":"2018-01-01","productID":"XHDK-A-1293-#fJ3"}
{"index":{"_id":2}}
{"price":20,"avaliable":true,"date":"2019-01-01","productID":"KDKE-B-9947-#kL5"}
{"index":{"_id":3}}
{"price":30,"avaliable":true,"productID":"JODL-X-1937-#pV7"}
{"index":{"_id":4}}
{"price":30,"avaliable":false,"productID":"QQPX-R-3956-#aD8"}
# View the mapping
GET products/_mapping
# Response (dynamic mapping inferred from the documents indexed above)
{
  "products" : {
    "mappings" : {
      "properties" : {
        "avaliable" : {
          "type" : "boolean"
        },
        "date" : {
          "type" : "date"
        },
        "price" : {
          "type" : "long"
        },
        "productID" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
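The mapping above types `productID` as `text` with a `keyword` subfield. A term query does not analyze its input, so an exact lookup of a product code should go through `productID.keyword`. The same term against the analyzed `productID` field would be expected to find nothing, because the default standard analyzer splits and lowercases the code at index time. A minimal sketch, using a value from the bulk request above:

```console
# Exact match on the un-analyzed keyword subfield; expected to return document 1
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "productID.keyword": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}
```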
Term query on a boolean field (relevance scores are computed)
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "term": {
      "avaliable": true
    }
  }
}
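Boolean field: wrapping the term query in constant_score runs it as a filter, so no relevance score is computed (every hit gets the constant score set by boost)
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "avaliable": true
        }
      },
      "boost": 1.2
    }
  }
}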
Term query on a numeric field
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "price": 30
        }
      },
      "boost": 1.2
    }
  }
}
Terms query on a numeric field (matches any of the listed values)
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "price": [
            20,
            30
          ]
        }
      }
    }
  }
}
Range query on a numeric field
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 20,
            "lte": 30
          }
        }
      }
    }
  }
}
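The range query works on date fields as well as numbers. The sketch below filters the sample products on `date`; the two boundary dates are arbitrary values picked for illustration. With the data indexed earlier, only document 2 (dated 2019-01-01) should fall inside the window, since document 1 is older and documents 3 and 4 have no `date` at all.

```console
# Date range filter; boundary values are illustrative
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "date": {
            "gte": "2018-06-01",
            "lt": "2019-06-01"
          }
        }
      }
    }
  }
}
```

Date fields also accept date math expressions such as `now-1y` in place of a literal date.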
exists query: find documents in which the field has a non-null value
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "exists": {
          "field": "date"
        }
      }
    }
  }
}
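The opposite question, which documents have no value for a field, can be asked by negating `exists` inside a `bool` query. A minimal sketch against the same sample data; documents 3 and 4 were indexed without a `date`, so they are the ones expected to match:

```console
# Documents where "date" is missing or null
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "date"
            }
          }
        }
      }
    }
  }
}
```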
Terms query on a string field (against the keyword subfield)
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "productID.keyword": [
            "QQPX-R-3956-#aD8",
            "JODL-X-1937-#pV7"
          ]
        }
      }
    }
  }
}
# Demo: sample data with a multi-value field
POST /movies/_bulk
{"index":{"_id":1}}
{"title":"Father of the Bride Part II","year":1995,"genre":"Comedy"}
{"index":{"_id":2}}
{"title":"Dave","year":1993,"genre":["Comedy","Romance"]}
Multi-value fields: a term query tests containment, not equality
// Returns 2 documents
POST movies/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "genre.keyword": "Comedy"
        }
      }
    }
  }
}
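Because the term query above only checks that `Comedy` is among the values, it cannot by itself select movies whose genre is exactly `Comedy`. One common workaround is to store the number of values in an extra field and filter on both. In the sketch below, `genre_count` is a hypothetical helper field that is not part of the original data; it is added with partial updates and then combined with the term filter. Only document 1 should be returned.

```console
# Add the hypothetical genre_count field to the two sample documents
POST /movies/_bulk?refresh=true
{"update":{"_id":1}}
{"doc":{"genre_count":1}}
{"update":{"_id":2}}
{"doc":{"genre_count":2}}

# Contains "Comedy" AND has exactly one genre value
POST movies/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            { "term": { "genre.keyword": "Comedy" } },
            { "term": { "genre_count": 1 } }
          ]
        }
      }
    }
  }
}
```

The count has to be kept in sync by whatever process writes the documents, which is the cost of this approach.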
Term query on a date field
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "term": {
      "date": "2019-01-01"
    }
  }
}
Match query on the same date field (date fields are not analyzed, so this finds the same document)
POST products/_search
{
  "profile": true,
  "explain": true,
  "query": {
    "match": {
      "date": "2019-01-01"
    }
  }
}
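- Structured data & structured search
- If scoring is not needed, use constant_score to turn the query into a filter
- Range queries, and term/match queries on date fields
- Use the exists query to find fields with non-null values (negated, it finds missing or null ones)
- Exact values & exact-value queries on multi-value fields
- A term query tests containment, not full equality. Take particular care when querying multi-value fields
Notes
- When to use term and when to use match
- For exact matching on structured data, use a term query. Dates are structured data. match is mainly for full-text queries on text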