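- Best Fields
  - Use when fields compete with each other yet are related, such as title and body. The score comes from the best-matching field.
- Most Fields
  - A common technique for English content: index the main field with an analyzer such as the English analyzer, which stems words (and can add synonyms), so that it matches more documents. Index the same text in a subfield with the Standard analyzer to provide more precise matching. The additional fields act as signals that raise the relevance of matching documents; the more fields match, the better.
- Cross Fields
  - For some entities, such as a person's name, an address, or book information, the information has to be identified across several fields, and each field is only one part of the whole. The goal is to find as many of the query terms as possible in any of the listed fields.
- Best Fields is the default type and does not need to be specified.
- Parameters such as minimum_should_match are passed on to the generated queries.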
POST blogs/_search
{
"query": {
"multi_match": {
"type": "best_fields",
"query": "Quick pets",
"fields": ["title","body"],
"tie_breaker": 0.2,
"minimum_should_match": "20%"
}
}
}
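To make the role of `tie_breaker` concrete, here is a minimal Python sketch of how a best_fields score is combined. It assumes the dis_max-style rule described in the Elasticsearch documentation: the score of the best-matching field, plus `tie_breaker` times the scores of the other matching fields. The function name and the per-field scores are hypothetical; real scores come from BM25.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way a dis_max query does.

    field_scores: scores of the fields that matched (hypothetical values).
    """
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    # The best field dominates; the rest contribute only a fraction.
    return best + tie_breaker * others


# Hypothetical per-field scores for a single document.
scores = {"title": 1.0, "body": 0.5}

# With tie_breaker = 0 only the best field counts.
print(best_fields_score(list(scores.values())))

# With tie_breaker = 0.2 the other field adds 20% of its score.
print(best_fields_score(list(scores.values()), tie_breaker=0.2))
```

With `tie_breaker` set to 1.0 the result is the plain sum of all field scores, which is the behavior that most_fields aims for.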
Query example
PUT /titles
{
"mappings": {
"properties": {
"title":{
"type": "text",
"analyzer": "english"
}
}
}
}
POST titles/_bulk
{"index":{"_id":1}}
{"title":"My dog barks"}
{"index":{"_id":2}}
{"title":"I see a lot of barking dogs on the road "}
GET titles/_search
{
"query": {
"match": {
"title": "barking dogs"
}
}
}
// Result: the english analyzer stems both titles to the same terms, and document 1 is shorter, so _id 1 ranks first
"hits" : [
{
"_index" : "titles",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.24399278,
"_source" : {
"title" : "My dog barks"
}
},
{
"_index" : "titles",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.1854345,
"_source" : {
"title" : "I see a lot of barking dogs on the road "
}
}
]
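Why both documents match equally well can be sketched in Python. The `toy_stem` function below is a deliberately crude suffix stripper, not the stemmer that the english analyzer actually uses, and it skips stop-word removal; it is only meant to show that stemming maps "barking"/"barks" and "dogs"/"dog" to the same terms.

```python
def toy_stem(word):
    """Crude suffix stripper; a stand-in for the real english stemmer."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def stemmed_terms(text):
    """Lowercase, split on whitespace, then stem each token."""
    return {toy_stem(w) for w in text.lower().split()}


def exact_terms(text):
    """Lowercase and split on whitespace, with no stemming."""
    return set(text.lower().split())


docs = {
    "1": "My dog barks",
    "2": "I see a lot of barking dogs on the road",
}
query = "barking dogs"

for doc_id, title in docs.items():
    stem_hits = stemmed_terms(query) & stemmed_terms(title)
    exact_hits = exact_terms(query) & exact_terms(title)
    print(doc_id, sorted(stem_hits), sorted(exact_hits))
```

After stemming, both titles contain both query terms, so term matching alone cannot prefer document 2; only the exact word forms tell the two titles apart.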
Redefine the mapping
DELETE titles
PUT /titles
{
"mappings": {
"properties": {
"title":{
"type": "text",
"analyzer": "english",
"fields": {
"std":{
"type":"text",
"analyzer":"standard"
}
}
}
}
}
}
POST titles/_bulk
{"index":{"_id":1}}
{"title":"My dog barks"}
{"index":{"_id":2}}
{"title":"I see a lot of barking dogs on the road "}
// multi_match query
GET titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields", // the default is best_fields
"fields": ["title","title.std"] // the scores of the matching fields are added together
}
}
}
// Response
"hits" : [
{
"_index" : "titles",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.4569323,
"_source" : {
"title" : "I see a lot of barking dogs on the road "
}
},
{
"_index" : "titles",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.42221838,
"_source" : {
"title" : "My dog barks"
}
}
]
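Solving it with multi-field matching
- Use the broad-matching field title to include as many documents as possible and so improve recall, while using the field title.std as a signal that pushes the more relevant documents to the top of the results
- The contribution of each field to the final score can be controlled with a custom boost value. For example, boosting title makes that field more important, which also reduces the influence of the other signal fields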
GET titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": ["title^10","title.std"]
}
}
}
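The effect of the boost can be sketched in a few lines of Python. This is only an illustration, under the assumption that most_fields adds up the scores of the matching fields (Elasticsearch documents it as a bool query of match clauses) and that a boost multiplies a field's score. The function name and all the numbers are made up.

```python
def most_fields_score(field_scores, boosts=None):
    """Sum the scores of every matching field, applying optional boosts.

    field_scores: {field name: score} for the fields that matched.
    boosts: {field name: multiplier}; fields not listed use 1.0.
    """
    boosts = boosts or {}
    return sum(score * boosts.get(field, 1.0)
               for field, score in field_scores.items())


# Hypothetical per-field scores.
doc1 = {"title": 0.4}                    # matches only the stemmed field
doc2 = {"title": 0.3, "title.std": 0.9}  # also matches the exact word forms

# Without boosts, the extra title.std signal lifts document 2 to the top.
print(most_fields_score(doc1), most_fields_score(doc2))

# With title^10, the stemmed field dominates and the signal matters less.
boosts = {"title": 10}
print(most_fields_score(doc1, boosts), most_fields_score(doc2, boosts))
```

With these made-up numbers the boost is large enough to flip the order, which shows why a heavy boost on the broad field weakens the precision signal from title.std.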
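Cross-field search
most_fields
- operator cannot be used to match across fields, because it is applied to each field separately
- copy_to can work around this, but it needs extra storage
cross_fields
- Supports operator across fields
- Compared with copy_to, one advantage is that individual fields can be boosted at search time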
PUT address/_doc/1
{
"street":"5 Poland Street",
"city" : "Lodon",
"country":"United Kingdom",
"postcode" : "W1V 3DG"
}
POST address/_search
{
"query":{
"multi_match": {
"query": "Poland Street W1V",
"type": "cross_fields", // with most_fields this query returns no hits
"operator": "and",
"fields": ["street","city","country","postcode"]
}
}
}
"hits" : [
{
"_index" : "address",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.8630463,
"_source" : {
"street" : "5 Poland Street",
"city" : "Lodon",
"country" : "United Kingdom",
"postcode" : "W1V 3DG"
}
}
]
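The difference between the two types under `"operator": "and"` can be sketched in Python. This is a simplified model that ignores scoring and uses a whitespace tokenizer instead of a real analyzer; the three function names are hypothetical. It assumes that most_fields checks the terms field by field, that cross_fields checks them term by term across all fields, and that copy_to amounts to searching one combined field.

```python
address = {
    "street": "5 Poland Street",
    "city": "Lodon",
    "country": "United Kingdom",
    "postcode": "W1V 3DG",
}
fields = ["street", "city", "country", "postcode"]
query = "Poland Street W1V"


def tokens(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return set(text.lower().split())


def field_centric_and(doc, fields, query):
    """most_fields-style AND: a single field must contain every query term."""
    terms = tokens(query)
    return any(terms <= tokens(doc[f]) for f in fields)


def term_centric_and(doc, fields, query):
    """cross_fields-style AND: every term must appear in at least one field."""
    return all(any(t in tokens(doc[f]) for f in fields)
               for t in tokens(query))


def copy_to_and(doc, fields, query):
    """copy_to-style AND: search one combined field holding all the values."""
    combined = tokens(" ".join(doc[f] for f in fields))
    return tokens(query) <= combined


print(field_centric_and(address, fields, query))  # False: no field has all three terms
print(term_centric_and(address, fields, query))   # True: street and postcode cover them
print(copy_to_and(address, fields, query))        # True, at the cost of a stored extra field
```

The combined-field approach gives the same match as the term-centric one here, but once the values are merged into one field there is no way to weight street differently from postcode at search time.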