curl 127.0.0.1:9200/test/_search | jq
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "Video_1",
"_score": 1,
"_source": {
"id": 1,
"title": "打火車"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "Video_2",
"_score": 1,
"_source": {
"id": 2,
"title": "火車"
}
}
]
}
}
curl '127.0.0.1:9200/test/_search?q=打火車' | jq
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.21110919,
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "Video_2",
"_score": 0.21110919,
"_source": {
"id": 2,
"title": "火車"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "Video_1",
"_score": 0.160443,
"_source": {
"id": 1,
"title": "打火車"
}
}
]
}
}
- Surprisingly, 火車 scores 0.21110919, which is higher than the 0.160443 scored by 打火車.
curl '127.0.0.1:9200/test/_doc/Video_1/_termvectors?fields=title' | jq
{
"_index": "test",
"_type": "_doc",
"_id": "Video_1",
"_version": 1,
"found": true,
"took": 0,
"term_vectors": {
"title": {
"field_statistics": {
"sum_doc_freq": 3,
"doc_count": 2,
"sum_ttf": 3
},
"terms": {
"打火": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 2
}
]
},
"火車": {
"term_freq": 1,
"tokens": [
{
"position": 1,
"start_offset": 1,
"end_offset": 3
}
]
}
}
}
}
}
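- Surprisingly, 打火車 was indexed as two terms, 打火 and 火車, so something is clearly off here (although from the search engine's point of view nothing is wrong). The 打火車 document earns its score only from the 火車 term, while the unmatched 打火 term makes the field longer and so lowers that score, which pushes the 火車 document to the top.
- So I decided to set the two analyzers (index-time and search-time) to the same one.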
{
"properties": {
"title": {
"type": "text",
"analyzer": "ik_smart",
"search_analyzer": "ik_smart"
}
}
}
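Elasticsearch rejects an in-place change to the `analyzer` of an existing `text` field, so in practice the index has to be recreated with the new mapping and the documents indexed again. Below is a minimal Python sketch of that step, assuming the cluster address and index name used in the curl commands above. The function names `build_create_index_request` and `recreate_index` are hypothetical helpers, not part of any Elasticsearch client.

```python
import json
import urllib.request

# Mapping from the post: index-time and search-time analysis both use ik_smart.
TITLE_MAPPING = {
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "ik_smart",
            "search_analyzer": "ik_smart",
        }
    }
}


def build_create_index_request(base_url, index):
    """Build (without sending) the PUT request that creates `index` with the mapping."""
    body = json.dumps({"mappings": TITLE_MAPPING}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def recreate_index(base_url, index):
    """Delete the old index, then create it again with the new mapping.

    Needs a live cluster. urlopen raises HTTPError if the index does not
    exist yet (404 on DELETE), which this sketch does not handle.
    """
    delete = urllib.request.Request(f"{base_url}/{index}", method="DELETE")
    urllib.request.urlopen(delete)
    with urllib.request.urlopen(build_create_index_request(base_url, index)) as resp:
        return json.load(resp)
```

With a local cluster running, `recreate_index("http://127.0.0.1:9200", "test")` would drop and recreate the index. The two documents then have to be indexed again before the term vectors change.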
- Then look at the term vectors again (this time the terms are exactly what we expected).
curl '127.0.0.1:9200/test/_doc/Video_1/_termvectors?fields=title' | jq
{
"_index": "test",
"_type": "_doc",
"_id": "Video_1",
"_version": 1,
"found": true,
"took": 0,
"term_vectors": {
"title": {
"field_statistics": {
"sum_doc_freq": 3,
"doc_count": 2,
"sum_ttf": 3
},
"terms": {
"打": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 1
}
]
},
"火車": {
"term_freq": 1,
"tokens": [
{
"position": 1,
"start_offset": 1,
"end_offset": 3
}
]
}
}
}
}
}
- Now we run the search again, and the ranking by score is indeed what we wanted.
curl '127.0.0.1:9200/test/_search?q=打火車' | jq
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.77041256,
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "Video_1",
"_score": 0.77041256,
"_source": {
"id": 1,
"title": "打火車"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "Video_2",
"_score": 0.21110919,
"_source": {
"id": 2,
"title": "火車"
}
}
]
}
}
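The scores in both search responses can be checked by hand. The sketch below assumes the classic BM25 formula with the `(k1 + 1)` factor and Elasticsearch's default parameters `k1 = 1.2` and `b = 0.75`. It also assumes the query 打火車 is analyzed into 打 and 火車 at search time, and that the 火車 document holds the single term 火車, which is consistent with `sum_ttf: 3` in the term vector statistics. The function `bm25_score` is a hypothetical helper written for this illustration, not an Elasticsearch API.

```python
import math

K1, B = 1.2, 0.75  # Elasticsearch's default BM25 parameters


def bm25_score(query_terms, doc_terms, corpus):
    """Sum the BM25 contribution of every query term that occurs in doc_terms."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    score = 0.0
    for term in set(query_terms):
        tf = doc_terms.count(term)
        if tf == 0:
            continue  # a query term missing from the document adds nothing
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Longer fields get a larger penalty in the denominator.
        length_norm = K1 * (1 - B + B * len(doc_terms) / avg_len)
        score += idf * (K1 + 1) * tf / (tf + length_norm)
    return score


query = ["打", "火車"]  # assumed search-time terms for the query 打火車

# Index terms before the change, as in the first _termvectors response.
before = {"Video_1": ["打火", "火車"], "Video_2": ["火車"]}
# Index terms after switching the field to ik_smart.
after = {"Video_1": ["打", "火車"], "Video_2": ["火車"]}

for label, index in (("before", before), ("after", after)):
    corpus = list(index.values())
    for doc_id, terms in index.items():
        print(label, doc_id, round(bm25_score(query, terms, corpus), 6))
```

Under these assumptions the computed values agree with the `_score` fields above to about six decimal places. Before the change, Video_1 matches only 火車 and is penalized for its two-term field. After the change it also matches 打, a term that appears in only one document and therefore carries a higher IDF.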
This work is licensed under a CC license; reposts must credit the author and link to this article.