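Users coming from the SQL world often find the way ES maintains relationships between documents unfamiliar. ES is a distributed database that handles single, flat documents efficiently, and it cannot join documents the way a relational database joins tables. Yet no database application can avoid tree-shaped document relationships, because the business model requires them. In ES, whether the data uses the nested or the join type, parent and child data live in the same index. ES no longer has the concept of a table (doc_type), but at the operational level it provides the join field type, with its relations, to support parent-child operations. The nested type is generally used for relatively fixed embedded data, because every update requires re-indexing the whole document. With the join type, the two sides of the relationship can be updated independently, which is much more convenient.
Let's first demonstrate the nested data type. A nested field can be declared in the mapping to represent embedded documents, as follows: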
val fruitMapping = client.execute(
  putMapping("fruits").fields(
    KeywordField("code"),
    SearchAsYouTypeField("name")
      .fields(KeywordField("keyword")),
    floatField("price"),
    NestedField("location").fields(
      KeywordField("shopid"),
      textField("shopname"),
      longField("qty"))
  )
).await
This code produces the following mapping:
{
  "fruits" : {
    "mappings" : {
      "properties" : {
        "code" : {
          "type" : "keyword"
        },
        "location" : {
          "type" : "nested",
          "properties" : {
            "qty" : {
              "type" : "long"
            },
            "shopid" : {
              "type" : "keyword"
            },
            "shopname" : {
              "type" : "text"
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword"
            }
          }
        },
        "price" : {
          "type" : "float"
        }
      }
    }
  }
}
location is a nested field whose embedded documents contain the fields shopid, shopname and qty. The following example adds several documents containing location to the fruits index:
// needed for the Future combinators and for onComplete below
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

val f1 = indexInto("fruits").id("f001")
  .fields(
    "code" -> "f001",
    "name" -> "東莞荔枝",
    "price" -> 11.5,
    "location" -> List(
      Map("shopid" -> "s001", "shopname" -> "中心店", "qty" -> 500.0),
      Map("shopid" -> "s002", "shopname" -> "東門店", "qty" -> 0.0)
    )
  )
val f2 = indexInto("fruits").id("f002")
  .fields(
    "code" -> "f002",
    "name" -> "陝西富士蘋果",
    "price" -> 11.5,
    "location" -> List(
      Map("shopid" -> "s001", "shopname" -> "中心店", "qty" -> 300.0),
      Map("shopid" -> "s003", "shopname" -> "龍崗店", "qty" -> 200.0)
    )
  )
val f3 = indexInto("fruits").id("f003")
  .fields(
    "code" -> "f003",
    "name" -> "進口菲律賓香蕉",
    "price" -> 5.3,
    "location" -> List(
      Map("shopid" -> "s001", "shopname" -> "中心店", "qty" -> 300.0),
      Map("shopid" -> "s003", "shopname" -> "龍崗店", "qty" -> 200.0),
      Map("shopid" -> "s002", "shopname" -> "東門店", "qty" -> 200.0)
    )
  )
// index the three documents one after another
val newIndex = for {
  _ <- client.execute(f1)
  _ <- client.execute(f2)
  _ <- client.execute(f3)
} yield "Successfully added three records"
newIndex.onComplete {
  case Success(trb) => println(s"${trb}")
  case Failure(err) => println(s"error: ${err.getMessage}")
}
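To make it clearer why location is mapped as nested rather than as a plain array of objects, here is a small in-memory sketch in plain Scala. It does not call Elasticsearch; the names NestedSemantics, Loc, matchesFlattened and matchesNested are hypothetical and only model the behaviour. A plain object array is indexed as independent per-field value lists, so a condition on shopid and a condition on qty can be satisfied by two different entries, whereas a nested field matches each entry as a unit.

```scala
// In-memory model only; none of these names are elastic4s API.
object NestedSemantics {
  final case class Loc(shopid: String, shopname: String, qty: Long)

  // Plain object array: each field becomes an independent list of values,
  // so the link between the shopid and the qty of one entry is lost.
  def matchesFlattened(locs: List[Loc], shopid: String, qty: Long): Boolean =
    locs.map(_.shopid).contains(shopid) && locs.map(_.qty).contains(qty)

  // Nested field: both conditions must hold within the same entry.
  def matchesNested(locs: List[Loc], shopid: String, qty: Long): Boolean =
    locs.exists(l => l.shopid == shopid && l.qty == qty)

  // The location list of document f001 from the example above.
  val f001: List[Loc] = List(
    Loc("s001", "中心店", 500L),
    Loc("s002", "東門店", 0L)
  )

  def main(args: Array[String]): Unit = {
    // "shop s002 holds 500 units" is not true for f001
    println(matchesFlattened(f001, "s002", 500L)) // true (false positive)
    println(matchesNested(f001, "s002", 500L))    // false
  }
}
```

The false positive in the first call is the reason per-shop conditions need the nested type.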
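elastic4s makes it fairly convenient to update nested data. Here is an example of updating a nested document:
val f002 = client.execute(get("fruits","f002").fetchSourceInclude("location")).await
val locs: List[Map[String,Any]] = f002.result.source("location").asInstanceOf[List[Map[String,Any]]]
val newloc = Map("shopid" -> "s004","shopname" -> "寶安店", "qty" -> 23)
// keep every existing entry whose shopid differs from the new one (foldLeft reverses the order)
val newlocs = locs.foldLeft(List[Map[String,Any]]()) { (b, m) =>
  if (m("shopid") != newloc("shopid"))
    m :: b
  else b
}
val newdoc = updateById("fruits","f002")
  .doc(
    Map(
      "location" -> (newloc :: newlocs)
    )
  )
// send the update request; building newdoc alone does not change the index
val updateResult = client.execute(newdoc).await
In this example, a new embedded document, s004, needs to be written into document f002. We first fetch the existing location list from f002, remove any existing s004 entry, prepend the new entry to the list, and then update document f002.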
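The list manipulation in the steps above can be isolated into a pure function and checked without a cluster. The following is a minimal sketch, assuming the location entries are plain Map[String, Any] values as in the example; NestedUpsert and upsert are hypothetical names. Unlike the foldLeft version, filterNot keeps the existing entries in their original order.

```scala
object NestedUpsert {
  type Loc = Map[String, Any]

  // Drop any entry with the same shopid, then prepend the new entry.
  def upsert(locs: List[Loc], newloc: Loc): List[Loc] =
    newloc :: locs.filterNot(_.get("shopid") == newloc.get("shopid"))

  def main(args: Array[String]): Unit = {
    val locs: List[Loc] = List(
      Map("shopid" -> "s001", "shopname" -> "中心店", "qty" -> 300),
      Map("shopid" -> "s003", "shopname" -> "龍崗店", "qty" -> 200)
    )
    // first call adds s004
    val added = upsert(locs, Map("shopid" -> "s004", "shopname" -> "寶安店", "qty" -> 23))
    println(added.map(_("shopid")))    // List(s004, s001, s003)
    // second call replaces the existing s004 instead of duplicating it
    val replaced = upsert(added, Map("shopid" -> "s004", "shopname" -> "寶安店", "qty" -> 50))
    println(replaced.map(_("shopid"))) // List(s004, s001, s003)
  }
}
```

The resulting list is what would be passed as the "location" value of the update request.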
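As mentioned earlier, the join type is still implemented within a single index. Suppose I want to record the purchase history of each fruit, which means adding a child document, purchase_history, under fruit. This purchase_history is defined in the same mapping: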
val fruitMapping = client.execute(
  putMapping("fruits").fields(
    KeywordField("code"),
    SearchAsYouTypeField("name")
      .fields(KeywordField("keyword")),
    floatField("price"),
    NestedField("location").fields(
      KeywordField("shopid"),
      textField("shopname"),
      longField("qty")),
    //purchase_history
    keywordField("supplier_code"),
    textField("supplier_name"),
    dateField("purchase_date")
      .ignoreMalformed(true)
      .format("strict_date_optional_time||epoch_millis"),
    joinField("purchase_history")
      .relation("fruit","purchase")
  )
).await
Here is an example of indexing the top-level parent documents:
val f1 = indexInto("fruits").id("f001").routing("f001")
  .fields(
    "code" -> "f001",
    "name" -> "東莞荔枝",
    "price" -> 11.5,
    "location" -> List(
      Map(
        "shopid" -> "s001",
        "shopname" -> "中心店",
        "qty" -> 500.0
      ),
      Map(
        "shopid" -> "s002",
        "shopname" -> "東門店",
        "qty" -> 0.0
      )
    ),
    "purchase_history" -> "fruit"
  )
val f2 = indexInto("fruits").id("f002").routing("f002")
  .fields(
    "code" -> "f002",
    "name" -> "陝西富士蘋果",
    "price" -> 11.5,
    "location" -> List(
      Map(
        "shopid" -> "s001",
        "shopname" -> "中心店",
        "qty" -> 300.0
      ),
      Map(
        "shopid" -> "s003",
        "shopname" -> "龍崗店",
        "qty" -> 200.0
      )
    ),
    "purchase_history" -> "fruit"
  )
val f3 = indexInto("fruits").id("f003").routing("f003")
  .fields(
    "code" -> "f003",
    "name" -> "進口菲律賓香蕉",
    "price" -> 5.3,
    "location" -> List(
      Map(
        "shopid" -> "s001",
        "shopname" -> "中心店",
        "qty" -> 300.0
      ),
      Map(
        "shopid" -> "s003",
        "shopname" -> "龍崗店",
        "qty" -> 200.0
      ),
      Map(
        "shopid" -> "s002",
        "shopname" -> "東門店",
        "qty" -> 200.0
      )
    ),
    "purchase_history" -> "fruit"
  )
val newIndex = for {
  _ <- client.execute(f1)
  _ <- client.execute(f2)
  _ <- client.execute(f3)
} yield "Successfully added three records"
Indexing child documents with elastic4s looks like this:
val h1 = indexInto("fruits").id("h001").routing("f003")
  .fields(
    "supplier_code" -> "v001",
    "supplier_name" -> "百果園",
    "purchase_date" -> "2020-02-09",
    "purchase_history" -> Child("purchase", "f003"))
val h2 = indexInto("fruits").id("h002").routing("f002")
  .fields(
    "supplier_code" -> "v001",
    "supplier_name" -> "百果園",
    "purchase_date" -> "2019-10-11",
    "purchase_history" -> Child("purchase", "f002"))
val h3 = indexInto("fruits").id("h003").routing("f002")
  .fields(
    "supplier_code" -> "v002",
    "supplier_name" -> "華南城花果批發市場",
    "purchase_date" -> "2020-01-23",
    "purchase_history" -> Child("purchase", "f002"))
val childIndex = for {
  _ <- client.execute(h1)
  _ <- client.execute(h2)
  _ <- client.execute(h3)
} yield "Successfully added three child records"
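The child documents above use their parent's ID as the routing value because a join field requires a parent and its children to be stored on the same shard. The sketch below illustrates the idea in plain Scala under simplifying assumptions: the shard is taken to be a hash of the routing value modulo the number of primary shards, Scala's MurmurHash3 stands in for the hash Elasticsearch really uses, and RoutingSketch, shardFor and the shard count of 5 are all hypothetical.

```scala
import scala.util.hashing.MurmurHash3

object RoutingSketch {
  // Simplified stand-in for shard selection: hash(routing) mod numShards.
  // floorMod keeps the result in 0 until numShards even for negative hashes.
  def shardFor(routing: String, numShards: Int): Int =
    Math.floorMod(MurmurHash3.stringHash(routing), numShards)

  def main(args: Array[String]): Unit = {
    val numShards = 5 // assumed number of primary shards
    // (document id, routing value): parent f002 and its children h002, h003
    val docs = List("f002" -> "f002", "h002" -> "f002", "h003" -> "f002")
    val shards = docs.map { case (_, routing) => shardFor(routing, numShards) }
    // equal routing values always map to the same shard
    println(shards.distinct.size == 1) // true
  }
}
```

If a child were indexed without routing, its shard would be derived from its own ID (h002, h003) and could differ from the parent's shard.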
The fruits index now contains both kinds of embedded data, nested and join. Let's try the different ways of reading it. First, nested data can be read with nestedQuery:
val qNested = search("fruits").query(
  nestedQuery("location").query(
    matchQuery("location.shopname","中心")
  )
)
println(s"${qNested.show}")
val nestedResult = client.execute(qNested).await
if(nestedResult.isSuccess)
  nestedResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
else println(s"Error: ${nestedResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}}},Some(application/json))
HashMap(name -> 東莞荔枝, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 500.0), Map(shopid -> s002, shopname -> 東門店, qty -> 0.0)), price -> 11.5, purchase_history -> fruit, code -> f001)
HashMap(name -> 進口菲律賓香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龍崗店, qty -> 200.0), Map(shopid -> s002, shopname -> 東門店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
Child documents of a join field can be read with a parent ID query:
val qPid = search("fruits").query(
  ParentIdQuery("purchase","f002")
)
println(s"${qPid.show}")
val pidResult = client.execute(qPid).await
if(pidResult.isSuccess)
  pidResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
else println(s"Error: ${pidResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"parent_id":{"type":"purchase","id":"f002"}}},Some(application/json))
Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
Map(supplier_code -> v002, supplier_name -> 華南城花果批發市場, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))
Parent documents of a join field can be retrieved by searching their child documents with hasChild:
val qHaschild = search("fruits").query(
  hasChildQuery("purchase",
    matchQuery("supplier_name","百果")
  )
)
println(s"${qHaschild.show}")
val haschildResult = client.execute(qHaschild).await
if(haschildResult.isSuccess)
  haschildResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
else println(s"Error: ${haschildResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"has_child":{"type":"purchase","score_mode":"none","query":{"match":{"supplier_name":{"query":"百果"}}}}}},Some(application/json))
HashMap(name -> 進口菲律賓香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龍崗店, qty -> 200.0), Map(shopid -> s002, shopname -> 東門店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
Child documents of a join field can also be retrieved by searching their parent documents:
val qHasparent = search("fruits").query(
  hasParentQuery("fruit",
    nestedQuery("location").query(
      matchQuery("location.shopname","中心")
    ),false
  )
)
println(s"${qHasparent.show}")
val hasparentResult = client.execute(qHasparent).await
if(hasparentResult.isSuccess)
  hasparentResult.result.hits.hits.foreach(m => println(s"${m.sourceAsMap}"))
else println(s"Error: ${hasparentResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"has_parent":{"parent_type":"fruit","query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}}}}},Some(application/json))
Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
Map(supplier_code -> v002, supplier_name -> 華南城花果批發市場, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))
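This example is slightly more complex: we want all child documents whose parent documents contain an embedded nested document with location.shopname matching "中心".
These examples mainly show how to get one side of a parent-child relationship by searching the other side, for instance getting parent documents by searching child documents, or getting child documents by searching parent documents. In other words, the documents being searched and the documents being returned (parent and child, or child and parent) are not the same kind of document. With inner_hits we can also retrieve the documents that matched the search condition. For example, nestedQuery.inner():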
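The direction of each join query is easy to mix up, so here is a small in-memory model in plain Scala. It is only a sketch of the semantics, not elastic4s API: JoinSemantics, Fruit, Purchase, parentId, hasChild and hasParent are hypothetical names, the data mirrors the documents indexed above, and String.contains is a rough stand-in for an analyzed match query, which can match more loosely.

```scala
// In-memory model of the three join queries; not elastic4s API.
object JoinSemantics {
  final case class Fruit(id: String, shops: List[String])
  final case class Purchase(id: String, parent: String, supplier: String)

  val fruits: List[Fruit] = List(
    Fruit("f001", List("中心店", "東門店")),
    Fruit("f002", List("寶安店", "龍崗店", "中心店")),
    Fruit("f003", List("中心店", "龍崗店", "東門店"))
  )
  val purchases: List[Purchase] = List(
    Purchase("h001", "f003", "百果園"),
    Purchase("h002", "f002", "百果園"),
    Purchase("h003", "f002", "華南城花果批發市場")
  )

  // parent_id: the children of one given parent.
  def parentId(id: String): List[Purchase] =
    purchases.filter(_.parent == id)

  // has_child: parents that have at least one child satisfying p.
  def hasChild(p: Purchase => Boolean): List[Fruit] =
    fruits.filter(f => purchases.exists(c => c.parent == f.id && p(c)))

  // has_parent: children whose parent satisfies p.
  def hasParent(p: Fruit => Boolean): List[Purchase] =
    purchases.filter(c => fruits.exists(f => f.id == c.parent && p(f)))

  def main(args: Array[String]): Unit = {
    println(parentId("f002").map(_.id))                             // List(h002, h003)
    println(hasChild(_.supplier.contains("百果")).map(_.id))         // List(f002, f003)
    println(hasParent(_.shops.exists(_.contains("中心"))).map(_.id)) // List(h001, h002, h003)
  }
}
```

In each case the predicate is evaluated on one side of the relationship and the result comes from the other side, which is the point the examples above make.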
val qNested = search("fruits").query(
  nestedQuery("location").query(
    matchQuery("location.shopname","中心")
  ).inner(InnerHit("locations"))
)
println(s"${qNested.show}")
val nestedResult = client.execute(qNested).await
if(nestedResult.isSuccess) {
  nestedResult.result.hits.hits.foreach{ m =>
    println(s"${m.sourceAsMap}")
    m.innerHits.foreach { i =>
      val n = i._1
      i._2.hits.foreach(h => println(s"$n, ${h.source}"))
    }
  }
} else println(s"Error: ${nestedResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}},"inner_hits":{"name":"locations"}}}},Some(application/json))
HashMap(name -> 東莞荔枝, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 500.0), Map(shopid -> s002, shopname -> 東門店, qty -> 0.0)), price -> 11.5, purchase_history -> fruit, code -> f001)
locations, Map(shopid -> s001, shopname -> 中心店, qty -> 500.0)
HashMap(name -> 進口菲律賓香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龍崗店, qty -> 200.0), Map(shopid -> s002, shopname -> 東門店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
locations, Map(shopid -> s001, shopname -> 中心店, qty -> 300.0)
HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
locations, Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)
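The output above pairs every matching document with only those nested entries that satisfied the query. As a rough illustration of that pairing, the following plain-Scala sketch models it in memory. InnerHitsSketch, Fruit, Loc and nestedWithInnerHits are hypothetical names, document f009 is made up for contrast, and substring matching again stands in for the match query.

```scala
// In-memory model of a nested query with inner hits; not elastic4s API.
object InnerHitsSketch {
  final case class Loc(shopid: String, shopname: String)
  final case class Fruit(code: String, location: List[Loc])

  // Pair each document with its matching nested entries and
  // keep only the documents that have at least one match.
  def nestedWithInnerHits(docs: List[Fruit])(p: Loc => Boolean): List[(Fruit, List[Loc])] =
    docs.map(d => d -> d.location.filter(p)).filter(_._2.nonEmpty)

  def main(args: Array[String]): Unit = {
    val docs = List(
      Fruit("f001", List(Loc("s001", "中心店"), Loc("s002", "東門店"))),
      Fruit("f009", List(Loc("s002", "東門店"))) // hypothetical non-matching document
    )
    val result = nestedWithInnerHits(docs)(_.shopname.contains("中心"))
    println(result.map { case (d, hits) => d.code -> hits.map(_.shopid) })
    // List((f001,List(s001)))
  }
}
```

The first element of each pair corresponds to the hit itself, the second to the entries listed under the inner hit name.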
hasChildQuery.innerHit():
val qHaschild = search("fruits").query(
  hasChildQuery("purchase",
    matchQuery("supplier_name","百果")
  ).innerHit("purchases")
)
println(s"${qHaschild.show}")
val haschildResult = client.execute(qHaschild).await
if(haschildResult.isSuccess) {
  haschildResult.result.hits.hits.foreach{m =>
    println(s"${m.sourceAsMap}")
    m.innerHits.foreach { i =>
      val n = i._1
      i._2.hits.foreach(h => println(s"$n, ${h.source}"))
    }
  }
} else println(s"Error: ${haschildResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"has_child":{"type":"purchase","score_mode":"none","query":{"match":{"supplier_name":{"query":"百果"}}},"inner_hits":{"name":"purchases"}}}},Some(application/json))
HashMap(name -> 進口菲律賓香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龍崗店, qty -> 200.0), Map(shopid -> s002, shopname -> 東門店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
purchases, Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
purchases, Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
purchases, Map(supplier_code -> v002, supplier_name -> 華南城花果批發市場, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))
hasParentQuery.innerHit():
val qHasparent = search("fruits").query(
  hasParentQuery("fruit",
    nestedQuery("location").query(
      matchQuery("location.shopname","中心")
    ),false
  ).innerHit(InnerHit("fruits"))
)
println(s"${qHasparent.show}")
val hasparentResult = client.execute(qHasparent).await
if(hasparentResult.isSuccess) {
  hasparentResult.result.hits.hits.foreach{m =>
    println(s"${m.sourceAsMap}")
    m.innerHits.foreach { i =>
      val n = i._1
      i._2.hits.foreach(h => println(s"$n, ${h.source}"))
    }
  }
} else println(s"Error: ${hasparentResult.error.causedBy.getOrElse("unknown")}")
...
POST:/fruits/_search?
StringEntity({"query":{"has_parent":{"parent_type":"fruit","query":{"nested":{"path":"location","query":{"match":{"location.shopname":{"query":"中心"}}}}},"inner_hits":{"name":"fruits"}}}},Some(application/json))
Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2020-02-09, purchase_history -> Map(name -> purchase, parent -> f003))
fruits, HashMap(name -> 進口菲律賓香蕉, location -> List(Map(shopid -> s001, shopname -> 中心店, qty -> 300.0), Map(shopid -> s003, shopname -> 龍崗店, qty -> 200.0), Map(shopid -> s002, shopname -> 東門店, qty -> 200.0)), price -> 5.3, purchase_history -> fruit, code -> f003)
Map(supplier_code -> v001, supplier_name -> 百果園, purchase_date -> 2019-10-11, purchase_history -> Map(name -> purchase, parent -> f002))
fruits, HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)
Map(supplier_code -> v002, supplier_name -> 華南城花果批發市場, purchase_date -> 2020-01-23, purchase_history -> Map(name -> purchase, parent -> f002))
fruits, HashMap(name -> 陝西富士蘋果, location -> List(Map(shopname -> 寶安店, qty -> 23, shopid -> s004), Map(shopname -> 龍崗店, qty -> 200.0, shopid -> s003), Map(shopname -> 中心店, qty -> 300.0, shopid -> s001)), price -> 11.5, purchase_history -> fruit, code -> f002)