A Guide to Using Elasticsearch from the Java API

Posted by 爬蜥 on 2018-07-16

AggregationBuilders.terms: the top few most frequent values of a field within a time range

/**
 * @param startTime   start of the time range
 * @param endTime     end of the time range
 * @param termAggName name given to the terms aggregation
 * @param fieldName   field whose values are counted
 * @param top         number of buckets to return
 */
RangeQueryBuilder actionPeriod = QueryBuilders.rangeQuery("myTimeField")
        .gte(startTime).lte(endTime).format("epoch_second");
TermsBuilder termsBuilder = AggregationBuilders.terms(termAggName)
        .field(fieldName).size(top).order(Terms.Order.count(false));
return client.prepareSearch(INDICE).setQuery(actionPeriod)
        .addAggregation(termsBuilder).setSize(0).execute().actionGet();

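order(Terms.Order.count(false)): sort the buckets by document count in descending order

size(top): top is the number of buckets to keep after sorting

prepareSearch(INDICE): INDICE is the name of the index

setSize(0): return only the aggregation results, without any hits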
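
To make it concrete what the terms aggregation returns, here is a minimal in-memory sketch of the same computation in plain Java: count each value, sort by count in descending order, keep the first top entries. The class name TopValues and the sample data are made up for illustration, and breaking ties by key is an assumption added only to keep the output deterministic. It shows the result shape, not how Elasticsearch computes it internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopValues {
    /** Returns the `top` most frequent values with their counts, most frequent first. */
    public static List<Map.Entry<String, Long>> topValues(List<String> values, int top) {
        Map<String, Long> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());
        // Descending by count (the role of Terms.Order.count(false));
        // ties are broken by key, which is an assumption of this sketch.
        buckets.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Keep only the first `top` buckets (the role of size(top)).
        return new ArrayList<>(buckets.subList(0, Math.min(top, buckets.size())));
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("click", "view", "click", "buy", "view", "click");
        for (Map.Entry<String, Long> bucket : topValues(actions, 2)) {
            System.out.println(bucket.getKey() + " -> " + bucket.getValue());
        }
    }
}
```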

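To exclude a particular value of a field, add a mustNot clause (client, used throughout these snippets, is the Elasticsearch client built beforehand):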

BoolQueryBuilder actionPeriodMustNot = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .mustNot(QueryBuilders.termQuery(field, value));

To match several specific values of a single field:

// values is a List
BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .must(QueryBuilders.termsQuery(field, values));

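Reading the result

// sr is the SearchResponse returned by the search
Terms clickCount = sr.getAggregations().get(termAggName);
for (Terms.Bucket term : clickCount.getBuckets()) {
    int key = term.getKeyAsNumber().intValue();  // value of the field being ranked
    long docCount = term.getDocCount();          // number of documents with that value
}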

date_histogram: within a time range, bucket a time field by a fixed time interval

BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"));
DateHistogramBuilder actionInterval = AggregationBuilders.dateHistogram(dateNickName)
        .field("myTimeField").timeZone("Asia/Shanghai");
// timeInterval is in seconds; MINUTE, HOUR, DAY and THIRTY_DAY are presumably the matching second counts
if (timeInterval < MINUTE) {
    actionInterval.interval(DateHistogramInterval.seconds(timeInterval)).format("HH:mm:ss");
} else if (timeInterval < HOUR) {
    actionInterval.interval(DateHistogramInterval.minutes(timeInterval / MINUTE)).format("dd HH:mm");
} else if (timeInterval < DAY) {
    actionInterval.interval(DateHistogramInterval.hours(timeInterval / HOUR)).format("HH:mm");
} else if (timeInterval < THIRTY_DAY) {
    actionInterval.interval(DateHistogramInterval.days(timeInterval / DAY));
} else {
    actionInterval.interval(DateHistogramInterval.MONTH);
}
// Note: this final call replaces whatever format was chosen in the branches above
actionInterval.format("yyyy-MM-dd HH:mm:ssZ");
return client.prepareSearch(INDICE).setQuery(actionPeriodMust)
        .addAggregation(actionInterval).setSize(0).execute().actionGet();
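
The branching that picks the bucket width can be tried out on its own. The sketch below reproduces it in plain Java and returns the interval as a string ("30s", "5m", "2h", "3d", "1M"), which is how such intervals are usually written in Elasticsearch requests. The class name IntervalChooser and the constant values are assumptions, since the article does not show how MINUTE, HOUR, DAY and THIRTY_DAY are defined.

```java
public class IntervalChooser {
    // Assumed values in seconds; the original constants are not shown in the article.
    static final int MINUTE = 60;
    static final int HOUR = 60 * MINUTE;
    static final int DAY = 24 * HOUR;
    static final int THIRTY_DAY = 30 * DAY;

    /** Maps a bucket width in seconds to an interval expression such as "30s" or "5m". */
    public static String chooseInterval(int timeIntervalSeconds) {
        if (timeIntervalSeconds < MINUTE) {
            return timeIntervalSeconds + "s";
        } else if (timeIntervalSeconds < HOUR) {
            // Integer division drops leftover seconds: 90 becomes 1 minute.
            return (timeIntervalSeconds / MINUTE) + "m";
        } else if (timeIntervalSeconds < DAY) {
            return (timeIntervalSeconds / HOUR) + "h";
        } else if (timeIntervalSeconds < THIRTY_DAY) {
            return (timeIntervalSeconds / DAY) + "d";
        } else {
            // Anything from thirty days up falls back to one calendar month.
            return "1M";
        }
    }

    public static void main(String[] args) {
        int[] samples = {30, 300, 7200, 3 * DAY, 40 * DAY};
        for (int s : samples) {
            System.out.println(s + " seconds -> " + chooseInterval(s));
        }
    }
}
```

Because of the integer division, a width that is not a whole multiple of the unit is rounded down, so it is safer to pass widths such as 300 or 7200 than 90 or 5000.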

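Elasticsearch works with timestamps in UTC by default, so for use in China the aggregation needs timeZone("Asia/Shanghai").

Java's SimpleDateFormat uses the default time zone of the JVM it runs in, so it is best to store timestamps that do not depend on a time zone and to localize them only for display.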
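
As a small illustration of "store zone-independent, localize on display", the sketch below keeps an epoch-second value and formats it with an explicit zone, so the output does not depend on where the JVM runs. It uses java.time instead of SimpleDateFormat, which is a substitution on my part; the class name EpochDisplay and the sample timestamp are made up for the example.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EpochDisplay {
    private static final DateTimeFormatter PATTERN =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Formats a zone-independent epoch-second value for display in the given zone. */
    public static String format(long epochSecond, ZoneId zone) {
        // The zone is passed explicitly, so the JVM default zone is never consulted.
        return PATTERN.withZone(zone).format(Instant.ofEpochSecond(epochSecond));
    }

    public static void main(String[] args) {
        long stored = 1531699200L; // 2018-07-16T00:00:00Z, stored without any zone
        System.out.println("UTC:      " + format(stored, ZoneOffset.UTC));
        System.out.println("Shanghai: " + format(stored, ZoneId.of("Asia/Shanghai")));
    }
}
```

The same stored value is shown eight hours later for Asia/Shanghai than for UTC, which is the shift the timeZone setting applies to the histogram buckets.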

Reading the result

Histogram histogram = sr.getAggregations().get(dateNickName);
for (Histogram.Bucket entry : histogram.getBuckets()) {
    String key = entry.getKeyAsString();  // the time interval
    long count = entry.getDocCount();     // number of documents in that interval
}

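subAggregation: within a time range, split by a fixed time interval, and count each value of a field inside every interval

This is the combination of the two scenarios above.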

BoolQueryBuilder query = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .must(QueryBuilders.termsQuery("action", orderValue));
DateHistogramBuilder actionTimeInterval = AggregationBuilders.dateHistogram(dateNickName)
        .field("myTimeField").timeZone("Asia/Shanghai");
// the interval still has to be set on actionTimeInterval, as in the date_histogram example
actionTimeInterval.subAggregation(AggregationBuilders.terms(termNickName).field("action").size(size));
return client.prepareSearch(INDICE).setQuery(query)
        .addAggregation(actionTimeInterval).setSize(0).execute().actionGet();

Reading the result

// the aggregation names must be the ones used when building the request
Histogram histogram = sr.getAggregations().get(dateNickName);
for (Histogram.Bucket date : histogram.getBuckets()) {
    String intervalName = date.getKeyAsString();
    long timeIntervalCount = date.getDocCount();
    if (timeIntervalCount != 0) {
        // the terms sub-aggregation lives inside each date bucket
        Terms terms = date.getAggregations().get(termNickName);
        for (Terms.Bucket entry : terms.getBuckets()) {
            int key = entry.getKeyAsNumber().intValue();
            long childCount = entry.getDocCount();
        }
    }
}

Fetching data page by page

BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery(key, value))
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"));
return client.prepareSearch(INDICE).setQuery(actionPeriodMust)
        .addSort(SortBuilders.fieldSort("myTimeField").order(SortOrder.ASC))
        .setFrom(from).setSize(size).execute().actionGet();
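
The snippet takes from and size as given. If the caller thinks in page numbers instead, from has to be derived first. A minimal sketch, assuming 1-based page numbers (the class and method names are hypothetical and not part of the Elasticsearch API):

```java
public class PageOffset {
    /** Converts a 1-based page number into the zero-based offset passed to setFrom. */
    public static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be positive");
        }
        // multiplyExact fails loudly instead of silently overflowing on huge page numbers.
        return Math.multiplyExact(page - 1, size);
    }

    public static void main(String[] args) {
        int size = 20;
        for (int page = 1; page <= 3; page++) {
            System.out.println("page " + page + " -> from=" + fromOffset(page, size) + ", size=" + size);
        }
    }
}
```

Keep in mind that from/size paging is meant for the first pages of a result; Elasticsearch caps from + size through the index.max_result_window setting, so very deep pages need a different approach such as scrolling.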

Usage

Iterator<SearchHit> iterator = sr.getHits().iterator();
while (iterator.hasNext()) {
    SearchHit next = iterator.next();
    // each hit's _source is a JSON string; JSONObject here is presumably fastjson's
    JSONObject jo = JSONObject.parseObject(next.getSourceAsString());
}

AggregationBuilders.cardinality: the number of distinct values of a field

BoolQueryBuilder query = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField")
                .gte(startTimeInSec * 1000).lte(endTimeInSec * 1000).format("epoch_millis"));
// field: the field whose distinct values are counted
CardinalityBuilder fieldCardinality = AggregationBuilders.cardinality(cardinalityAggName).field(field);
return client.prepareSearch(INDICE).setQuery(query).addAggregation(fieldCardinality).execute().actionGet();

Reading the result

Cardinality cardinality = sr.getAggregations().get(cardinalityAggName);
long value = cardinality.getValue();  // distinct count; the cardinality aggregation is approximate

bool queries

For example, to find documents whose addr is beijing and which must also satisfy this condition: name is paxi, or phoneNumber is 1234567890

BoolQueryBuilder searchIdQuery = QueryBuilders.boolQuery();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// kvs iterates over the field/value pairs that are OR-ed together (name, phoneNumber)
while (kvs.hasNext()) {
    Map.Entry<String, String> fieldValue = kvs.next();
    String field = fieldValue.getKey();
    String value = fieldValue.getValue();
    searchIdQuery.should(QueryBuilders.termQuery(field, value));
}
// the OR group as a whole is required
boolQueryBuilder.must(searchIdQuery);
// the other required condition, e.g. key is "addr" and values contains "beijing"
boolQueryBuilder.must(QueryBuilders.termsQuery(key, values));
return client.prepareSearch(INDICE).setQuery(boolQueryBuilder).execute().actionGet();
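
The nesting is what makes this work: a bool query that contains only should clauses matches when at least one of them matches, and wrapping that group in must makes the whole OR group mandatory. The same logic can be mirrored with plain Java predicates, which is handy for checking the intended semantics against sample documents. This is only an illustration with made-up names (BoolLogic, term, matches); documents are modelled as simple maps and nothing here talks to Elasticsearch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

public class BoolLogic {
    /** True when the document's field equals the given value (the role of a term query). */
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** addr is beijing AND (name is paxi OR phoneNumber is 1234567890). */
    public static boolean matches(Map<String, String> doc) {
        // The should group: at least one of the alternatives has to hold.
        Predicate<Map<String, String>> should =
                term("name", "paxi").or(term("phoneNumber", "1234567890"));
        // The outer must: both the addr condition and the OR group are required.
        Predicate<Map<String, String>> must = term("addr", "beijing").and(should);
        return must.test(doc);
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("addr", "beijing");
        doc.put("name", "someone");
        doc.put("phoneNumber", "1234567890");
        System.out.println(matches(doc));
    }
}
```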

Source: https://juejin.im/post/5b4c8559e51d45190a43089c
