Using the ElasticSearch Java API

Published by 社會我浩哥 on 2019-03-02

Thanks to 全科's ElasticSearch tutorial, from which most of this material is drawn.

ElasticSearch

MySQL vs. ElasticSearch

MySQL        ElasticSearch
Database     Index
Table        Type
Row          Document
Column       Field
Schema       Mapping
Index        Everything indexed by default
SQL          Query DSL

Document APIs

Index API

The Index API lets you store a JSON document so that the data becomes searchable. A document is uniquely identified by its index, type, and id. You can supply your own id, or let the Index API generate one for you.

There are four different ways to produce a JSON document:

  • Manually, using a raw byte[] or String
  • Using a Map, which is automatically converted to its JSON equivalent
  • Using a third-party library to serialize beans, such as Jackson or FastJSON
  • Using the built-in helper XContentFactory.jsonBuilder()

Manual approach

    /**
     * Manual approach
     * @throws UnknownHostException
     */
    @Test
    public void JsonDocument() throws UnknownHostException {
        String json = "{" +
                "\"user\":\"deepredapple\"," +
                "\"postDate\":\"2018-01-30\"," +
                "\"message\":\"trying out Elasticsearch\"" +
                "}";
        IndexResponse indexResponse = client.prepareIndex("fendo", "fendodate").setSource(json).get();
        System.out.println(indexResponse.getResult());
    }

Map approach

    /**
     * Map approach
     */
    @Test
    public void MapDocument() {
        Map<String, Object> json = new HashMap<String, Object>();
        json.put("user", "hhh");
        json.put("postDate", "2018-06-28");
        json.put("message", "trying out Elasticsearch");
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(json).get();
        System.out.println(response.getResult());
    }

Serialization approach

    /**
     * Serialize with Jackson
     */
    @Test
    public void JACKSONDocument() throws JsonProcessingException {
        Blog blog = new Blog();
        blog.setUser("123");
        blog.setPostDate("2018-06-29");
        blog.setMessage("try out ElasticSearch");

        ObjectMapper mapper = new ObjectMapper();
        byte[] bytes = mapper.writeValueAsBytes(blog);
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(bytes).get();
        System.out.println(response.getResult());
    }

XContentBuilder helper approach

    /**
     * XContentBuilder helper approach
     */
    @Test
    public void XContentBuilderDocument() throws IOException {
        XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
                .field("user", "xcontentdocument")
                .field("postDate", "2018-06-30")
                .field("message", "this is ElasticSearch").endObject();
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(builder).get();
        System.out.println(response.getResult());
    }

Complete example

package com.deepredapple.es.document;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

/**
 * @author DeepRedApple
 */
public class TestClient {

    TransportClient client = null;

    public static final String INDEX = "fendo";

    public static final String TYPE = "fendodate";

    @Before
    public void beforeClient() throws UnknownHostException {
        client = new PreBuiltTransportClient(Settings.EMPTY)
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
    }

    /**
     * Manual approach
     * @throws UnknownHostException
     */
    @Test
    public void JsonDocument() throws UnknownHostException {
        String json = "{" +
                ""user":"deepredapple"," +
                ""postDate":"2018-01-30"," +
                ""message":"trying out Elasticsearch"" +
                "}";
        IndexResponse indexResponse = client.prepareIndex(INDEX, TYPE).setSource(json).get();
        System.out.println(indexResponse.getResult());
    }

    /**
     * Map approach
     */
    @Test
    public void MapDocument() {
        Map<String, Object> json = new HashMap<String, Object>();
        json.put("user", "hhh");
        json.put("postDate", "2018-06-28");
        json.put("message", "trying out Elasticsearch");
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(json).get();
        System.out.println(response.getResult());
    }

    /**
     * Serialize with Jackson
     */
    @Test
    public void JACKSONDocument() throws JsonProcessingException {
        Blog blog = new Blog();
        blog.setUser("123");
        blog.setPostDate("2018-06-29");
        blog.setMessage("try out ElasticSearch");

        ObjectMapper mapper = new ObjectMapper();
        byte[] bytes = mapper.writeValueAsBytes(blog);
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(bytes).get();
        System.out.println(response.getResult());
    }

    /**
     * XContentBuilder helper approach
     */
    @Test
    public void XContentBuilderDocument() throws IOException {
        XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
                .field("user", "xcontentdocument")
                .field("postDate", "2018-06-30")
                .field("message", "this is ElasticSearch").endObject();
        IndexResponse response = client.prepareIndex(INDEX, TYPE).setSource(builder).get();
        System.out.println(response.getResult());
    }

}

Get API

The Get API retrieves a document by id:

GetResponse getResponse = client.prepareGet(INDEX, TYPE, "AWRJCXMhro3r8sDxIpir").get();

The parameters are the index, the type, and the _id.

Operation threading

Setting setOperationThreaded to true runs the operation on a different thread; the example below sets it to false, so the get runs on the calling thread.

    /**
     * Get API
     */
    @Test
    public void testGetApi() {
        GetResponse getResponse = client.prepareGet(INDEX, TYPE, "AWRJCXMhro3r8sDxIpir").setOperationThreaded(false).get();
        Map<String, Object> map = getResponse.getSource();
        Set<String> keySet = map.keySet();
        for (String str : keySet) {
            Object o = map.get(str);
            System.out.println(o.toString());
        }
    }

Delete API

Delete by id:

DeleteResponse deleteResponse = client.prepareDelete(INDEX, TYPE, "AWRJCXMhro3r8sDxIpir").get();

The parameters are the index, the type, and the _id.

Operation threading

Setting setOperationThreaded to true runs the operation on a different thread.

    /**
     * deleteAPI
     */
    @Test
    public void testDeleteAPI() {
        GetResponse getResponse = client.prepareGet(INDEX, TYPE, "AWRJCXMhro3r8sDxIpir").setOperationThreaded(false).get();
        System.out.println(getResponse.getSource());
        DeleteResponse deleteResponse = client.prepareDelete(INDEX, TYPE, "AWRJCXMhro3r8sDxIpir").get();
        System.out.println(deleteResponse.getResult());
    }

Delete By Query API

Delete documents matching a query:

    /**
     * Delete by query
     */
    @Test
    public void deleteByQuery() {
        BulkByScrollResponse response = DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
                .filter(QueryBuilders.matchQuery("user", "hhh")) // query condition
                .source(INDEX).get(); // index name
        long deleted = response.getDeleted(); // number of documents deleted
        System.out.println(deleted);
    }

Parameters: QueryBuilders.matchQuery("user", "hhh") takes the field and the value to match; source(INDEX) takes the index name.

Asynchronous callback

When a delete takes a long time to run, you can execute it asynchronously with a callback and read the result inside the callback.

    /**
     * Delete with a callback; suited to deleting large volumes of data
     */
    @Test
    public void DeleteByQueryAsync() {
        for (int i = 1300; i < 3000; i++) {
            DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
                .filter(QueryBuilders.matchQuery("user", "hhh " + i))
                .source(INDEX)
                .execute(new ActionListener<BulkByScrollResponse>() {
                    public void onResponse(BulkByScrollResponse response) {
                        long deleted = response.getDeleted();
                        System.out.println("documents deleted = " + deleted);
                    }

                    public void onFailure(Exception e) {
                        System.out.println("Failure");
                    }
                });
        }
    }

Even after the program exits, the ElasticSearch console shows the deletes still running: the operation executes asynchronously.

The callback is registered through the execute method:

    .execute(new ActionListener<BulkByScrollResponse>() { // the callback
      public void onResponse(BulkByScrollResponse response) {
        long deleted = response.getDeleted();
        System.out.println("documents deleted = " + deleted);
      }

      public void onFailure(Exception e) {
        System.out.println("Failure");
      }
    });

Update API

Updating an index

There are two main ways to perform an update:

  • Create an UpdateRequest and send it with the client
  • Use the prepareUpdate() method

Using UpdateRequest

    /**
     * Update with an UpdateRequest
     */
    @Test
    public void testUpdateAPI() throws IOException, ExecutionException, InterruptedException {
        UpdateRequest updateRequest = new UpdateRequest();
        updateRequest.index(INDEX);
        updateRequest.type(TYPE);
        updateRequest.id("AWRFv-yAro3r8sDxIpib");
        updateRequest.doc(jsonBuilder()
                .startObject()
                    .field("user", "hhh")
                .endObject());
        client.update(updateRequest).get();
    }

Using prepareUpdate()

    /**
     * Using prepareUpdate
     */
    @Test
    public void testUpdatePrepareUpdate() throws IOException {
        client.prepareUpdate(INDEX, TYPE, "AWRFvA7k0udstXU4tl60")
                .setScript(new Script("ctx._source.user = \"DeepRedApple\"")).get();
        client.prepareUpdate(INDEX, TYPE, "AWRFvA7k0udstXU4tl60")
                .setDoc(jsonBuilder()
                .startObject()
                    .field("user", "DeepRedApple")
                .endObject()).get();
    }

The parameters of setScript on client.prepareUpdate differ between client versions. Here the script value is passed inline, but you can also run a script stored in a file and update with the data computed by that script.

Update By Script

Update a document with a script:

    /**
     * Update via a script
     */
    @Test
    public void testUpdateByScript() throws ExecutionException, InterruptedException {
        UpdateRequest updateRequest = new UpdateRequest(INDEX, TYPE, "AWRFvLSTro3r8sDxIpia")
                .script(new Script("ctx._source.user = \"LZH\""));
        client.update(updateRequest).get();
    }

Upsert

Updates the document if it exists; otherwise inserts it:

    /**
     * Upsert: update the document if it exists, otherwise insert it
     */
    @Test
    public void testUpsert() throws IOException, ExecutionException, InterruptedException {
        IndexRequest indexRequest = new IndexRequest(INDEX, TYPE, "AWRFvLSTro3r8sDxIp12")
                .source(jsonBuilder()
                    .startObject()
                        .field("user", "hhh")
                        .field("postDate", "2018-02-14")
                        .field("message", "ElasticSearch")
                    .endObject());
        UpdateRequest updateRequest = new UpdateRequest(INDEX, TYPE, "AWRFvLSTro3r8sDxIp12")
                .doc(jsonBuilder()
                    .startObject()
                        .field("user", "LZH")
                    .endObject())
                .upsert(indexRequest); // if the document does not exist, index indexRequest instead
        client.update(updateRequest).get();
    }

If the given _id exists (i.e., index/type/_id exists), the UpdateRequest is executed; if it does not exist, the document from the IndexRequest is inserted directly.

Multi Get API

Retrieves multiple documents in a single request.

    /**
     * Fetch several documents at once
     */
    @Test
    public void TestMultiGetApi() {
        MultiGetResponse responses = client.prepareMultiGet()
                .add(INDEX, TYPE, "AWRFv-yAro3r8sDxIpib") // by a single id
                .add(INDEX, TYPE, "AWRFvA7k0udstXU4tl60", "AWRJA72Uro3r8sDxIpip") // by multiple ids
                .add("blog", "blog", "AWG9GKCwhg1e21lmGSLH") // from another index
                .get();
        for (MultiGetItemResponse itemResponse : responses) {
            GetResponse response = itemResponse.getResponse();
            if (response.isExists()) {
                String source = response.getSourceAsString(); // _source
                JSONObject jsonObject = JSON.parseObject(source);
                Set<String> sets = jsonObject.keySet();
                for (String str : sets) {
                    System.out.println("key -> " + str);
                    System.out.println("value -> " + jsonObject.get(str));
                    System.out.println("===============");
                }
            }
        }
    }

Bulk API

The Bulk API supports indexing multiple documents in a single request:

    /**
     * Bulk insert
     */
    @Test
    public void testBulkApi() throws IOException {
        BulkRequestBuilder requestBuilder = client.prepareBulk();
        requestBuilder.add(client.prepareIndex(INDEX, TYPE, "1")
            .setSource(jsonBuilder()
                .startObject()
                    .field("user", "張三")
                    .field("postDate", "2018-05-01")
                    .field("message", "zhangSan message")
                .endObject()));
        requestBuilder.add(client.prepareIndex(INDEX, TYPE, "2")
            .setSource(jsonBuilder()
                .startObject()
                    .field("user", "李四")
                    .field("postDate", "2016-09-10")
                    .field("message", "Lisi message")
                .endObject()));
        BulkResponse bulkResponse = requestBuilder.get();
        if (bulkResponse.hasFailures()) {
            System.out.println("error");
        }
    }

Using Bulk Processor

The BulkProcessor class offers a simple interface that automatically flushes bulk operations based on the number or size of queued requests, or after a given interval.

Creating the BulkProcessor instance

First, create a BulkProcessor:

    /**
     * Create a BulkProcessor instance
     */
    @Test
    public void testCreateBulkProcessor() {
        BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
            // called before each bulk executes; request.numberOfActions() gives the number of actions
            public void beforeBulk(long l, BulkRequest request) {

            }

            // called after each bulk executes; response.hasFailures() tells whether any action failed
            public void afterBulk(long l, BulkRequest request, BulkResponse response) {

            }

            // called when a bulk fails with a Throwable
            public void afterBulk(long l, BulkRequest bulkRequest, Throwable throwable) {

            }
        }).setBulkActions(10000) // flush the bulk every 10000 requests
          .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB)) // flush the bulk every 5MB
          .setFlushInterval(TimeValue.timeValueSeconds(5)) // flush every 5 seconds regardless of the request count
          .setConcurrentRequests(1) // number of concurrent requests: 0 means only a single request may execute, 1 allows one concurrent request
          .setBackoffPolicy(
                  BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3))
                // custom retry policy: wait 100ms at first, doubling each time, up to 3 retries.
                // Retries happen when one or more bulk actions fail with an EsRejectedExecutionException
                // (insufficient compute resources); disable retries with BackoffPolicy.noBackoff()
          .build();
    }

BulkProcessor defaults

  • bulkActions 1000
  • bulkSize 5mb
  • no flushInterval set
  • concurrentRequests 1, executed asynchronously
  • backoffPolicy: 8 retries with a 50ms wait
    /**
     * Create a BulkProcessor instance and add requests to it
     */
    @Test
    public void testBulkProcessorAddRequests() throws IOException {
        BulkProcessor bulkProcessor = BulkProcessor.builder(client, new BulkProcessor.Listener() {
            // called before each bulk executes; request.numberOfActions() gives the number of actions
            public void beforeBulk(long l, BulkRequest request) {

            }

            // called after each bulk executes; response.hasFailures() tells whether any action failed
            public void afterBulk(long l, BulkRequest request, BulkResponse response) {

            }

            // called when a bulk fails with a Throwable
            public void afterBulk(long l, BulkRequest bulkRequest, Throwable throwable) {

            }
        }).setBulkActions(10000) // flush the bulk every 10000 requests
          .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB)) // flush the bulk every 5MB
          .setFlushInterval(TimeValue.timeValueSeconds(5)) // flush every 5 seconds regardless of the request count
          .setConcurrentRequests(1) // number of concurrent requests: 0 means only a single request may execute, 1 allows one concurrent request
          .setBackoffPolicy(
                  BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3))
                // custom retry policy: wait 100ms at first, doubling each time, up to 3 retries.
                // Retries happen when one or more bulk actions fail with an EsRejectedExecutionException
                // (insufficient compute resources); disable retries with BackoffPolicy.noBackoff()
          .build();

        // add requests
        bulkProcessor.add(new IndexRequest(INDEX, TYPE, "3").source(
                jsonBuilder()
                        .startObject()
                            .field("user", "王五")
                            .field("postDate", "2019-10-05")
                            .field("message", "wangwu message")
                        .endObject()));
        bulkProcessor.add(new DeleteRequest(INDEX, TYPE, "1"));
        bulkProcessor.flush();
        // close the bulkProcessor
        bulkProcessor.close();
        client.admin().indices().prepareRefresh().get();
        client.prepareSearch().get();
    }

Search API

The Search API executes a search query and returns the matching results. It can search one or more index/type pairs, and the Query DSL, via the Java Query API, supplies the query condition.

The Java client also exposes the QUERY_AND_FETCH and DFS_QUERY_AND_FETCH search types, but these modes should be chosen by the system rather than specified explicitly by users.

Example:

    @Test
    public void testSearchApi() {
        SearchResponse response = client.prepareSearch(INDEX).setTypes(TYPE)
                .setQuery(QueryBuilders.matchQuery("user", "hhh")).get();
        SearchHit[] hits = response.getHits().getHits();
        for (int i = 0; i < hits.length; i++) {
            String json = hits[i].getSourceAsString();
            JSONObject object = JSON.parseObject(json);
            Set<String> strings = object.keySet();
            for (String str : strings) {
                System.out.println(object.get(str));
            }
        }
    }

Using scrolls in Java

An ordinary search request returns a single page of results no matter how large the dataset is. The Scroll API lets you retrieve large numbers of documents (even all of them): it performs an initial search and then keeps pulling batches of results from ElasticSearch until none remain. Scrolling is not intended for real-time user requests, but for processing large amounts of data.

  /**
   * Scroll search
   * @throws ExecutionException
   * @throws InterruptedException
   */
  @Test
  public void testScrollApi() throws ExecutionException, InterruptedException {
      MatchQueryBuilder qb = matchQuery("user", "hhh");
      SearchResponse response = client.prepareSearch(INDEX).addSort(FieldSortBuilder.DOC_FIELD_NAME,
              SortOrder.ASC)
              .setScroll(new TimeValue(60000)) // the initial request must set a scroll timeout, telling ElasticSearch how long to keep the search context alive
              .setQuery(qb)
              .setSize(100).get();
      do {
          for (SearchHit hit : response.getHits().getHits()) {
              String json = hit.getSourceAsString();
              JSONObject object = JSON.parseObject(json);
              Set<String> strings = object.keySet();
              for (String str : strings) {
                  System.out.println(object.get(str));
              }
          }
          response = client.prepareSearchScroll(response.getScrollId()).setScroll(new TimeValue(60000)).execute().get();
      } while (response.getHits().getHits().length != 0);
  }

If you keep searching with a scroll id after the scroll timeout has passed, the request fails.

Although the search context is cleared automatically once the scroll timeout expires, keeping scrolls open has a cost, so clear them with the Clear Scroll API as soon as you no longer need them.

Clearing a scroll id:

        ClearScrollRequestBuilder clearBuilder = client.prepareClearScroll();
        clearBuilder.addScrollId(response.getScrollId());
        ClearScrollResponse scrollResponse = clearBuilder.get();
        System.out.println("cleared successfully: " + scrollResponse.isSucceeded());

MultiSearch API

The MultiSearch API executes several search requests in a single call. Its REST endpoint is _msearch.

    @Test
    public void testMultiSearchApi() {
        SearchRequestBuilder srb1 = client.prepareSearch().setQuery(QueryBuilders.queryStringQuery("elasticsearch")).setSize(1);
        SearchRequestBuilder srb2 = client.prepareSearch().setQuery(QueryBuilders.matchQuery("user", "hhh")).setSize(1);
        MultiSearchResponse multiSearchResponse = client.prepareMultiSearch().add(srb1).add(srb2).get();
        long nbHits = 0;
        for (MultiSearchResponse.Item item : multiSearchResponse.getResponses()) {
            SearchResponse response = item.getResponse();
            nbHits += response.getHits().getTotalHits();
        }
        System.out.println(nbHits);
    }

Using Aggregations

The aggregations framework helps provide aggregated data based on a search query. It is built from simple building blocks called aggregations, which can be composed into complex summaries of data. An aggregation can be seen as a unit of work that builds analytic information over a set of documents; executing an aggregation is a matter of defining that document set.

    @Test
    public void testAggregations() {
        SearchResponse searchResponse = client.prepareSearch()
                .setQuery(QueryBuilders.matchAllQuery())
                .addAggregation(AggregationBuilders.terms("LZH").field("user"))
                .addAggregation(AggregationBuilders.dateHistogram("2013-01-30").field("postDate")
                        .dateHistogramInterval(DateHistogramInterval.YEAR)).get();
        Terms lzh = searchResponse.getAggregations().get("LZH"); // fetch each aggregation by its name
        Histogram postDate = searchResponse.getAggregations().get("2013-01-30");

    }

Terminate After

Caps the number of documents collected. When set, use isTerminatedEarly() on the SearchResponse to check whether the cap was reached:

    @Test
    public void TestTerminate() {
        SearchResponse searchResponse = client.prepareSearch(INDEX)
                .setTerminateAfter(2) // terminate early once this many documents have been collected
                .get();
        if (searchResponse.isTerminatedEarly()) {
            System.out.println(searchResponse.getHits().totalHits);
        }
    }

Aggregations

ElasticSearch provides a complete Java API for aggregations: build an aggregation with AggregationBuilders and add it to the search request.

SearchResponse response = client.prepareSearch().setQuery(/* query */).addAggregation(/* aggregation */).execute().actionGet();

Structuring aggregations

Aggregations can be structured, nesting sub-aggregations inside bucket aggregations.

Metrics aggregations

Metrics aggregations compute metrics over the documents being aggregated, extracting the values to aggregate from the documents in some way.

The main class used here is **AggregationBuilders**, which contains factory methods for all of the aggregations below; call them directly.

Min Aggregation

    MinAggregationBuilder aggregation = AggregationBuilders.min("agg").field("age");

    SearchResponse sr = client.prepareSearch("twitter").addAggregation(aggregation).get();
    Min agg = sr.getAggregations().get("agg");
    String value = agg.getValueAsString(); // getValueAsString() suits date fields; for plain numbers agg.getValue() returns the minimum
    System.out.println("min value:" + value);

In debug mode

toString() on the MinAggregationBuilder from the first line produces:

{
  "error": "JsonGenerationException[Can not write a field name, expecting a value]"
}

SearchResponse sr = client.prepareSearch("twitter").addAggregation(aggregation).get();

The toString() of the SearchResponse contains the JSON result of the query. The structure of this JSON mirrors the SearchResponse API, so the two can be used together to retrieve every value inside it.

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "10",
        "_score": 1.0,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-06-29T09:10:21.396Z",
          "age": 30,
          "gender": "female",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-06-29T09:05:33.943Z",
          "age": 20,
          "gender": "female",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-06-29T08:59:00.191Z",
          "age": 10,
          "gender": "male",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "11",
        "_score": 1.0,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-06-29T09:10:54.386Z",
          "age": 30,
          "gender": "female",
          "message": "trying out Elasticsearch"
        }
      }
    ]
  },
  "aggregations": {
    "agg": {
      "value": 10.0
    }
  }
}

Notice that sr.getAggregations().get("agg") retrieves the aggregation result; the name agg used throughout this code is arbitrary and can be customized.

Max Aggregation

    MaxAggregationBuilder aggregation = AggregationBuilders.max("agg").field("readSize");

    SearchResponse sr = client.prepareSearch("blog").addAggregation(aggregation).get();
    Max agg = sr.getAggregations().get("agg");
    String value = agg.getValueAsString();

    System.out.println("max value:" + value);

The analysis is the same as for the Min aggregation; note, however, that min/max alone cannot tell you which document holds the maximum or minimum value.

Sum Aggregation

    SumAggregationBuilder aggregation = AggregationBuilders.sum("agg").field("readSize");

    SearchResponse sr = client.prepareSearch("blog").addAggregation(aggregation).get();
    Sum agg = sr.getAggregations().get("agg");
    String value = agg.getValueAsString();

    System.out.println("sum value:" + value);

Avg Aggregation

AvgAggregationBuilder aggregation = AggregationBuilders.avg("agg").field("age");
SearchResponse searchResponse = client.prepareSearch("twitter").addAggregation(aggregation).get();
Avg avg = searchResponse.getAggregations().get("agg");
String value = avg.getValueAsString();
System.out.println("avg value: "+ value);

Stats Aggregation

The stats aggregation computes statistics (min, max, sum, count, avg) over some value of the documents. The value can come from a specific numeric field or be computed by a script.

        StatsAggregationBuilder aggregation = AggregationBuilders.stats("agg").field("age");
        SearchResponse searchResponse = client.prepareSearch("twitter").addAggregation(aggregation).get();
        Stats stats = searchResponse.getAggregations().get("agg");
        String max = stats.getMaxAsString();
        String min = stats.getMinAsString();
        String avg = stats.getAvgAsString();
        String sum = stats.getSumAsString();
        long count = stats.getCount();
        System.out.println("max value: "+max);
        System.out.println("min value: "+min);
        System.out.println("avg value: "+avg);
        System.out.println("sum value: "+sum);
        System.out.println("count value: "+count);

This aggregation produces all of the common statistics in one pass; use it when you need most of the values above.

Extended Stats Aggregation

The extended stats aggregation computes the same statistics as the plain stats aggregation, plus sum_of_squares, variance, std_deviation, and std_deviation_bounds. The value can come from a specific numeric field or be computed by a script. The main results are the max, min, variance, sum of squares, and similar statistics.

        ExtendedStatsAggregationBuilder aggregation = AggregationBuilders.extendedStats("agg").field("age");
        SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
        ExtendedStats extended = response.getAggregations().get("agg");
        String max = extended.getMaxAsString();
        String min = extended.getMinAsString();
        String avg = extended.getAvgAsString();
        String sum = extended.getSumAsString();
        long count = extended.getCount();
        double stdDeviation = extended.getStdDeviation();
        double sumOfSquares = extended.getSumOfSquares();
        double variance = extended.getVariance();
        System.out.println("max value: " +max);
        System.out.println("min value: " +min);
        System.out.println("avg value: " +avg);
        System.out.println("sum value: " +sum);
        System.out.println("count value: " +count);
        System.out.println("stdDeviation value: " +stdDeviation);
        System.out.println("sumOfSquares value: " +sumOfSquares);
        System.out.println("variance value: "+variance);

Value Count Aggregation

The value count aggregation counts the number of values extracted from the aggregated documents. The value can come from a specific numeric field or be computed by a script. It is usually combined with other single-value aggregations; for example, when computing the average of a field you may also want to know how many values that average was computed from.

ValueCountAggregationBuilder aggregation = AggregationBuilders.count("agg").field("age");
SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
ValueCount count = response.getAggregations().get("agg");
long value = count.getValue();
System.out.println("ValueCount value: " +value);

Percentiles Aggregation

    PercentilesAggregationBuilder aggregation = AggregationBuilders.percentiles("agg").field("age");
    SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
    Percentiles agg = response.getAggregations().get("agg");
    for (Percentile entry : agg) {
        double percent = entry.getPercent();
        double value = entry.getValue();
        System.out.println("percent value: " + percent + "value value: " + value);
    }

Cardinality Aggregation

Counts the number of distinct values, with duplicates removed:

CardinalityAggregationBuilder aggregation = AggregationBuilders.cardinality("agg").field("age");
SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
Cardinality agg = response.getAggregations().get("agg");
long value = agg.getValue();
System.out.println("value value: "+ value);

Top Hits Aggregation

Returns the top matching documents within each bucket:

    TermsAggregationBuilder aggregation = AggregationBuilders.terms("agg").field("gender.keyword")
            .subAggregation(AggregationBuilders.topHits("top").explain(true).size(1).from(10));
    SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
    Terms agg = response.getAggregations().get("agg");
    for (Terms.Bucket bucket : agg.getBuckets()) {
        String key = (String) bucket.getKey();
        long docCount = bucket.getDocCount();
        System.out.println("key value: " + key + " docCount value: " + docCount);
        TopHits topHits = bucket.getAggregations().get("top");
        for (SearchHit searchHitFields : topHits.getHits().getHits()) {
            System.out.println("id value: " + searchHitFields.getId() + " source value: " + searchHitFields.getSourceAsString());
        }
    }

Bucket aggregations

Global Aggregation

Computes a single bucket over all documents, ignoring the search query:

        AggregationBuilder aggregation = AggregationBuilders
                .global("agg")
                .subAggregation(
                        AggregationBuilders.terms("users").field("user.keyword")
                );

        SearchResponse sr = client.prepareSearch("twitter")
                .addAggregation(aggregation)
                .get();
        System.out.println(sr);
        Global agg = sr.getAggregations().get("agg");
        long count = agg.getDocCount(); // Doc count

        System.out.println("global count:" + count);

Filter Aggregation

Counts the documents matching a filter:

AggregationBuilder aggregation = AggregationBuilders.filters("aaa", new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")));

SearchResponse sr = client.prepareSearch("twitter").setTypes("tweet").addAggregation(aggregation).get();
Filters agg = sr.getAggregations().get("aaa");
for (Filters.Bucket entry : agg.getBuckets()) {
  String key = entry.getKeyAsString();            // bucket key
  long docCount = entry.getDocCount();            // Doc count

  System.out.println("global " + key + " count:" + docCount);
}

Filters Aggregation

Buckets documents by several filters and returns the count for each:

AggregationBuilder aggregation = AggregationBuilders.filters("aaa",
        new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")),
        new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female")));

SearchResponse sr = client.prepareSearch("twitter").setTypes("tweet").addAggregation(aggregation).get();
Filters agg = sr.getAggregations().get("aaa");
for (Filters.Bucket entry : agg.getBuckets()) {
  String key = entry.getKeyAsString();            // bucket key
  long docCount = entry.getDocCount();            // Doc count

  System.out.println("global " + key + " count:" + docCount);
}

Missing Aggregation: a single-bucket aggregation over documents missing a field value
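The original leaves this section empty, so here is a minimal hedged sketch against the same "twitter" index used in the examples above; it buckets the documents that have no value for the "gender" field:

```java
// Missing aggregation: one bucket containing every document that lacks a value for the field.
MissingAggregationBuilder aggregation = AggregationBuilders.missing("agg").field("gender");
SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
Missing agg = response.getAggregations().get("agg");
long docCount = agg.getDocCount(); // number of documents without a "gender" value
System.out.println("missing count: " + docCount);
```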

Nested Aggregation: aggregates over nested documents

Reverse nested Aggregation

Children Aggregation

Terms Aggregation

        TermsAggregationBuilder fieldAggregation = AggregationBuilders.terms("genders").field("gender.keyword")
                .order(Terms.Order.term(true));
        SearchResponse response = client.prepareSearch("twitter").setTypes("tweet").addAggregation(fieldAggregation).get();

        Terms terms = response.getAggregations().get("genders");
        for (Terms.Bucket bucket : terms.getBuckets()) {
            System.out.println("key value: " + bucket.getKey());
            System.out.println("docCount value: " + bucket.getDocCount());
        }

Order

        TermsAggregationBuilder fieldAggregation = AggregationBuilders.terms("genders").field("gender.keyword")
                .order(Terms.Order.term(true));

Significant Terms Aggregation
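The original gives no example here; the sketch below is an assumption-laden illustration against the "twitter" data used earlier. A significant terms aggregation surfaces terms that are unusually frequent in the foreground (query) set compared with the index as a whole:

```java
SignificantTermsAggregationBuilder aggregation = AggregationBuilders.significantTerms("agg")
        .field("gender.keyword");
SearchResponse response = client.prepareSearch("twitter")
        .setQuery(QueryBuilders.termQuery("user", "kimchy")) // the foreground set
        .addAggregation(aggregation).get();
SignificantTerms agg = response.getAggregations().get("agg");
for (SignificantTerms.Bucket bucket : agg.getBuckets()) {
    System.out.println("key value: " + bucket.getKeyAsString() + " docCount value: " + bucket.getDocCount());
}
```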

Range Aggregation
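The original leaves this section empty; a hedged sketch against the "age" field used in the metrics examples above might look like this, bucketing documents into age ranges:

```java
RangeAggregationBuilder aggregation = AggregationBuilders.range("agg").field("age")
        .addUnboundedTo(15)    // bucket: age < 15
        .addRange(15, 25)      // bucket: 15 <= age < 25
        .addUnboundedFrom(25); // bucket: age >= 25
SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
Range agg = response.getAggregations().get("agg");
for (Range.Bucket bucket : agg.getBuckets()) {
    System.out.println("key value: " + bucket.getKeyAsString() + " docCount value: " + bucket.getDocCount());
}
```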

Date Range Aggregation

IP Range Aggregation

Histogram Aggregation

Date Histogram Aggregation
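The original gives no example here either; this sketch reuses the dateHistogram call already shown in the testAggregations example above, bucketing documents by the year of postDate:

```java
DateHistogramAggregationBuilder aggregation = AggregationBuilders.dateHistogram("agg")
        .field("postDate")
        .dateHistogramInterval(DateHistogramInterval.YEAR); // one bucket per year
SearchResponse response = client.prepareSearch("twitter").addAggregation(aggregation).get();
Histogram agg = response.getAggregations().get("agg");
for (Histogram.Bucket bucket : agg.getBuckets()) {
    System.out.println("key value: " + bucket.getKeyAsString() + " docCount value: " + bucket.getDocCount());
}
```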

Geo Distance Aggregation

GeoHash Grid Aggregation

Query DSL

Match All Query

Matches all documents:

QueryBuilder qb = matchAllQuery();

Full text Query

match query

A full-text match on a single field, supporting fuzzy matching and phrase queries:

QueryBuilder qb = matchQuery("gender", "female");

multi_match query

Runs a match query against multiple fields:

QueryBuilder qb = multiMatchQuery("female","gender", "message");

common_terms query

Gives rarer, more specialized terms more precise handling when querying:

QueryBuilder qb = commonTermsQuery("gender","female");

query_string query

A query that uses the Lucene query syntax, allowing operators such as AND, OR, and NOT:

QueryBuilder qb = queryStringQuery("+male -female");

simple_query_string query

A simpler, more robust variant of the query-string syntax:

QueryBuilder qb = simpleQueryStringQuery("+male -female");

Term level Query

Term Query

Finds documents that contain the exact value in the specified field:

QueryBuilder qb = termQuery("gender","male");

Terms Query

Matches any of several exact values within a field:

QueryBuilder qb = termsQuery("age","10", "20");

Range Query

Matches documents whose field value falls within a range:

  • gte(): matches field values greater than or equal to the argument
  • gt(): matches field values greater than the argument
  • lte(): matches field values less than or equal to the argument
  • lt(): matches field values less than the argument
  • from() sets the lower bound and to() the upper bound; use them together with includeLower() and includeUpper()
  • includeLower(true): from() is inclusive (greater than or equal)
  • includeLower(false): from() is exclusive (strictly greater)
  • includeUpper(true): to() is inclusive (less than or equal)
  • includeUpper(false): to() is exclusive (strictly less)

QueryBuilder qb = QueryBuilders.rangeQuery("age").gte(10).includeLower(true).lte(20).includeUpper(true);

The includeLower() and includeUpper() methods control whether the bounds of the range are inclusive.

Exists Query

Matches documents that have any value for the given field:

QueryBuilder qb = existsQuery("user");

Prefix Query

Matches documents whose field starts with the given exact prefix:

QueryBuilder qb = prefixQuery("gender","m");

Wildcard Query

A wildcard query on a field: ? matches a single character and * matches any sequence of characters. Wildcard queries run against non-analyzed fields:

QueryBuilder qb = wildcardQuery("gender","f?*");

Regexp Query

Matches a field against a regular expression; the queried field is likewise non-analyzed:

QueryBuilder qb = regexpQuery("gender","f.*");

Fuzzy Query

A fuzzy match: the field name is exact, while the query value may contain misspellings within the given edit distance:

QueryBuilder qb = fuzzyQuery("gender","mala").fuzziness(Fuzziness.ONE);

Type Query

Matches documents of the given type:

QueryBuilder qb = typeQuery("tweet");

Ids Query

Finds documents by type and id; the type may be omitted:

QueryBuilder qb = idsQuery("tweet").addIds("1", "11");
