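Elasticsearch is a search and analytics engine that suits scenarios requiring flexible filtering. Sometimes we need to retrieve the requested data together with aggregated information about it. In this tutorial, we'll explore how to do that.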
Elasticsearch Search With Aggregation
Let's begin by exploring Elasticsearch's aggregation functionality.
Once we have an Elasticsearch instance running on localhost, let's create an index named store-items with a few documents in it:
    POST http://localhost:9200/store-items/_doc
    {
        "type": "Multimedia",
        "name": "PC Monitor",
        "price": 1000
    }
    ...
    POST http://localhost:9200/store-items/_doc
    {
        "type": "Pets",
        "name": "Dog Toy",
        "price": 10
    }
Now, let's query it without applying any filters:

    GET http://localhost:9200/store-items/_search
Now let's take a look at the response:
    {
        ...
        "hits": {
            "total": {
                "value": 5,
                "relation": "eq"
            },
            "max_score": 1.0,
            "hits": [
                {
                    "_index": "store-items",
                    "_type": "_doc",
                    "_id": "J49VVI8B6ADL84Kpbm8A",
                    "_score": 1.0,
                    "_source": {
                        "_class": "com.baeldung.model.StoreItem",
                        "type": "Multimedia",
                        "name": "PC Monitor",
                        "price": 1000
                    }
                },
                {
                    "_index": "store-items",
                    "_type": "_doc",
                    "_id": "KI9VVI8B6ADL84Kpbm8A",
                    "_score": 1.0,
                    "_source": {
                        "type": "Pets",
                        "name": "Dog Toy",
                        "price": 10
                    }
                },
                ...
            ]
        }
    }
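The response contains several documents related to store items. Each document corresponds to a store item of a particular type. Next, let's say we want to know how many items there are of each type. Let's add an aggregation section to the request body and search the index again: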
    GET http://localhost:9200/store-items/_search
    {
        "aggs": {
            "type_aggregation": {
                "terms": {
                    "field": "type"
                }
            }
        }
    }
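For readers who would rather send this request from Java than from an HTTP tool, here is a minimal sketch using the JDK's built-in `java.net.http` client. The class and method names are made up for illustration, and it assumes the same local instance on port 9200. The `_search` endpoint accepts POST as well as GET, so the sketch uses POST to carry the body.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative helper, not part of the article's project.
class TermsAggregationRequestSketch {

    // Builds the JSON body for a terms aggregation on a single field.
    static String termsAggregationBody(String aggregationName, String field) {
        return """
            {
              "aggs": {
                "%s": {
                  "terms": { "field": "%s" }
                }
              }
            }
            """.formatted(aggregationName, field);
    }

    // Builds a POST request against <baseUrl>/<index>/_search with the given JSON body.
    static HttpRequest searchRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/" + index + "/_search"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) throws Exception {
        String body = termsAggregationBody("type_aggregation", "type");
        HttpRequest request = searchRequest("http://localhost:9200", "store-items", body);

        // Requires a running Elasticsearch instance; prints the raw JSON response.
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Keeping the body and request builders separate from `main` makes it possible to inspect the request without a running cluster.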
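We've added an aggregation named type_aggregation that uses the terms aggregation. As we can see in the response, there's a new aggregations section where we can find the number of documents for each type: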
    {
        ...
        "aggregations": {
            "type_aggregation": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "Multimedia",
                        "doc_count": 2
                    },
                    {
                        "key": "Pets",
                        "doc_count": 2
                    },
                    {
                        "key": "Home tech",
                        "doc_count": 1
                    }
                ]
            }
        }
    }
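To make the meaning of those buckets concrete, the following sketch computes the same per-type counts in plain Java over an in-memory list. It is only an analogy for what the terms aggregation reports, not how Elasticsearch works internally. The `Item` record is a hypothetical stand-in for a document, and the sample data mirrors the five items used in the test data set later in the article.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory analogy for a terms aggregation on the "type" field.
class TermsAggregationInMemorySketch {

    // Hypothetical stand-in for a store item document.
    record Item(String type, String name, long price) {}

    // Groups items by type and counts each group: one entry per distinct type,
    // just like one bucket per distinct term.
    static Map<String, Long> countByType(List<Item> items) {
        return items.stream()
            .collect(Collectors.groupingBy(Item::type, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
            new Item("Multimedia", "PC Monitor", 1000L),
            new Item("Multimedia", "Headphones", 200L),
            new Item("Home tech", "Barbecue Grill", 2000L),
            new Item("Pets", "Dog Toy", 10L),
            new Item("Pets", "Cat shampoo", 5L));

        // Each map key plays the role of a bucket "key", each value of its "doc_count".
        System.out.println(countByType(items));
    }
}
```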
Spring Data Elasticsearch Search With Aggregation
Let's implement the functionality from the previous section using Spring Data Elasticsearch. We'll start by adding the dependency:
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-elasticsearch</artifactId>
    </dependency>
Next, let's provide an Elasticsearch configuration class:

    @Configuration
    @EnableElasticsearchRepositories(basePackages = "com.baeldung.spring.data.es.aggregation.repository")
    @ComponentScan(basePackages = "com.baeldung.spring.data.es.aggregation")
    public class ElasticSearchConfig {

        // low-level REST client pointing at the local instance
        @Bean
        public RestClient elasticsearchRestClient() {
            return RestClient.builder(HttpHost.create("localhost:9200"))
              .setHttpClientConfigCallback(httpClientBuilder -> {
                  // add the X-Elastic-Product header to every response
                  httpClientBuilder.addInterceptorLast((HttpResponseInterceptor) (response, context) ->
                    response.addHeader("X-Elastic-Product", "Elasticsearch"));
                  return httpClientBuilder;
              })
              .build();
        }

        // Elasticsearch Java client built on top of the REST client
        @Bean
        public ElasticsearchClient elasticsearchClient(RestClient restClient) {
            return ElasticsearchClients.createImperative(restClient);
        }

        // Spring Data template that wraps the Java client
        @Bean(name = { "elasticsearchOperations", "elasticsearchTemplate" })
        public ElasticsearchOperations elasticsearchOperations(ElasticsearchClient elasticsearchClient) {
            ElasticsearchTemplate template = new ElasticsearchTemplate(elasticsearchClient);
            template.setRefreshPolicy(null);
            return template;
        }
    }
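Here we've specified a low-level Elasticsearch REST client and a wrapper bean around it that implements the ElasticsearchOperations interface. Now, let's create the StoreItem entity:

    @Document(indexName = "store-items")
    public class StoreItem {

        @Id
        private String id;

        @Field(type = FieldType.Keyword)
        private String type;

        @Field(type = FieldType.Keyword)
        private String name;

        // price is numeric, so it is mapped as a long rather than a keyword
        @Field(type = FieldType.Long)
        private Long price;

        // getters and setters
    }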
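We're using the same store-items index as in the previous section. Since the built-in capabilities of Spring Data repositories can't retrieve aggregations, we need to create a repository extension. Let's create the extension interface:

    public interface StoreItemRepositoryExtension {
        SearchPage<StoreItem> findAllWithAggregations(Pageable pageable);
    }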
Here we have the findAllWithAggregations() method, which accepts a Pageable and returns a SearchPage containing our items. Next, let's create an implementation of this interface:

    @Component
    public class StoreItemRepositoryExtensionImpl implements StoreItemRepositoryExtension {

        @Autowired
        private ElasticsearchOperations elasticsearchOperations;

        @Override
        public SearchPage<StoreItem> findAllWithAggregations(Pageable pageable) {
            // native query that carries only the terms aggregation on the "type" field
            Query query = NativeQuery.builder()
              .withAggregation("type_aggregation",
                Aggregation.of(b -> b.terms(t -> t.field("type"))))
              .build();

            SearchHits<StoreItem> response = elasticsearchOperations.search(query, StoreItem.class);

            // wrap the hits, including their aggregations, in a page
            return SearchHitSupport.searchPageFor(response, pageable);
        }
    }
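We've built a native query and included the aggregation section in it. Following the pattern from the previous section, we use type_aggregation as the aggregation name. Then we use the terms aggregation type to count the documents for each distinct value of the specified field. Finally, let's create a Spring Data repository that extends ElasticsearchRepository to support the generic Spring Data functionality and StoreItemRepositoryExtension to incorporate our custom method implementation: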
    @Repository
    public interface StoreItemRepository extends ElasticsearchRepository<StoreItem, String>,
      StoreItemRepositoryExtension {
    }
After that, let's create a test for our aggregation functionality:

    @ExtendWith(SpringExtension.class)
    @ContextConfiguration(classes = ElasticSearchConfig.class)
    public class ElasticSearchAggregationManualTest {

        private static final List<StoreItem> EXPECTED_ITEMS = List.of(
          new StoreItem("Multimedia", "PC Monitor", 1000L),
          new StoreItem("Multimedia", "Headphones", 200L),
          new StoreItem("Home tech", "Barbecue Grill", 2000L),
          new StoreItem("Pets", "Dog Toy", 10L),
          new StoreItem("Pets", "Cat shampoo", 5L));
        ...

        @BeforeEach
        public void before() {
            repository.saveAll(EXPECTED_ITEMS);
        }
        ...
    }
We've created a test data set of five items, with at least one store item for each type. Before the test cases start executing, we populate Elasticsearch with this data. Moving on, let's call the findAllWithAggregations() method and see what it returns. Since the implementation doesn't add the Pageable to the native query, the page size of 2 shouldn't limit the hits that the search returns:

    @Test
    void givenFullTitle_whenRunMatchQuery_thenDocIsFound() {
        SearchHits<StoreItem> searchHits = repository.findAllWithAggregations(Pageable.ofSize(2))
          .getSearchHits();

        // the documents returned by the search
        List<StoreItem> data = searchHits.getSearchHits()
          .stream()
          .map(SearchHit::getContent)
          .toList();

        Assertions.assertThat(data).containsAll(EXPECTED_ITEMS);

        // terms bucket key -> document count
        Map<String, Long> aggregatedData = ((ElasticsearchAggregations) searchHits.getAggregations())
          .get("type_aggregation")
          .aggregation()
          .getAggregate()
          .sterms()
          .buckets()
          .array()
          .stream()
          .collect(Collectors.toMap(bucket -> bucket.key().stringValue(), MultiBucketBase::docCount));

        Assertions.assertThat(aggregatedData).containsExactlyInAnyOrderEntriesOf(
          Map.of("Multimedia", 2L, "Home tech", 1L, "Pets", 2L));
    }
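As we can see, we've retrieved the search hits, from which we can extract the exact query results. In addition, we've retrieved the aggregation data, which contains all the expected aggregations for our search results.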
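The last step of the test, turning the list of buckets into a map, can also be tried in isolation. The sketch below uses a hypothetical `Bucket` record in place of the client's terms bucket type, so it runs without Elasticsearch. It relies on each terms bucket having a distinct key, because `Collectors.toMap` throws an `IllegalStateException` when it encounters a duplicate key.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Isolated version of the bucket-to-map conversion used in the test.
class BucketsToMapSketch {

    // Hypothetical stand-in for a terms bucket: a key plus its document count.
    record Bucket(String key, long docCount) {}

    // Collects buckets into a key -> count map; keys must be unique.
    static Map<String, Long> toCounts(List<Bucket> buckets) {
        return buckets.stream()
            .collect(Collectors.toMap(Bucket::key, Bucket::docCount));
    }

    public static void main(String[] args) {
        List<Bucket> buckets = List.of(
            new Bucket("Multimedia", 2),
            new Bucket("Pets", 2),
            new Bucket("Home tech", 1));

        Map<String, Long> counts = toCounts(buckets);

        // Map equality ignores entry order, like the test's order-insensitive assertion.
        System.out.println(counts.equals(Map.of("Multimedia", 2L, "Home tech", 1L, "Pets", 2L)));
    }
}
```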