4. Calling the Elasticsearch REST API from the Java Client

Original article, last modified 2023-09-10 20:13:25

License: CC 4.0 BY-SA

Tags: #elasticsearch #java

First published 2023-09-03 21:55:09

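This article covers several ways to work with Elasticsearch from the Java client: creating and listing API keys; creating, querying, and deleting indices; creating, reading, updating, and deleting documents; keyword queries such as match, matchPhrase, term, and fuzzy; aggregation queries; and token statistics for an article.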

Working with ES from the Java client

For more operations, see the official Elasticsearch Clients documentation, for example the Java client documentation.

Create an API key

        CreateApiKeyResponse key = client.security().createApiKey(
                // Create an API key
                i -> i.name("my-es-apikey")
                        // Valid for one day; without an expiration the key never expires
                        .expiration(Time.of(t -> t.time("1d")))
        );
        log.info(key);

List API keys

        QueryApiKeysResponse res = client.security().queryApiKeys();
        log.info(res);

Create a first simple index

        CreateIndexResponse res = client.indices().create(c -> c.index(ARTICLE_INDEX));
        log.info(res);

Create an index with chained builder code

        CreateIndexResponse res = client.indices().create(
                c -> c
                        .index(ARTICLE_INDEX)
                        .settings(
                                // 5 primary shards, 1 replica
                                s -> s.numberOfShards("5").numberOfReplicas("1")
                        )
                        .mappings(
                                m -> m.properties("id",
                                                Property.of(
                                                        p -> p.integer(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(false)
                                                        )
                                                )
                                        ).properties("title",
                                                Property.of(
                                                        p -> p.text(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                                        .analyzer("ik_max_word")
                                                                        .searchAnalyzer("ik_smart")
                                                        )
                                                )
                                        )
                                        .properties("content",
                                                Property.of(
                                                        p -> p.text(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                                        .analyzer("ik_max_word")
                                                                        .searchAnalyzer("ik_smart")
                                                        )
                                                )
                                        )
                                        .properties("releaseTime",
                                                Property.of(
                                                        p -> p.date(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                        )
                                                )
                                        )
                                        .properties("type",
                                                Property.of(
                                                        p -> p.keyword(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                        )
                                                )
                                        )
                                        .properties("collectCount",
                                                Property.of(
                                                        p -> p.integer(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                        )
                                                )
                                        )
                                        .properties("collectors",
                                                Property.of(
                                                        p -> p.text(
                                                                i -> i
                                                                        .store(false)
                                                                        .index(true)
                                                                        .analyzer("ik_smart")
                                                        )
                                                )
                                        )
                        )
        );
        log.info(res);

Create an index from a template file

a. First create a mappings template file at src/main/resources/templates/mappings.json with the following content:

{
  "properties": {
    "id": {
      "type": "long",
      "store": false,
      "index": false
    },
    "title": {
      "type": "text",
      "store": false,
      "index": true,
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    },
    "content": {
      "type": "text",
      "store": false,
      "index": true,
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    },
    "releaseTime": {
      "type": "date",
      "store": false,
      "index": true
    },
    "collectCount": {
      "type": "integer",
      "store": false,
      "index": true
    },
    "type": {
      "type": "keyword",
      "store": false,
      "index": true
    },
    "collectors": {
      "type": "text",
      "store": false,
      "index": true,
      "analyzer": "ik_smart"
    }
  }
}

b. Read the template file as the mappings and create the index

        try (
                InputStream is = getClass().getResourceAsStream("/templates/mappings.json")
                //InputStream is = getClass().getClassLoader().getResourceAsStream("templates/mappings.json")
        ) {
            JsonpMapper jsonpMapper = client._transport().jsonpMapper();
            JsonParser parser = jsonpMapper.jsonProvider().createParser(is);
            CreateIndexResponse res = client.indices().create(
                    c -> c
                            .index(ARTICLE_INDEX)
                            .settings(
                                    // 5 primary shards, 1 replica
                                    s -> s.numberOfShards("5").numberOfReplicas("1")
                            )
                            .mappings(
                                    TypeMapping._DESERIALIZER.deserialize(parser, jsonpMapper)
                            )
            );
            log.info(res);
        }

Query indices

        // Get a single index
        GetIndexResponse getIndexResponse = client.indices().get(s -> s.index(ARTICLE_INDEX));
        Map<String, IndexState> result = getIndexResponse.result();
        result.forEach((k, v) -> log.info(k + "=" + v));
        // List all indices
        IndicesResponse indicesResponse = client.cat().indices();
        indicesResponse.valueBody().forEach(log::info);

Delete a single index

        // Delete a single index
        DeleteIndexResponse res = client.indices().delete(c -> c.index(ARTICLE_INDEX));
        log.info(res);

Add documents

        // Approach 1
        Article one = initArticle();
        IndexResponse oneRes = client.index(s ->
                s.index(ARTICLE_INDEX)
                        .id(one.getIdStr())
                        .document(one)
        );
        log.info(oneRes);

        // Approach 2
        Article two = initArticle();
        IndexResponse twoRes = client.index(
                IndexRequest.of(i -> i
                        .index(ARTICLE_INDEX)
                        .id(two.getIdStr())
                        .document(two))
        );
        log.info(twoRes);

        // Approach 3: an explicit Builder, which is what approaches 1 and 2 use under the hood
        Article three = initArticle();
        IndexResponse threeRes = client.index(
                new IndexRequest.Builder<Article>().index(ARTICLE_INDEX)
                        .id(three.getIdStr())
                        .document(three).build()
        );
        log.info(threeRes);

Get a document by id

        GetResponse<Article> getResponse = client.get(s -> s.index(ARTICLE_INDEX).id("1"), Article.class);
        // source() is null when the document does not exist, so check found() first
        if (getResponse.found()) {
            Article article = getResponse.source();
            log.info(article);
        }

Check whether a document exists

        BooleanResponse booleanResponse = client.exists(s -> s.index(ARTICLE_INDEX).id("4"));
        log.info("Document exists: " + booleanResponse.value());

Count the documents in an index

        long count = client.count(s -> s.index(ARTICLE_INDEX)).count();
        log.info(count);

Update a document

        Article article = initArticle();
        UpdateResponse<Article> response = client.update(e -> e
                        .index(ARTICLE_INDEX)
                        .id(article.getIdStr())
                        .doc(article),
                Article.class
        );
        log.info(response);

Delete a document

        DeleteResponse res = client.delete(s -> s.index(ARTICLE_INDEX).id("1"));
        log.info(res);

Bulk insert, approach 1

        List<BulkOperation> bulkOperations = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            // Generate one random document per iteration, 20 in total
            Article article = initArticle();
            bulkOperations.add(BulkOperation.of(b ->
                    b.index(
                            c -> c.id(article.getIdStr()).document(article)
                    )));
        }
        BulkResponse res = client.bulk(s -> s.index(ARTICLE_INDEX).operations(bulkOperations));
        res.items().forEach(log::info);

Bulk insert, approach 2

        BulkRequest.Builder br = new BulkRequest.Builder();
        for (int i = 0; i < 20; i++) {
            // Generate one random document per iteration, 20 in total
            Article article = initArticle();
            br.operations(op -> op.index(idx -> idx
                    .index(ARTICLE_INDEX)
                    .id(article.getIdStr())
                    .document(article)));
        }
        BulkResponse res = client.bulk(br.build());
        res.items().forEach(log::info);

Bulk delete, approach 1

        BulkRequest.Builder br = new BulkRequest.Builder().index(ARTICLE_INDEX);
        for (String id : Arrays.asList("1", "2", "3")) {
            br.operations(op -> op.delete(c -> c.id(id)));
        }
        BulkResponse bulkResponseTwo = client.bulk(br.build());
        bulkResponseTwo.items().forEach(a -> log.info(a.result()));
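        // errors() is true when at least one item failed; only log at error level in that case
        if (bulkResponseTwo.errors()) {
            log.error("bulk delete reported at least one failed item");
        }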

Bulk delete, approach 2

        List<BulkOperation> bulkOperations = new ArrayList<>();
        for (String id : Arrays.asList("1", "2", "3")) {
            bulkOperations.add(
                    BulkOperation.of(b ->
                            b.delete(c -> c.id(id))
                    )
            );
        }
        BulkResponse bulkResponse = client.bulk(a -> a.index(ARTICLE_INDEX).operations(bulkOperations));
        bulkResponse.items().forEach(log::info);

match query

The query text is analyzed (tokenized) first, then searched.

        SearchResponse<Article> res = client.search(
                c -> c.index(ARTICLE_INDEX)
                        .query(
                                q -> q.match(
                                        t -> t.field("content").query("这架势有如在打仗")
                                )
                        )
                ,
                Article.class
        );
        res.hits().hits().forEach(log::info);

matchPhrase query

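A match_phrase query matches only when the analyzed query terms appear in the document in the same order as in the search text and directly next to each other, so match_phrase can also be called an adjacency search. For example:

📌 The document "无疑便如火上加油" is tokenized by ik_max_word (the analyzer configured when the index was created) to build the inverted index, giving: 无疑[0], 便[1], 如火[2], 火上加油[3], 火上[4], 加油[5]
Searching for "如火上": ik_smart (the search analyzer configured when the index was created) produces 如[0], 火上[1], which does not line up with the indexed tokens above, so nothing is found.
Searching for "便如火": ik_smart produces 便[0], 如火[1], which lines up with the indexed tokens, so the document is found.
Searching for "无疑如火": ik_smart produces 无疑[0], 如火[1], which does not line up with the indexed tokens, so nothing is found.
To make that last search match, add the slop parameter. It defaults to 0, meaning the query terms must line up with the document's tokens without moving any term. In the indexed document, 无疑[0], 便[1], 如火[2], the two query terms are one position further apart than in the query, so any slop of 1 or more matches.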

        String searchText = "无疑如火";
        SearchResponse<Article> res = client.search(
                c -> c.index(ARTICLE_INDEX)
                        .query(
                                q -> q.matchPhrase(
                                        m -> m
                                                .field("content")
                                                .query(searchText)
                                                .slop(1)
                                )
                        )
                ,
                Article.class
        );
        log.info(res);
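
The position arithmetic behind slop can be sketched in plain Java, without a cluster. This is only an illustration under simplifying assumptions (each term occurs once in the document and the query terms keep their order); it is not how Lucene implements sloppy phrases. The class name SlopSketch and the method requiredSlop are hypothetical, and the token positions are the ones listed above.

```java
import java.util.List;
import java.util.Map;

class SlopSketch {

    /**
     * Smallest slop at which the query terms line up with the indexed positions,
     * or -1 when a query term is not in the index at all.
     * Simplification: each term occurs once and the terms keep their order.
     */
    static int requiredSlop(Map<String, Integer> indexedPositions, List<String> queryTerms) {
        if (queryTerms.isEmpty()) {
            return 0;
        }
        int minShift = Integer.MAX_VALUE;
        int maxShift = Integer.MIN_VALUE;
        for (int queryPos = 0; queryPos < queryTerms.size(); queryPos++) {
            Integer docPos = indexedPositions.get(queryTerms.get(queryPos));
            if (docPos == null) {
                return -1; // term missing from the index: no slop value can help
            }
            // How far this term sits from where the query expects it
            int shift = docPos - queryPos;
            minShift = Math.min(minShift, shift);
            maxShift = Math.max(maxShift, shift);
        }
        // If every term is shifted by the same amount, the phrase is contiguous (slop 0)
        return maxShift - minShift;
    }

    public static void main(String[] args) {
        // Positions produced by ik_max_word for "无疑便如火上加油", as listed in the article
        Map<String, Integer> indexed = Map.of(
                "无疑", 0, "便", 1, "如火", 2, "火上加油", 3, "火上", 4, "加油", 5);

        System.out.println(requiredSlop(indexed, List.of("便", "如火")));   // 0
        System.out.println(requiredSlop(indexed, List.of("无疑", "如火"))); // 1
        System.out.println(requiredSlop(indexed, List.of("如", "火上")));   // -1
    }
}
```

A result of 1 for "无疑如火" is why the query above sets slop(1).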

term query

A term query does not analyze the query text; the value is compared as-is against the indexed tokens and must match one of them exactly.

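📌 The document "无疑便如火上加油" is tokenized by ik_max_word (the analyzer configured when the index was created) to build the inverted index, giving:
无疑[0], 便[1], 如火[2], 火上加油[3], 火上[4], 加油[5]
Searching for "如火上" finds nothing, because that exact token is not in the list above; searching for "如火" finds the document.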

        String searchText1 = "火上加油";
        String searchText2 = "火上";
        SearchResponse<Article> res1 = client.search(
                c -> c.index(ARTICLE_INDEX)
                        .query(
                                q -> q.term(
                                        m -> m.field("content").value(searchText1)
                                )
                        )
                ,
                Article.class
        );
        log.info(res1);

        // Like term, but terms matches when any one of the values matches
        SearchResponse<Article> res2 = client.search(
                c -> c.index(ARTICLE_INDEX)
                        .query(
                                q -> q.terms(
                                        m -> m.field("content").terms(
                                                TermsQueryField.of(
                                                        t -> t.value(
                                                                Arrays.asList(FieldValue.of(searchText1), FieldValue.of(searchText2))
                                                        )
                                                )
                                        )
                                )
                        )
                ,
                Article.class
        );
        log.info(res2);

        // To require both keywords at the same time, combine term queries with must
        SearchResponse<Article> res3 = client.search(
                c -> c.index(ARTICLE_INDEX)
                        .query(
                                q -> q.bool(
                                        b -> b
                                                .must(
                                                        TermQuery.of(
                                                                tq -> tq.field("content").value(searchText1)
                                                        )._toQuery(),
                                                        TermQuery.of(
                                                                tq -> tq.field("content").value(searchText2)
                                                        )._toQuery()
                                                )
                                )
                        )
                ,
                Article.class
        );
        log.info(res3);
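
The difference between the three queries above comes down to set membership on the indexed tokens. The following plain-Java sketch shows that logic only; TermMatchSketch and its method names are hypothetical and nothing here calls Elasticsearch. The token set is the one listed for "无疑便如火上加油".

```java
import java.util.List;
import java.util.Set;

class TermMatchSketch {

    /** term: the value must equal one indexed token exactly. */
    static boolean term(Set<String> tokens, String value) {
        return tokens.contains(value);
    }

    /** terms: at least one of the values must equal an indexed token. */
    static boolean terms(Set<String> tokens, List<String> values) {
        return values.stream().anyMatch(tokens::contains);
    }

    /** bool/must with one term query per value: every value must equal an indexed token. */
    static boolean mustAll(Set<String> tokens, List<String> values) {
        return values.stream().allMatch(tokens::contains);
    }

    public static void main(String[] args) {
        Set<String> tokens = Set.of("无疑", "便", "如火", "火上加油", "火上", "加油");

        System.out.println(term(tokens, "如火上"));                         // false
        System.out.println(term(tokens, "如火"));                           // true
        System.out.println(terms(tokens, List.of("火上加油", "如火上")));   // true
        System.out.println(mustAll(tokens, List.of("火上加油", "火上")));   // true
        System.out.println(mustAll(tokens, List.of("火上加油", "如火上"))); // false
    }
}
```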

fuzzy query

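📌 A fuzzy query tolerates small differences, for example when the indexed text is "催眠" and the search is "催不眠". fuzziness is the maximum edit distance (the number of single-character changes) allowed between the query value and an indexed term; it accepts 0, 1, 2, or AUTO.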

        SearchResponse<Article> res = client.search(s -> s.index(ARTICLE_INDEX)
                        .query(
                                q -> q.fuzzy(
                                        f -> f.field("content").value("催不眠").fuzziness("1")
                                ))
                        .source(
                                source -> source.filter(
                                        f -> f.includes("content")
                                )
                        )
                , Article.class);
        res.hits().hits().forEach(log::info);
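
To make "edit distance" concrete, here is a minimal plain-Java sketch of the classic Levenshtein distance. It is an illustration only: FuzzinessSketch and its methods are hypothetical names, and Elasticsearch by default also counts swapping two adjacent characters as a single edit, which this sketch does not do.

```java
class FuzzinessSketch {

    /** Classic Levenshtein distance: insertions, deletions and substitutions each cost 1. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            prev = curr;
        }
        return prev[b.length()];
    }

    /** A term is a fuzzy match when it is within `fuzziness` edits of the query value. */
    static boolean fuzzyMatches(String indexedTerm, String queryValue, int fuzziness) {
        return editDistance(indexedTerm, queryValue) <= fuzziness;
    }

    public static void main(String[] args) {
        System.out.println(editDistance("催眠", "催不眠"));     // 1
        System.out.println(fuzzyMatches("催眠", "催不眠", 1));  // true
        System.out.println(fuzzyMatches("催眠", "催不眠", 0));  // false
    }
}
```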

Compound nested query

        // Find articles that match the keywords in "澳门的投资" and have been collected by at least 2 people
        SearchResponse<Article> res = client.search(
                s -> s.index(ARTICLE_INDEX)
                        .query(
                                q -> q.bool(
                                        b -> b.must(
                                                        MatchQuery.of(
                                                                m -> m.field("content")
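                                                                        // operator And requires every term produced by analyzing the query (such as 澳门 and 投资) to appear; the default is or, where one matching term is enough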
                                                                        .operator(Operator.And)
                                                                        .query("澳门的投资")
                                                        )._toQuery()
                                                )
                                                .must(
                                                        RangeQuery.of(
                                                                m -> m.field("collectCount").gte(JsonData.of(2))
                                                        )._toQuery()
                                                )
                                )
                        )
                , Article.class
        );
        res.hits().hits().forEach(log::info);

Load a template file and fill in values to query

a. Add a template file, src/main/resources/templates/query.json, with the following content:

{
  "q-script": {
    "query": {
      "match": {
        "{{field}}": {
          "query": "{{value}}",
          "operator": "and"
        }
      }
    }
  }
}

b. The code is as follows

        try (
                InputStream is = getClass().getResourceAsStream("/templates/query.json")
                //InputStream is = getClass().getClassLoader().getResourceAsStream("templates/query.json")
        ) {
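            // Key of the script inside the template file
            String jsonTemplateId = "q-script";
            // Load the template file, decoding it explicitly as UTF-8
            String jsonTemplate = new JSONObject(new String(IoUtil.readBytes(is), StandardCharsets.UTF_8)).getStr(jsonTemplateId);
            // Store the template in the cluster
            client.putScript(
                    r -> r.id(jsonTemplateId).script(s -> s.lang(ScriptLanguage.Mustache).source(jsonTemplate))
            );
            // Search using the stored template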
            SearchTemplateResponse<Article> res = client.searchTemplate(
                    t -> t.index(ARTICLE_INDEX).id(jsonTemplateId)
                            .params("field", JsonData.of("content"))
                            .params("value", JsonData.of("满意的答案")),
                    Article.class
            );
            res.hits().hits().forEach(log::info);
        }

Query with custom boost weights

        String searchText = "理工大学";
        SearchResponse<Article> res = client.search(
                s -> s.index(ARTICLE_INDEX)
                        .query(
                                q -> q.bool(
                                        b -> b.should(
                                                TermQuery.of(
                                                        m -> m.field("content")
                                                                .value(searchText)
                                                                .boost(6f)
                                                )._toQuery(),
                                                MatchPhraseQuery.of(
                                                        m -> m.field("content")
                                                                .query(searchText)
                                                                .boost(2.5f)
                                                                .slop(4)
                                                )._toQuery()
                                        )
                                )
                        )
                        .from(0)
                        .size(15)
                        .sort(
                                sort -> sort.score(
                                        scoreSort -> scoreSort.order(SortOrder.Desc)
                                )
                        ).sort(
                                sort -> sort.field(
                                        // Then sort by the document's own collectCount field
                                        f -> f.field("collectCount").order(SortOrder.Desc)
                                )
                        )
                , Article.class
        );
        res.hits().hits().forEach(log::info);

Paginated query

        SearchResponse<Article> res = client.search(s -> s
                        .index(ARTICLE_INDEX)
                        .query(q -> q.bool(
                                b -> b
                                        .must(
                                                RangeQuery.of(m -> m
                                                        .field("collectCount")
                                                        .gte(JsonData.of(2))
                                                )._toQuery()
                                        )
                                        .should(
                                                MatchQuery.of(
                                                        m -> m.field("content").query("感情")
                                                )._toQuery(),
                                                MatchQuery.of(
                                                        m -> m.field("content").query("大学")
                                                )._toQuery()
                                        )
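                                        // At least one should clause must match; with a must clause present the default is 0
                                        .minimumShouldMatch("1")
                        ))
                        // Pagination: from is an offset into the hits (not a page number), so this returns 2 documents starting at hit 0
                        .from(0)
                        .size(2)
                        // Sort by collectCount descending; when order is not set, a field sort is ascending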
                        .sort(f -> f.field(o -> o.field("collectCount").order(SortOrder.Desc))),
                Article.class
        );
        res.hits().hits().forEach(log::info);
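
How must, should, and minimumShouldMatch combine in the query above can be sketched with plain predicates. This is only an approximation for illustration: BoolQuerySketch, Doc, and articleMatches are hypothetical names, and String.contains stands in for an analyzed match query, which it is not.

```java
import java.util.List;
import java.util.function.Predicate;

class BoolQuerySketch {

    /** Minimal stand-in for an indexed article. */
    static final class Doc {
        final String content;
        final int collectCount;

        Doc(String content, int collectCount) {
            this.content = content;
            this.collectCount = collectCount;
        }
    }

    /** All must clauses have to match, plus at least minimumShouldMatch of the should clauses. */
    static boolean matches(Doc doc, List<Predicate<Doc>> must, List<Predicate<Doc>> should,
                           int minimumShouldMatch) {
        boolean allMust = must.stream().allMatch(p -> p.test(doc));
        long shouldHits = should.stream().filter(p -> p.test(doc)).count();
        return allMust && shouldHits >= minimumShouldMatch;
    }

    /** Same shape as the article's query: collectCount >= 2, should match "感情" or "大学". */
    static boolean articleMatches(Doc doc, int minimumShouldMatch) {
        List<Predicate<Doc>> must = List.of(d -> d.collectCount >= 2);
        List<Predicate<Doc>> should = List.of(
                d -> d.content.contains("感情"),
                d -> d.content.contains("大学"));
        return matches(doc, must, should, minimumShouldMatch);
    }

    public static void main(String[] args) {
        System.out.println(articleMatches(new Doc("大学生活", 3), 1)); // true
        System.out.println(articleMatches(new Doc("今天天气", 3), 1)); // false
        System.out.println(articleMatches(new Doc("今天天气", 3), 0)); // true
        System.out.println(articleMatches(new Doc("大学生活", 1), 1)); // false
    }
}
```

With minimumShouldMatch at 0, the should clauses only influence scoring, which is why the third call still matches.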

Match all and paginate

        SearchResponse<Article> res = client.search(s -> s
                        .index(ARTICLE_INDEX)
                        .query(q -> q.matchAll(m -> m))
                        .from(0)
                        .size(8)
                        .sort(f -> f.field(o -> o.field("collectCount"))),
                Article.class
        );
        res.hits().hits().forEach(log::info);
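
Because from is an offset rather than a page number, UI code usually needs a small conversion. The sketch below shows one way to do it in plain Java; PagingSketch and fromForPage are hypothetical names, pages are assumed to be 1-based, and the limit of 10,000 reflects the default index.max_result_window setting, which an index may override.

```java
class PagingSketch {

    // Default value of index.max_result_window; from + size may not exceed it
    static final int MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number and page size into the `from` offset. */
    static int fromForPage(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                    "from + size exceeds " + MAX_RESULT_WINDOW + "; deep paging needs search_after or scroll");
        }
        return (int) from;
    }

    public static void main(String[] args) {
        System.out.println(fromForPage(1, 8));    // 0
        System.out.println(fromForPage(3, 8));    // 16
        System.out.println(fromForPage(1250, 8)); // 9992
    }
}
```

The returned value would be passed to .from(...) alongside .size(size) in a search like the one above.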

Query documents and filter _source fields

        SearchResponse<Article> res = client.search(
                s -> s.index(ARTICLE_INDEX)
                        .query(q -> q.matchAll(m -> m))
                        .sort(
                                f -> f.field(o -> o.field("collectCount").order(SortOrder.Desc))
                        )
                        .source(
                                so -> so.filter(
                                        f -> f.includes("title", "content", "type")
                                )
                        )
                , Article.class
        );
        res.hits().hits().forEach(log::info);

Highlight matched fields

        SearchResponse<Article> res = client.search(s -> s.index(ARTICLE_INDEX)
                        .query(
                                q -> q.match(m -> m.field("content").query("理工大学"))
                        )
                        .source(
                                source -> source.filter(
                                        f -> f.includes("content")
                                )
                        ).highlight(
                                h -> h.fields(
                                        "content", f -> f
                                                .preTags("<span style='color:red'>")
                                                .postTags("</span>")
                                )
                        )
                , Article.class);
        res.hits().hits().forEach(log::info);

Aggregation query: maximum value

        SearchResponse<Article> res = client.search(s -> s
                        .index(ARTICLE_INDEX)
                        .size(0)
                        .aggregations(
                                "maxId", a -> a.max(
                                        MaxAggregation.of(m -> m.field("id"))
                                )
                        )
                , Article.class);
        log.info(res.aggregations().get("maxId").max());

Aggregation query: group counts

        // Group by article type
        SearchResponse<Article> res = client.search(s -> s
                        .index(ARTICLE_INDEX)
                        .size(0)
                        .aggregations(
                                "groupType", a -> a.terms(
                                        TermsAggregation.of(m -> m.field("type"))
                                )
                        )
                , Article.class);
        log.info(res.aggregations().get("groupType").sterms());
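
For readers new to aggregations, the max and terms aggregations above compute roughly what the following in-memory Java streams code computes. This is a conceptual sketch only: AggregationSketch and Doc are hypothetical, the real aggregations run inside the cluster, and a terms aggregation orders its buckets by document count, which this sketch does not reproduce.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregationSketch {

    /** Minimal stand-in for an indexed article. */
    static final class Doc {
        final long id;
        final String type;

        Doc(long id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /** Like the max aggregation on "id": empty when there are no documents. */
    static OptionalLong maxId(List<Doc> docs) {
        return docs.stream().mapToLong(d -> d.id).max();
    }

    /** Like the terms aggregation on "type": one bucket per distinct value with its doc count. */
    static Map<String, Long> groupType(List<Doc> docs) {
        return docs.stream()
                .collect(Collectors.groupingBy(d -> d.type, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc(1, "news"), new Doc(7, "blog"), new Doc(3, "news"));

        System.out.println(maxId(docs).getAsLong()); // 7
        System.out.println(groupType(docs));         // {blog=1, news=2}
    }
}
```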

Article token statistics

        String s = FileUtil.readUtf8String("D:/dd.txt");
        AnalyzeResponse analyze = client.indices().analyze(
                i -> i.analyzer("ik_smart").text(s)
        );
        analyze.tokens().forEach(log::info);