Writing dense_vector Vectors to ElasticSearch: A Java Implementation

修破立生

Last modified: 2023-05-31 20:04:54

Category: ElasticSearch. Tags: elasticsearch, java, big data

First published: 2023-05-31 19:30:36

Article link: https://blog.youkuaiyun.com/weixin_47298890/article/details/130974378

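This article describes how to index vectors of type dense_vector in ElasticSearch with the Java High Level REST Client (RestHighLevelClient), covering index creation, indexing a single document, and bulk indexing. Sample code shows how to build and execute these operations.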

1. Introduction

This article shows how to write dense_vector vectors with the Java High Level REST Client, covering both single-document indexing and bulk indexing.

2. ElasticSearch Index Design

PUT caster_vector1
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 2
      },
      "my_text": {
        "type": "text"
      }
    }
  }
}
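
Because the mapping above fixes `dims` at 2, a document whose `my_vector` has a different number of elements is expected to be rejected at index time. A client-side check can catch such vectors before any request is sent. The following is a minimal sketch under that assumption; the class `VectorChecks` and the method `hasExpectedDims` are hypothetical helpers introduced here for illustration and are not part of the Elasticsearch client.

```java
final class VectorChecks {
    // Number of dimensions declared for my_vector in the mapping above.
    public static final int DIMS = 2;

    // Returns true when the vector is non-null, has exactly `dims` elements,
    // and every element is a finite number.
    public static boolean hasExpectedDims(double[] vector, int dims) {
        if (vector == null || vector.length != dims) {
            return false;
        }
        for (double v : vector) {
            if (Double.isNaN(v) || Double.isInfinite(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedDims(new double[]{1, -1}, DIMS));    // prints "true"
        System.out.println(hasExpectedDims(new double[]{1, -1, 0}, DIMS)); // prints "false"
    }
}
```

Running the check before building a request keeps malformed vectors out of the indexing path, which is especially useful for bulk requests where one bad document would otherwise only surface as an item-level failure.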

3. Indexing a Single Document

package com.example.elasticsearchdemo;

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class IndexDoc {
    public static void main(String[] args) throws IOException {
        // Create a RestHighLevelClient
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.209.3", 9200, "http")));

        // Define the index name and type name
        String indexName = "caster_vector1";
        String typeName = "_doc";

        // Define the document data
        Map<String, Object> document = new HashMap<>();
        document.put("my_text", "text8 window");
        document.put("my_vector", new double[]{1, -1});

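        // Create the IndexRequest and set the index name, type name, and document data.
        // Note: in Elasticsearch 7.x the type argument is deprecated; new IndexRequest(indexName) is the typeless form.
        IndexRequest indexRequest = new IndexRequest(indexName, typeName).source(document);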

        // Execute the index operation
        try {
            IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);
            System.out.println(indexResponse.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Close the RestHighLevelClient
        client.close();
    }
}

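In this example, we first create a RestHighLevelClient object and define the index name and type name. We then define the document data, using a Map to hold the key-value pairs.
Next, we create an IndexRequest object and set the index name, type name, and document data on it.
Finally, we execute the index operation and print the response. If indexing succeeds, the output looks similar to the following: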

IndexResponse[index=caster_vector1,type=_doc,id=hM43cYgBJNRf0nv1W09p,version=1,result=created,seqNo=6,primaryTerm=1,shards={"total":1,"successful":1,"failed":0}]
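
The Map-building step described above can be factored into a small helper so that every document uses the same field names as the mapping. This is a minimal sketch that depends only on the JDK; the class `VectorDocuments` and its `build` method are hypothetical names chosen for illustration, and the resulting map is what would be passed to `IndexRequest.source(...)`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class VectorDocuments {
    // Builds the source map for one document; field names match the mapping in section 2.
    public static Map<String, Object> build(String text, double[] vector) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("my_text", text);
        // Copy the array so later changes by the caller do not alter the document.
        document.put("my_vector", vector.clone());
        return document;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = build("text8 window", new double[]{1, -1});
        System.out.println(doc.get("my_text"));                                // prints "text8 window"
        System.out.println(Arrays.toString((double[]) doc.get("my_vector")));  // prints "[1.0, -1.0]"
    }
}
```

Keeping the field names in one place reduces the chance of a typo producing an unmapped field instead of a dense_vector value.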

4. Bulk Indexing Documents

package com.example.elasticsearchdemo;

import org.apache.http.HttpHost;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchIndexDoc {
    public static void main(String[] args) throws IOException {
        // Create a RestHighLevelClient
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.209.3", 9200, "http")));

        // Define the index name and type name
        String indexName = "caster_vector1";
        String typeName = "_doc";

        // Create the BulkRequest
        BulkRequest bulkRequest = new BulkRequest();

        // Define the documents to insert
        List<Map<String, Object>> documents = new ArrayList<>();
        Map<String, Object> document1 = new HashMap<>();
        document1.put("my_text","text8 window");
        document1.put("my_vector", new double[]{1, -1});
        documents.add(document1);

        Map<String, Object> document2 = new HashMap<>();
        document2.put("my_text","another text");
        document2.put("my_vector", new double[]{-1, 1});
        documents.add(document2);

        // Loop over the documents and add each one to the BulkRequest
        for (Map<String, Object> document : documents) {
            IndexRequest indexRequest = new IndexRequest(indexName, typeName).source(document);
            bulkRequest.add(indexRequest);
        }

        // Execute the bulk index operation
        try {
            BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            if (bulkResponse.hasFailures()) {
                for (BulkItemResponse item : bulkResponse.getItems()) {
                    if (item.isFailed()) {
                        System.out.println(item.getFailureMessage());
                    }
                }
            } else {
                System.out.println("All documents have been indexed.");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Close the RestHighLevelClient
        client.close();
    }
}
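
The example above places every document in a single BulkRequest. When the list of documents is large, one common approach is to split it into fixed-size batches and send one BulkRequest per batch. The following is a minimal sketch of the splitting step only, using just the JDK; the class `BatchPartitioner`, the method `partition`, and the batch size used in `main` are assumptions for illustration, not values taken from this article or from the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BatchPartitioner {
    // Splits items into consecutive batches of at most batchSize elements.
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(partition(ids, 2)); // prints "[[1, 2], [3, 4], [5]]"
    }
}
```

In the bulk example, each batch returned by `partition(documents, batchSize)` would be looped over in the same way as `documents` is, with a fresh BulkRequest created per batch.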