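Final result

This article presents a method for analyzing slow queries in Elasticsearch (ES), where vector search over tens of millions of vectors takes minutes, and then describes an optimization. By using memory to accelerate search, query latency drops from minutes to milliseconds.

The drawback of this approach is its heavy dependence on server memory.

Main issue: drop the knn plugin. When it performs ANN search, building the query takes a long time.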

1. Background

1.1 Resource background

Elasticsearch 8.8

2 ES nodes; 31 GB heap; ample server memory (100+ GB); HDD disks

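This optimization was carried out after a force merge; without the force merge, the results would be worse. Even after the force merge, query latency still did not meet the requirement.

1.2 Data background

17.99 million documents with 768-dimensional vectors (300 GB without replicas, 10 shards).

ANN search is run against this data. The search request is shown in 2.1.

knn parameter: "num_candidates": 100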


The query is slow and returns no response; it takes more than 1 minute.
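To give a feel for why memory matters at this data size, here is a rough back-of-the-envelope estimate of the raw vector footprint implied by the figures in 1.2. It is only a sketch: it assumes the vectors are stored as 4-byte floats, and it ignores the HNSW graph, the other fields, and any index overhead. The function name `raw_vector_bytes` is made up for illustration.

```python
# Rough estimate of raw vector storage for the data set described in 1.2.
# Assumption: each dimension is a 4-byte float (not stated in the article).

NUM_VECTORS = 17_990_000   # "1799w" documents
DIMS = 768                 # vector dimensions
BYTES_PER_FLOAT32 = 4      # assumed element size


def raw_vector_bytes(num_vectors: int, dims: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Return the size in bytes of the raw vector values only."""
    return num_vectors * dims * bytes_per_value


total = raw_vector_bytes(NUM_VECTORS, DIMS)
print(f"raw vectors: {total / 2**30:.1f} GiB")
```

Under these assumptions the raw vectors alone come to roughly 51.5 GiB, which has to be compared with the memory left over after the 31 GB heap is set aside.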


2. Locating the problem

2.1 Search request

The vector data has been removed for readability.

GET tilake_vectors-000003/_search?max_concurrent_shard_requests=30&human=true
{
  "profile": true, 
  "knn": {
    "field": "content_vector",
    "filter": {
      "bool": {
        "must": [
          {
            "terms": {
              "session_id": [
                "institute"
              ]
            }
          },
          {
            "term": {
              "vectorization_method": "title+content"
            }
          }
        ]
      }
    },
    "query_vector": [],
    "k": 10,
    "num_candidates": 10
  },
  "size": 0
}
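Because the request sets "profile": true, the response carries per-shard timings. The sketch below shows one way to pull the kNN rewrite time out of such a response. It assumes the response has the shape seen in 2.2 (`profile.shards[].dfs.knn[].rewrite_time`, in nanoseconds); the helper name `knn_rewrite_seconds` is hypothetical, and the sample dictionary is a cut-down copy of the values from 2.2 rather than a full response.

```python
def knn_rewrite_seconds(response: dict) -> dict:
    """Map each shard id to its total kNN rewrite time in seconds.

    Assumes rewrite_time is reported in nanoseconds under
    profile.shards[].dfs.knn[], as in the profile output in 2.2.
    """
    result = {}
    for shard in response.get("profile", {}).get("shards", []):
        knn_entries = shard.get("dfs", {}).get("knn", [])
        total_ns = sum(entry.get("rewrite_time", 0) for entry in knn_entries)
        result[shard["id"]] = total_ns / 1e9
    return result


# Cut-down sample using the shard id and rewrite_time from 2.2.
sample = {
    "took": 10006,
    "profile": {
        "shards": [
            {
                "id": "[oooFp749QMWECSF0qyMaIA][tilake_vectors-000003][1]",
                "dfs": {"knn": [{"rewrite_time": 9320075980}]},
            }
        ]
    },
}

for shard_id, seconds in knn_rewrite_seconds(sample).items():
    print(f"{shard_id}: {seconds:.2f} s spent in kNN rewrite")
```

For the sample, the rewrite step accounts for about 9.32 s of the 10006 ms reported in "took", while the other timings in the same profile are in the microsecond range.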

2.2 Profile result of the search request

{
  "took": 10006,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "profile": {
    "shards": [
      {
        "id": "[oooFp749QMWECSF0qyMaIA][tilake_vectors-000003][1]",
        "dfs": {
          "statistics": {
            "type": "statistics",
            "description": "collect term statistics",
            "time": "6.9micros",
            "time_in_nanos": 6923,
            "breakdown": {
              "term_statistics": 0,
              "collection_statistics": 0,
              "collection_statistics_count": 0,
              "create_weight": 4668,
              "term_statistics_count": 0,
              "rewrite_count": 0,
              "create_weight_count": 1,
              "rewrite": 0
            }
          },
          "knn": [
            {
              "query": [
                {
                  "type": "DocAndScoreQuery",
                  "description": "DocAndScore[10]",
                  "time": "6.5micros",
                  "time_in_nanos": 6587,
                  "breakdown": {
                    "set_min_competitive_score_count": 0,
                    "match_count": 0,
                    "shallow_advance_count": 0,
                    "set_min_competitive_score": 0,
                    "next_doc": 916,
                    "match": 0,
                    "next_doc_count": 10,
                    "score_count": 10,
                    "compute_max_score_count": 0,
                    "compute_max_score": 0,
                    "advance": 524,
                    "advance_count": 1,
                    "count_weight_count": 0,
                    "score": 1228,
                    "build_scorer_count": 2,
                    "create_weight": 1228,
                    "shallow_advance": 0,
                    "count_weight": 0,
                    "create_weight_count": 1,
                    "build_scorer": 2691
                  }
                }
              ],
              "rewrite_time": 9320075980,
              "collector": [
                {
                  "name": "SimpleTopScoreDocCollector",
                  "reason": "search_top_hits",
                  "time": "10.4micros",
                  "time_in_nanos": 10460
                }
              ]
            }
          ]
        },
        "searches": [
          {
            "query": [
              {
                "type": "ConstantScoreQuery",
                "description": "ConstantScore(ScoreAndDocQuery)",
                "time": "49.4micros",
                "time_in_nanos": 49494,
                "breakdown": {
                  "set_min_competitive_score_count": 0,
                  "match_count": 0,
                  "shallow_advance_count": 0,
                  "set_min_competitive_score": 0,
                  "next_doc": 0,
                  "match": 0,
                  "next_doc_count": 0,
                  "score_count": 0,
                  "compute_max_score_count": 0,
                  "compute_max_score": 0,
                  "advance": 0,
                  "advance_count": 0,
                  "count_weight_count": 1,
                  "score": 0,
                  &