Spring AI Advanced Path 03: Integrating RAG to Build an Efficient Knowledge Base

Latest recommended article published on 2025-11-01 15:31:58

Original article

License: CC 4.0 BY-SA

Tags:

#spring #artificial intelligence #java #spring boot #python #programming languages #backend

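Introduction

In the previous two articles, we integrated an LLM into Spring Boot and implemented a smooth streaming chat experience. A core problem soon becomes apparent, though: a general-purpose large model is broadly knowledgeable, but it knows nothing about our private-domain knowledge (internal product documentation, technical manuals, personal notes, and so on). It cannot answer specific questions such as "What features does our latest XX product have?"

Solving this problem is the job of today's main subject: RAG (Retrieval-Augmented Generation).

Put simply, RAG is like plugging a "knowledge USB drive" into the large model. During a conversation, we first retrieve the information most relevant to the question from our own knowledge base (a vector database), then feed that information to the model together with the original question, so that it generates an accurate answer grounded in this "reference material".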
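The "feed the retrieved information together with the question" step can be illustrated with a small plain-Java sketch. This is not Spring AI's implementation: the class name `RagPromptSketch`, the method `buildAugmentedPrompt`, the prompt wording, and the sample snippets are all hypothetical and only show the shape of the idea.

```java
import java.util.List;

class RagPromptSketch {

    /** Builds one prompt that places the retrieved snippets ahead of the user's question. */
    static String buildAugmentedPrompt(String question, List<String> retrievedSnippets) {
        if (retrievedSnippets.isEmpty()) {
            // Nothing was retrieved, so fall back to the plain question.
            return question;
        }
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the reference material below.\n\n");
        prompt.append("Reference material:\n");
        for (int i = 0; i < retrievedSnippets.size(); i++) {
            // Number each snippet so the model can tell them apart.
            prompt.append(i + 1).append(". ").append(retrievedSnippets.get(i)).append('\n');
        }
        prompt.append("\nQuestion: ").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for snippets returned by the vector database.
        List<String> snippets = List.of(
                "Product XX supports offline mode.",
                "Product XX ships with a REST API.");
        System.out.println(buildAugmentedPrompt("What features does product XX have?", snippets));
    }
}
```

When no snippets are found, the sketch simply passes the question through unchanged, which mirrors a plain "general mode" request.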

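This article walks through the whole process step by step: setting up a vector database, feeding private documents to the AI, and finally building an intelligent chatbot that can switch between "general mode" and "knowledge base mode" at any time.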

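Environment Setup

To implement RAG, we first need a place to store "knowledge": a vector database. It is designed to store the vectors (arrays of numbers) that text is converted into, and to run similarity searches over them efficiently.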
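To make "similarity search" concrete, here is a minimal in-memory sketch in plain Java that ranks stored vectors by cosine similarity to a query vector. It is only an illustration of the idea, not how Redis performs the search: the class `VectorSearchSketch`, its methods, and the two-dimensional toy vectors are assumptions made for this example (real embeddings have hundreds of dimensions).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class VectorSearchSketch {

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the keys of the k stored vectors most similar to the query, best match first. */
    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        List<String> keys = new ArrayList<>(store.keySet());
        // Sort by similarity to the query, highest first.
        keys.sort(Comparator.comparingDouble((String key) -> cosine(store.get(key), query)).reversed());
        return keys.subList(0, Math.min(k, keys.size()));
    }

    public static void main(String[] args) {
        // Toy vectors; in a real system these come from an Embedding model.
        Map<String, double[]> store = Map.of(
                "redis-doc", new double[]{1.0, 0.0},
                "docker-doc", new double[]{0.0, 1.0},
                "spring-doc", new double[]{0.9, 0.1});
        System.out.println(topK(store, new double[]{1.0, 0.0}, 2));
    }
}
```

A vector database does the same ranking, but with an index that avoids comparing the query against every stored vector.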

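There are many options on the market, such as Chroma and Milvus. Here we choose Redis Stack: it is the Redis we already know, and its bundled RediSearch module adds vector storage and search capabilities, which makes it very friendly for Java developers.

We use Docker to start it quickly, in two steps:

1. Pull the latest image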

docker pull redis/redis-stack:latest

2. Run the container

docker run -d --name redis-stack -p 9379:6379 -e REDIS_ARGS="--requirepass 123456" redis/redis-stack:latest

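Note: -p 9379:6379 maps the container's port 6379 to port 9379 on the host, which avoids conflicts with other local Redis instances. The -e option sets the REDIS_ARGS environment variable, whose --requirepass 123456 argument sets the access password to 123456.

Once the container is running, the vector database is ready!

Project Integration

1. Add the Maven dependencies

Next, we add the required dependencies to the Spring Boot project and configure them.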

<!-- Spring AI Redis vector store Starter -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-redis</artifactId>
</dependency>

<!-- Spring AI embedded (local) Embedding model Starter -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-transformers</artifactId>
</dependency>


<!-- Spring AI document reader, used to parse documents in many formats -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>

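Dependency notes:

spring-ai-starter-vector-store-redis: the core dependency, providing everything needed to interact with the Redis vector database.
spring-ai-starter-model-transformers: this dependency is essential. It automatically downloads and runs a local Embedding model, which converts our text (for example "What is RAG?") into numeric vectors. This means we do not need to call an external API for vectorization; everything is processed locally.
spring-ai-tika-document-reader: a powerful document parsing tool that lets the application read files in many formats directly, such as .txt, .pdf, and .docx.

2. Add the application.yml configuration

In application.yml, add the configuration for the Redis vector database: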

spring:
  data:
    redis:
      host: 127.0.0.1
      port: 9379
      password: 123456
  ai:
    vectorstore:
      redis:
        initialize-schema: true # whether to initialize the required schema
        index-name: lee-vectorstore # name of the index used to store vectors
        prefix: 'lee:' # prefix for the Redis keys

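With this configuration in place, Spring AI connects to Redis automatically at application startup and creates the index used to store vectors according to these settings.

Building the Knowledge Base Upload and Processing Flow

Everything is ready, so let's start writing code to upload, parse, split, and vectorize documents for storage.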
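Before looking at the real code, the "split" step can be pictured with a minimal plain-Java sketch: a long document is cut into fixed-size chunks, with a small overlap so that a sentence straddling a boundary still appears whole in at least one chunk. The class `ChunkingSketch`, the character-based sizing, and the parameter names are assumptions for illustration only; Spring AI ships its own splitters (for example `TokenTextSplitter`), which work on tokens rather than characters.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkingSketch {

    /**
     * Splits text into chunks of at most chunkSize characters.
     * Each chunk starts (chunkSize - overlap) characters after the previous one,
     * so consecutive full-size chunks share `overlap` characters.
     */
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk has been emitted
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tiny sizes so the overlap is easy to see; real chunks are far larger.
        for (String chunk : split("abcdefghij", 4, 1)) {
            System.out.println(chunk);
        }
    }
}
```

Each chunk would then be converted to a vector by the Embedding model and stored in Redis. Note that splitting by `char` can cut through a surrogate pair or a sentence, which is one reason production splitters are more careful.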

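1. Front-end changes: add an upload entry point and a mode switch

First, we modify the front-end page to add two core features: a file upload button for submitting knowledge base documents, and a "Knowledge Base" switch for toggling freely between ordinary chat and RAG chat.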

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>RAG-Enhanced Streaming Chat</title>
    <style>
        body {
            font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
            background-color: #f4f7f9;
            margin: 0;
            display: flex;
            justify-content: center;
            align-items: center;
            height: 100vh;
        }
        .chat-container {
            width: 90%;
            max-width: 800px;
            height: 90vh;
            background-color: #fff;
            border-radius: 12px;
            box-shadow: 0 4px 20px rgba(0, 0, 0, 0.1);
            display: flex;
            flex-direction: column;
            overflow: hidden;
            position: relative;
        }
        .chat-header {
            background-color: #4a90e2;
            color: white;
            padding: 16px;
            font-size: 1.2em;
            text-align: center;
            font-weight: bold;
        }
        .chat-messages {
            flex-grow: 1;
            padding: 20px;
            overflow-y: auto;
            display: flex;
            flex-direction: column;
            gap: 15px;
        }
        .message {
            padding: 12px 18px;
            border-radius: 18px;
            max-width: 75%;
            line-height: 1.5;
            word-wrap: break-word;
        }
        .user-message {
            background-color: #dcf8c6;
            align-self: flex-end;
            border-bottom-right-radius: 4px;
        }
        .bot-message {
            background-color: #e9e9eb;
            align-self: flex-start;
            border-bottom-left-radius: 4px;
        }
        .chat-input-area {
            display: flex;
            padding: 15px;
            border-top: 1px solid #e0e0e0;
            background-color: #f9f9f9;
            align-items: center;
        }
        #message-input {
            flex-grow: 1;
            padding: 12px;
            border: 1px solid #ccc;
            border-radius: 20px;
            resize: none;
            font-size: 1em;
            margin-right: 10px;
        }
        #send-button {
            padding: 12px 25px;
            border: none;
            background-color: #4a90e2;
            color
