Spring AI and RAG in Practice: Building an Enterprise-Grade Intelligent Document Q&A System
Introduction
As artificial intelligence advances rapidly, enterprises have a growing need for intelligent document processing. Traditional document management systems typically offer only simple keyword search and cannot understand the intent behind natural-language queries. Spring AI combined with RAG (Retrieval-Augmented Generation) offers a new approach to intelligent document Q&A. This article explores how to build an efficient enterprise-grade intelligent document Q&A system with the Spring AI framework and RAG.
Technical Architecture Overview
Core Components
Our intelligent document Q&A system uses the following technology stack:
- Spring AI: the core AI framework, providing unified access to AI services
- RAG architecture: retrieval-augmented generation, combining semantic search with LLM generation
- Vector database: Milvus, for storing vectorized document representations
- Embedding model: text embedding services from OpenAI or Ollama
- Spring Boot: the backend service framework
- Redis: caching and session management
System Architecture Design

```
Client → Spring Boot Application → RAG Engine → Vector Database
                                        ↓               ↓
                                   LLM Service   Document Storage
```
Environment Setup and Dependency Configuration
Maven Dependency Configuration
First, add the necessary dependencies to pom.xml:
```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
    <dependency>
        <groupId>io.milvus</groupId>
        <artifactId>milvus-sdk-java</artifactId>
        <version>2.3.4</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>
</dependencies>
```
Configuration File Setup
Configure the relevant parameters in application.yml:
```yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      base-url: https://api.openai.com/v1
  data:
    redis:
      host: localhost
      port: 6379

milvus:
  host: localhost
  port: 19530
```
Core Functionality Implementation
1. Document Processing and Vectorization
First, implement document loading and vectorization:
```java
import java.util.Arrays;
import java.util.List;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class DocumentProcessor {

    @Autowired
    private EmbeddingClient embeddingClient;

    @Autowired
    private MilvusService milvusService;

    public void processDocument(String documentId, String content) {
        // Text preprocessing
        String processedContent = preprocessText(content);
        // Split into chunks
        List<String> chunks = splitIntoChunks(processedContent);
        // Vectorize each chunk and store it
        for (int i = 0; i < chunks.size(); i++) {
            // In Spring AI 0.8.x, EmbeddingClient#embed(String) returns List<Double>
            List<Double> embedding = embeddingClient.embed(chunks.get(i));
            milvusService.storeEmbedding(documentId, i, embedding, chunks.get(i));
        }
    }

    private String preprocessText(String text) {
        // Remove punctuation and normalize case; (?U) makes \w Unicode-aware
        // (the default \w is ASCII-only and would strip non-Latin text)
        return text.replaceAll("(?U)[^\\w\\s]", "").toLowerCase();
    }

    private List<String> splitIntoChunks(String text) {
        // Split by paragraph (blank lines); fixed-length chunking also works
        return Arrays.asList(text.split("\\n\\n"));
    }
}
```
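Paragraph splitting can produce very uneven chunks. A common alternative is fixed-size chunking with overlap, so that sentences cut at a chunk boundary still appear in the next chunk. A minimal sketch (the size and overlap values a caller passes are illustrative; `ChunkUtil` is not part of Spring AI):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkUtil {

    /**
     * Splits text into fixed-size chunks with the given overlap between
     * consecutive chunks, so context is not lost at chunk boundaries.
     */
    public static List<String> chunkWithOverlap(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // reached the end of the text
            }
        }
        return chunks;
    }
}
```

For example, a 10-character string with `chunkSize = 4` and `overlap = 2` yields four chunks, each sharing two characters with its neighbor.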
2. Semantic Retrieval Implementation
Implement semantic retrieval based on vector similarity:
```java
import java.util.List;

import lombok.Data;
import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class SemanticSearchService {

    @Autowired
    private EmbeddingClient embeddingClient;

    @Autowired
    private MilvusService milvusService;

    public List<SearchResult> search(String query, int topK) {
        // Vectorize the query text (List<Double> in Spring AI 0.8.x)
        List<Double> queryEmbedding = embeddingClient.embed(query);
        // Search the vector database for similar document chunks
        return milvusService.searchSimilar(queryEmbedding, topK);
    }
}

@Data
class SearchResult {
    private String documentId;
    private int chunkIndex;
    private String content;
    private float similarity;
}
```
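Milvus computes the similarity score internally, but it helps to know what the number means. With cosine similarity, a common choice of metric for text embeddings, identical directions score 1.0 and orthogonal vectors score 0. A minimal sketch (the `VectorMath` helper is illustrative, not part of the Milvus SDK):

```java
public class VectorMath {

    /** Cosine similarity between two equal-length vectors. */
    public static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```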
3. RAG Q&A Engine
Combine the retrieval results with LLM generation to produce an answer:
```java
import java.util.List;

import org.springframework.ai.chat.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class RAGQuestionAnsweringService {

    @Autowired
    private SemanticSearchService searchService;

    @Autowired
    private ChatClient chatClient;

    public String answerQuestion(String question) {
        // Retrieve the most relevant document chunks
        List<SearchResult> relevantDocs = searchService.search(question, 5);
        // Build the prompt from the retrieved context
        String prompt = buildPrompt(question, relevantDocs);
        // Call the LLM; in Spring AI 0.8.x, ChatClient#call(String)
        // returns the generated response text
        return chatClient.call(prompt);
    }

    private String buildPrompt(String question, List<SearchResult> docs) {
        StringBuilder context = new StringBuilder();
        context.append("Answer the question based on the following document content:\n\n");
        for (SearchResult doc : docs) {
            context.append("Document chunk: ").append(doc.getContent())
                   .append("\n\n");
        }
        context.append("Question: ").append(question).append("\n");
        context.append("Please answer accurately based on the document content above.");
        return context.toString();
    }
}
```
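LLMs have a finite context window, so the retrieved chunks should be trimmed to a budget before being stuffed into the prompt. A minimal character-budget sketch (a production implementation would count tokens with the model's tokenizer; the `ContextBudget` helper and its budget values are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class ContextBudget {

    /**
     * Keeps chunks in ranked order until the character budget is exhausted.
     * A chunk that would exceed the budget is skipped, but smaller
     * lower-ranked chunks may still fit.
     */
    public static List<String> trimToBudget(List<String> rankedChunks, int maxChars) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String chunk : rankedChunks) {
            if (used + chunk.length() > maxChars) {
                continue; // does not fit; try the next (possibly smaller) chunk
            }
            kept.add(chunk);
            used += chunk.length();
        }
        return kept;
    }
}
```

This would be applied to the search results between retrieval and `buildPrompt`.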
4. Conversation Memory Management
Implement context management for multi-turn conversations:
```java
import java.io.Serializable;
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import lombok.AllArgsConstructor;
import lombok.Data;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ConversationMemoryService {

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    private static final String CONVERSATION_PREFIX = "conv:";

    public void saveConversationTurn(String sessionId, String userQuery, String aiResponse) {
        Conversation conversation = getOrCreateConversation(sessionId);
        conversation.addTurn(new ConversationTurn(userQuery, aiResponse));
        // Save to Redis with an expiration time
        redisTemplate.opsForValue().set(
                CONVERSATION_PREFIX + sessionId,
                conversation,
                Duration.ofHours(2)
        );
    }

    public Conversation getConversationContext(String sessionId) {
        return (Conversation) redisTemplate.opsForValue()
                .get(CONVERSATION_PREFIX + sessionId);
    }

    private Conversation getOrCreateConversation(String sessionId) {
        Conversation existing = getConversationContext(sessionId);
        return existing != null ? existing : new Conversation();
    }
}

// Serializable so RedisTemplate's default JDK serializer can store these objects
@Data
class Conversation implements Serializable {
    private List<ConversationTurn> turns = new ArrayList<>();
    private LocalDateTime createdAt = LocalDateTime.now();

    public void addTurn(ConversationTurn turn) {
        turns.add(turn);
    }

    public String getContext() {
        return turns.stream()
                .map(turn -> "User: " + turn.getUserQuery() + "\nAI: " + turn.getAiResponse())
                .collect(Collectors.joining("\n\n"));
    }
}

@Data
@AllArgsConstructor
class ConversationTurn implements Serializable {
    private String userQuery;
    private String aiResponse;
}
```
REST API Design
Controller Implementation
```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import jakarta.servlet.http.HttpServletRequest;
import lombok.AllArgsConstructor;
import lombok.Data;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/rag")
public class RAGController {

    @Autowired
    private RAGQuestionAnsweringService qaService;

    @Autowired
    private ConversationMemoryService memoryService;

    @Autowired
    private DocumentProcessor documentProcessor;

    @PostMapping("/ask")
    public ResponseEntity<AnswerResponse> askQuestion(
            @RequestBody QuestionRequest request,
            HttpServletRequest httpRequest) {
        String sessionId = extractSessionId(httpRequest);
        // Load the conversation context for this session
        Conversation context = memoryService.getConversationContext(sessionId);
        String contextualQuestion = buildContextualQuestion(request.getQuestion(), context);
        // Generate the answer
        String answer = qaService.answerQuestion(contextualQuestion);
        // Persist this turn of the conversation
        memoryService.saveConversationTurn(sessionId, request.getQuestion(), answer);
        return ResponseEntity.ok(new AnswerResponse(answer));
    }

    @PostMapping("/documents")
    public ResponseEntity<String> uploadDocument(
            @RequestParam("file") MultipartFile file,
            @RequestParam("documentId") String documentId) {
        try {
            String content = new String(file.getBytes(), StandardCharsets.UTF_8);
            // Vectorize and index the uploaded document
            documentProcessor.processDocument(documentId, content);
            return ResponseEntity.ok("Document uploaded successfully");
        } catch (IOException e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("Document processing failed");
        }
    }

    private String extractSessionId(HttpServletRequest request) {
        // Extract the session id from the cookie or a header
        return request.getSession().getId();
    }

    private String buildContextualQuestion(String question, Conversation context) {
        if (context == null || context.getTurns().isEmpty()) {
            return question;
        }
        return "Previous conversation context:\n" + context.getContext() +
                "\n\nCurrent question: " + question;
    }
}

@Data
class QuestionRequest {
    private String question;
}

@Data
@AllArgsConstructor
class AnswerResponse {
    private String answer;
}
```
Performance Optimization Strategies
1. Caching Optimization
```java
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .maximumSize(1000));
        return cacheManager;
    }
}

@Service
public class CachedEmbeddingService {

    @Autowired
    private EmbeddingClient embeddingClient;

    @Cacheable(value = "embeddings", key = "#text")
    public List<Double> getCachedEmbedding(String text) {
        return embeddingClient.embed(text);
    }
}
```
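Using the raw text as a cache key works, but long chunks make for unwieldy keys. A common alternative is to key on a content hash instead; a minimal sketch using the JDK's MessageDigest (the `CacheKeys` helper name is illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CacheKeys {

    /** SHA-256 hex digest of the text, suitable as a compact cache key. */
    public static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b)); // two hex chars per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

In the `@Cacheable` annotation this could be wired in via a SpEL expression such as `key = "T(com.example.CacheKeys).sha256Hex(#text)"`, assuming that package name.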
2. Batch Processing Optimization
```java
@Service
public class BatchProcessingService {

    @Async
    public CompletableFuture<Void> processDocumentsBatch(List<Document> documents) {
        documents.parallelStream().forEach(this::processDocument);
        return CompletableFuture.completedFuture(null);
    }

    private void processDocument(Document doc) {
        // Document processing logic
    }
}
```
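Embedding APIs are usually cheaper to call with a batch of texts than one text at a time. Partitioning the input into fixed-size batches is a simple building block for that; a minimal sketch (the `BatchUtil` helper and the batch size a caller picks are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUtil {

    /** Partitions a list into consecutive batches of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```

Each batch of chunk texts can then be embedded in a single call (Spring AI 0.8.x also exposes an `embed(List<String>)` overload on EmbeddingClient for this).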
Monitoring and Logging
Integrating Micrometer Monitoring
```java
@Configuration
public class MonitoringConfig {

    @Bean
    public MeterRegistry meterRegistry() {
        return new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
    }

    @Bean
    public TimedAspect timedAspect(MeterRegistry registry) {
        return new TimedAspect(registry);
    }
}

@Service
public class MonitoringService {

    private final Counter questionCounter;
    private final Timer responseTimer;

    public MonitoringService(MeterRegistry registry) {
        questionCounter = registry.counter("rag.questions.total");
        responseTimer = registry.timer("rag.response.time");
    }

    @Timed(value = "rag.process.question")
    public void recordQuestionProcessing() {
        questionCounter.increment();
    }
}
```
Security Considerations
API Security Hardening
```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configurers.AbstractHttpConfigurer;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(authz -> authz
                .requestMatchers("/api/rag/**").authenticated()
                .anyRequest().permitAll())
            // Validate JWT bearer tokens issued by the configured authorization server
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(Customizer.withDefaults()))
            // Disable CSRF for this stateless JSON API
            .csrf(AbstractHttpConfigurer::disable);
        return http.build();
    }
}
```
Deployment and Scaling
Docker Containerized Deployment
```dockerfile
FROM openjdk:17-jdk-slim
WORKDIR /app
COPY target/rag-system.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```
Kubernetes Deployment Configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rag-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rag-system
  template:
    metadata:
      labels:
        app: rag-system
    spec:
      containers:
        - name: rag-app
          image: rag-system:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
---
apiVersion: v1
kind: Service
metadata:
  name: rag-service
spec:
  selector:
    app: rag-system
  ports:
    - port: 80
      targetPort: 8080
```
Summary and Outlook
This article presented a complete approach to building an enterprise-grade intelligent document Q&A system with Spring AI and RAG. By combining semantic retrieval, a vector database, and a large language model, the system can understand natural-language queries and answer them accurately from actual document content.
Key Advantages
- High accuracy: the RAG architecture grounds answers in actual document content, reducing AI hallucinations
- Strong extensibility: the modular design makes it easy to integrate new document types and AI models
- Good performance: caching and batch processing keep response times low
- Enterprise-grade features: security, monitoring, and session management needed in production
Future Improvements
- Support multimodal document processing (images, tables, etc.)
- Implement more sophisticated agent workflows
- Integrate additional vector database options
- Refine prompt engineering and model fine-tuning strategies
This solution gives enterprises powerful intelligent document processing capabilities and can significantly improve knowledge management efficiency and user experience.