Spring AI and RAG in Practice: Building an Enterprise-Grade Intelligent Document Q&A System
Introduction
With the rapid development of artificial intelligence, enterprises have a growing need for intelligent document processing. Traditional document retrieval is often inefficient and struggles to deliver accurate information quickly. Spring AI combined with RAG (Retrieval-Augmented Generation) gives enterprises a powerful toolkit for building intelligent document Q&A systems. This article walks through how to use the Spring AI framework and RAG to build an efficient, enterprise-grade document Q&A system.
Technology Stack Overview
The Spring AI Framework
Spring AI is the AI integration framework of the Spring ecosystem. It provides a unified API for accessing a variety of AI models and services; a minimal usage sketch follows this list. Its main features include:
- Model abstraction layer: uniform access to mainstream AI services such as OpenAI, Azure OpenAI, and Google AI
- Prompt engineering support: built-in prompt templates and variable substitution
- Vectorization integration: support for multiple vector databases and embedding models
- Standardized tool calling: a unified framework for tool execution
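As a minimal illustration of the model abstraction layer (a sketch only, assuming the same ChatClient.call(String) style used in the generation module later in this article), the calling code stays identical no matter which provider is configured:

@Service
public class SimpleChatService {

    @Autowired
    private ChatClient chatClient;

    // The concrete provider (OpenAI, Azure OpenAI, ...) is selected purely by configuration;
    // this code does not change when the backing model changes.
    public String summarize(String text) {
        return chatClient.call("Summarize the following text in one sentence:\n" + text);
    }
}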
How RAG Works
RAG (Retrieval-Augmented Generation) combines the strengths of information retrieval and text generation; the sketch after this list ties the three stages together:
- Retrieval stage: retrieve document fragments relevant to the question from the knowledge base
- Augmentation stage: combine the retrieved information with the user's question
- Generation stage: generate an accurate answer based on the augmented context
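The three stages map naturally onto a single pipeline method. The sketch below is purely illustrative and uses hypothetical Retriever and Generator interfaces rather than any concrete Spring AI type:

import java.util.List;

public class RagPipeline {

    // Hypothetical stage interfaces, used only to illustrate the flow.
    interface Retriever { List<String> retrieve(String query, int topK); }
    interface Generator { String generate(String prompt); }

    private final Retriever retriever;
    private final Generator generator;

    public RagPipeline(Retriever retriever, Generator generator) {
        this.retriever = retriever;
        this.generator = generator;
    }

    public String answer(String question) {
        // 1. Retrieval: fetch the most relevant chunks from the knowledge base
        List<String> chunks = retriever.retrieve(question, 5);
        // 2. Augmentation: combine the retrieved context with the user question
        String prompt = "Context:\n" + String.join("\n\n", chunks)
                + "\n\nQuestion: " + question + "\n\nAnswer:";
        // 3. Generation: let the model answer based on the augmented prompt
        return generator.generate(prompt);
    }
}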
System Architecture Design
Overall Architecture
User interface layer → API gateway layer → Business logic layer → Data access layer
↓
Vector database
↓
Document storage
Core Components
1. Document Processing Module
@Component
public class DocumentProcessor {

    @Autowired
    private EmbeddingModel embeddingModel;

    public List<DocumentChunk> processDocument(MultipartFile file) {
        // Parse the document, split it into chunks, and embed each chunk
        List<String> chunks = splitDocument(file);
        List<Embedding> embeddings = embeddingModel.embed(chunks);
        return createDocumentChunks(chunks, embeddings);
    }

    private List<String> splitDocument(MultipartFile file) {
        // Chunking logic goes here; placeholder chunks are returned for brevity
        return Arrays.asList("chunk1", "chunk2", "chunk3");
    }
}
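The splitDocument method above only returns placeholder chunks. A minimal fixed-size chunking strategy with overlap might look like the following sketch (the class name and the chunk size and overlap values are illustrative; production systems usually split on sentence or token boundaries):

import java.util.ArrayList;
import java.util.List;

public class FixedSizeSplitter {

    // Illustrative defaults; tune chunk size and overlap for your embedding model.
    private static final int CHUNK_SIZE = 800;
    private static final int OVERLAP = 100;

    public List<String> split(String text) {
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + CHUNK_SIZE, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break;
            }
            // Step forward by chunk size minus overlap so adjacent chunks share some context
            start = end - OVERLAP;
        }
        return chunks;
    }
}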
2. Vector Retrieval Module
@Service
public class VectorSearchService {

    @Autowired
    private VectorStore vectorStore;

    @Autowired
    private EmbeddingModel embeddingModel;

    public List<SearchResult> searchSimilarDocuments(String query, int topK) {
        // Embed the query, then return the topK most similar stored chunks
        Embedding queryEmbedding = embeddingModel.embed(query);
        return vectorStore.similaritySearch(queryEmbedding, topK);
    }
}
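Under the hood, similarity search ranks stored vectors by a distance metric against the query embedding. Cosine similarity, the metric behind the HNSW index configured later in this article, can be computed as in this self-contained sketch:

public class CosineSimilarity {

    // Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
    public static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}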
3. AI Generation Module
@Service
public class AIGenerationService {

    @Autowired
    private ChatClient chatClient;

    public String generateAnswer(String question, List<SearchResult> context) {
        String prompt = buildRAGPrompt(question, context);
        return chatClient.call(prompt);
    }

    private String buildRAGPrompt(String question, List<SearchResult> context) {
        StringBuilder promptBuilder = new StringBuilder();
        promptBuilder.append("Based on the following context, please answer the user's question:\n\n");
        for (SearchResult result : context) {
            promptBuilder.append("Context: ").append(result.getContent()).append("\n\n");
        }
        promptBuilder.append("Question: ").append(question).append("\n\n");
        promptBuilder.append("Answer:");
        return promptBuilder.toString();
    }
}
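Putting the two services together, answering a question is just a retrieval call followed by a generation call. The fragment below is purely illustrative and assumes both services have already been injected:

// Retrieve the 5 most relevant chunks, then generate an answer grounded in them
List<SearchResult> context = vectorSearchService.searchSimilarDocuments(question, 5);
String answer = aiGenerationService.generateAnswer(question, context);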
Hands-On Development Steps
1. Environment Setup
Maven dependency configuration
<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>
2. Application Configuration
application.yml configuration
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
    vectorstore:
      pgvector:
        enabled: true
        index-type: HNSW
        dimensions: 1536
  datasource:
    url: jdbc:postgresql://localhost:5432/vector_db
    username: postgres
    password: password
3. Core Business Logic
Document upload endpoint
@RestController
@RequestMapping("/api/documents")
public class DocumentController {

    @Autowired
    private DocumentService documentService;

    @PostMapping("/upload")
    public ResponseEntity<String> uploadDocument(
            @RequestParam("file") MultipartFile file,
            @RequestParam("category") String category) {
        try {
            documentService.processAndStoreDocument(file, category);
            return ResponseEntity.ok("Document uploaded successfully");
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("Document processing failed: " + e.getMessage());
        }
    }
}
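The controller delegates to a DocumentService that is not shown above. A plausible sketch, assuming the DocumentProcessor from earlier plus a hypothetical DocumentChunkRepository and a category field on DocumentChunk, is:

@Service
public class DocumentService {

    @Autowired
    private DocumentProcessor documentProcessor;

    @Autowired
    private DocumentChunkRepository chunkRepository; // hypothetical persistence layer

    public void processAndStoreDocument(MultipartFile file, String category) {
        // Split and embed the uploaded file, then persist the chunks with their category
        List<DocumentChunk> chunks = documentProcessor.processDocument(file);
        chunks.forEach(chunk -> chunk.setCategory(category)); // assumes a category field on DocumentChunk
        chunkRepository.saveAll(chunks);
    }
}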
Q&A endpoint
@RestController
@RequestMapping("/api/qa")
public class QAController {

    @Autowired
    private QAService qaService;

    @PostMapping("/ask")
    public ResponseEntity<AnswerResponse> askQuestion(
            @RequestBody QuestionRequest request) {
        try {
            AnswerResponse response = qaService.answerQuestion(request.getQuestion());
            return ResponseEntity.ok(response);
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body(new AnswerResponse("The system is busy, please try again later"));
        }
    }
}
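QAService is referenced but not defined in this article. A minimal sketch that wires the retrieval and generation services together (AnswerResponse is assumed to be a simple DTO wrapping the answer text) could look like:

@Service
public class QAService {

    @Autowired
    private VectorSearchService vectorSearchService;

    @Autowired
    private AIGenerationService aiGenerationService;

    public AnswerResponse answerQuestion(String question) {
        // Retrieve supporting context, then generate an answer grounded in it
        List<SearchResult> context = vectorSearchService.searchSimilarDocuments(question, 5);
        String answer = aiGenerationService.generateAnswer(question, context);
        return new AnswerResponse(answer);
    }
}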
4. Advanced Features
Conversation memory management
@Component
public class ConversationMemoryManager {

    private final Map<String, List<ChatMessage>> conversationMemories = new ConcurrentHashMap<>();

    public void addMessage(String sessionId, ChatMessage message) {
        List<ChatMessage> history = conversationMemories
                .computeIfAbsent(sessionId, k -> new ArrayList<>());
        history.add(message);
        // Cap the history length to avoid exceeding the model's token limit
        while (history.size() > 10) {
            history.remove(0);
        }
    }

    public List<ChatMessage> getConversationHistory(String sessionId) {
        return conversationMemories.getOrDefault(sessionId, new ArrayList<>());
    }
}
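To actually use the stored history, it has to be folded into the prompt before each generation call. One hedged way to do that is sketched below, assuming ChatMessage exposes getRole() and getContent() accessors:

@Component
public class HistoryAwarePromptBuilder {

    @Autowired
    private ConversationMemoryManager memoryManager;

    // Replays earlier turns so the model can resolve follow-up questions,
    // then appends the new user question.
    public String buildPrompt(String sessionId, String question) {
        StringBuilder prompt = new StringBuilder();
        for (ChatMessage message : memoryManager.getConversationHistory(sessionId)) {
            // getRole()/getContent() are assumed accessors on the ChatMessage type used above
            prompt.append(message.getRole()).append(": ")
                  .append(message.getContent()).append("\n");
        }
        prompt.append("user: ").append(question).append("\n");
        return prompt.toString();
    }
}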
Intelligent Agent implementation
@Service
public class DocumentAgent {

    @Autowired
    private VectorSearchService searchService;

    @Autowired
    private AIGenerationService generationService;

    @Autowired
    private ToolExecutionFramework toolFramework;

    public String processQuery(String query, String sessionId) {
        // 1. Retrieve relevant documents
        List<SearchResult> relevantDocs = searchService.searchSimilarDocuments(query, 5);
        // 2. Decide whether a tool call is needed
        if (requiresToolExecution(query)) {
            return toolFramework.executeTool(query, relevantDocs);
        }
        // 3. Generate the answer
        return generationService.generateAnswer(query, relevantDocs);
    }

    private boolean requiresToolExecution(String query) {
        // Simple keyword heuristic for deciding when to route the query to a tool
        return query.contains("calculate") || query.contains("look up") || query.contains("fetch");
    }
}
Performance Optimization Strategies
1. Vector index optimization
-- Create an efficient HNSW index
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);
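To see the index in action, a raw similarity query can be issued directly: pgvector's <=> operator computes cosine distance, which the HNSW index above accelerates. A hedged JdbcTemplate sketch follows; the documents table and embedding column match the DDL above, while the content column and the class itself are assumptions for illustration:

@Repository
public class RawVectorQuery {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    // queryVector is rendered as a pgvector literal, e.g. "[0.12,0.05,...]"
    public List<String> topKContent(String queryVector, int k) {
        return jdbcTemplate.queryForList(
                "SELECT content FROM documents ORDER BY embedding <=> ?::vector LIMIT ?",
                String.class,
                queryVector, k);
    }
}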
2. Caching strategy
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .expireAfterWrite(30, TimeUnit.MINUTES)
                .maximumSize(1000));
        return cacheManager;
    }
}

@Service
public class CachedSearchService {

    @Autowired
    private VectorSearchService searchService;

    // Include topK in the cache key so different result sizes are cached separately
    @Cacheable(value = "searchResults", key = "#query + ':' + #topK")
    public List<SearchResult> searchWithCache(String query, int topK) {
        return searchService.searchSimilarDocuments(query, topK);
    }
}
3. Batch processing optimization
@Async
public CompletableFuture<Void> batchProcessDocuments(List<MultipartFile> files) {
    // @Async already runs this method on a separate thread, so the work is done here
    // directly rather than being wrapped in another CompletableFuture.runAsync
    files.parallelStream().forEach(file -> {
        try {
            processAndStoreDocument(file, "batch");
        } catch (Exception e) {
            log.error("Document processing failed: {}", file.getOriginalFilename(), e);
        }
    });
    return CompletableFuture.completedFuture(null);
}
Error Handling and Monitoring
1. Unified exception handling
@ControllerAdvice
public class GlobalExceptionHandler {

    @ExceptionHandler(AIException.class)
    public ResponseEntity<ErrorResponse> handleAIException(AIException ex) {
        ErrorResponse error = new ErrorResponse("AI service error", ex.getMessage());
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE).body(error);
    }

    @ExceptionHandler(VectorStoreException.class)
    public ResponseEntity<ErrorResponse> handleVectorStoreException(VectorStoreException ex) {
        ErrorResponse error = new ErrorResponse("Vector store error", ex.getMessage());
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(error);
    }
}
2. Monitoring metrics
@Component
public class SystemMetrics {

    private final MeterRegistry meterRegistry;

    public SystemMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    public void recordSearchLatency(long milliseconds) {
        meterRegistry.timer("rag.search.latency")
                .record(milliseconds, TimeUnit.MILLISECONDS);
    }

    public void recordGenerationLatency(long milliseconds) {
        meterRegistry.timer("rag.generation.latency")
                .record(milliseconds, TimeUnit.MILLISECONDS);
    }
}
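As a usage sketch, the latency can be captured around the actual retrieval call (the wrapper class is illustrative, not part of the system above):

@Service
public class TimedSearchService {

    @Autowired
    private VectorSearchService searchService;

    @Autowired
    private SystemMetrics systemMetrics;

    public List<SearchResult> timedSearch(String query, int topK) {
        long start = System.currentTimeMillis();
        try {
            return searchService.searchSimilarDocuments(query, topK);
        } finally {
            // Record how long the vector search took, in milliseconds
            systemMetrics.recordSearchLatency(System.currentTimeMillis() - start);
        }
    }
}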
Deployment and Operations
Docker containerized deployment
FROM openjdk:17-jdk-slim
WORKDIR /app
COPY target/document-qa-system.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
Kubernetes deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: document-qa-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: document-qa-system
  template:
    metadata:
      labels:
        app: document-qa-system
    spec:
      containers:
        - name: app
          image: document-qa-system:latest
          resources:
            limits:
              memory: "1Gi"
              cpu: "500m"
          env:
            - name: SPRING_AI_OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-secret
                  key: api-key
---
apiVersion: v1
kind: Service
metadata:
  name: document-qa-service
spec:
  selector:
    app: document-qa-system
  ports:
    - port: 8080
      targetPort: 8080
Best Practices
1. Prompt engineering optimization
public class OptimizedPromptTemplate {

    private static final String SYSTEM_PROMPT = """
            You are a professional document Q&A assistant. Answer the question based on the provided context.
            If the context is insufficient to answer the question, say so honestly.
            Answers should be accurate, concise, and professional.
            """;

    public String buildOptimizedPrompt(String question, List<SearchResult> context) {
        return String.format("""
                %s
                Context:
                %s
                Question: %s
                Answer:
                """, SYSTEM_PROMPT, formatContext(context), question);
    }

    private String formatContext(List<SearchResult> context) {
        return context.stream()
                .map(SearchResult::getContent)
                .collect(Collectors.joining("\n\n"));
    }
}
2. Security considerations
@Configuration
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(authz -> authz
                .requestMatchers("/api/documents/upload").hasRole("ADMIN")
                .requestMatchers("/api/qa/**").permitAll()
                .anyRequest().authenticated()
            )
            .oauth2ResourceServer(OAuth2ResourceServerConfigurer::jwt);
        return http.build();
    }
}
Summary
This article has shown how to build an enterprise-grade intelligent document Q&A system with Spring AI and RAG. With sound architecture, performance optimization, and error handling, we can build an efficient and reliable document Q&A solution. The key takeaways are:
- Technology choice: Spring AI provides a unified AI integration framework
- Architecture: modular design keeps the system extensible and maintainable
- Performance: caching, indexing, and batch processing improve throughput and latency
- Operations: containerization and Kubernetes deployment keep the system stable
As AI technology continues to evolve, document Q&A systems built on Spring AI and RAG will play an increasingly important role in enterprise knowledge management.