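Spring AI and RAG in Practice: Building an Enterprise-Grade Intelligent Document Q&A System
Introduction
As AI technology advances rapidly, enterprises face the challenge of managing large volumes of documents and retrieving knowledge from them. Traditional keyword-based search can no longer meet users' expectations for precise, intelligent question answering. Spring AI combined with RAG (Retrieval-Augmented Generation) offers a practical way to build an intelligent document Q&A system. This article walks through how to build such a system on the Spring technology stack.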
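Technical Architecture Overview
Core Components
- Spring Boot 3.x: the base framework, providing rapid application development
- Spring AI: the Spring project's official AI integration framework, which simplifies AI application development
- Vector database: Redis or Milvus, used to store document vectors
- Embedding model: OpenAI, or a locally deployed model served through Ollama
- RAG architecture: retrieval-augmented generation, which grounds answers in retrieved document content to improve accuracy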
System Architecture Design

User request → Spring Web layer → semantic retrieval → vector database
                                                              ↓
                    Answer generation ← LLM ← relevant document chunks
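
The retrieval step in this flow can be pictured as a nearest-neighbor search over embedding vectors. The sketch below is a minimal in-memory stand-in for the vector database, assuming cosine similarity as the ranking measure. The class `InMemoryVectorSearch` and its `Entry` record are hypothetical names used only for illustration; a real deployment would delegate this work to Redis or Milvus.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class InMemoryVectorSearch {

    /** A stored chunk: its text plus a precomputed embedding (hypothetical type). */
    public record Entry(String text, double[] vector) {}

    private final List<Entry> entries = new ArrayList<>();

    public void add(String text, double[] vector) {
        entries.add(new Entry(text, vector));
    }

    /** Cosine similarity of two equal-length vectors; 0.0 if either has zero norm. */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the texts of the topK entries most similar to the query vector. */
    public List<String> search(double[] query, int topK) {
        return entries.stream()
                // Negate the score so that the most similar entries sort first.
                .sorted(Comparator.comparingDouble((Entry e) -> -cosine(query, e.vector())))
                .limit(topK)
                .map(Entry::text)
                .toList();
    }
}
```

A linear scan like this is only reasonable for small collections; dedicated vector stores use approximate indexes to keep search fast as the corpus grows.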
Environment Setup and Dependency Configuration
Maven Dependencies

<!-- Spring AI 0.8.1 is a pre-1.0 release. If Maven cannot resolve it from your
     configured repositories, the Spring Milestones repository may need to be added. -->
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-redis-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
</dependencies>

Configuration File

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
    vectorstore:
      redis:
        uri: redis://localhost:6379
        index: document_vectors

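Document Processing and Vectorization
Document Loader Implementation

import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class DocumentLoader {
    @Autowired
    private VectorStore vectorStore;

    // Spring AI has no static TextSplitter.split(...). TokenTextSplitter is one of its
    // splitters; it sizes chunks in tokens (library defaults here), not in characters,
    // so tune it if chunks of roughly 500 characters are required.
    private final TokenTextSplitter splitter = new TokenTextSplitter();

    public void loadDocuments(List<Document> documents) {
        // Split every document into chunks; the vector store embeds them on add().
        List<Document> chunks = splitter.apply(documents);
        vectorStore.add(chunks);
    }
}
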
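
To make the chunking idea concrete, here is a minimal character-based splitter written in plain Java. It is a sketch under stated assumptions, not Spring AI's splitter: `CharChunker`, the `overlap` parameter, and the fixed-size strategy are all hypothetical choices made for illustration. Overlap repeats the tail of each chunk at the start of the next, so a sentence cut at a boundary still appears whole in one of the two chunks.

```java
import java.util.ArrayList;
import java.util.List;

public class CharChunker {

    /**
     * Splits text into chunks of at most chunkSize characters, where each chunk
     * starts with the last `overlap` characters of the previous one.
     */
    public static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the final chunk has been emitted
            }
        }
        return chunks;
    }
}
```

For example, `CharChunker.split(content, 500, 50)` would produce chunks of about 500 characters with a 50-character overlap. The method counts UTF-16 code units, so it can split a surrogate pair; production code would more likely split on sentence or token boundaries.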
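Embedding Service

import java.util.List;
import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {
    // In Spring AI 0.8.1 (the version used in the Maven dependencies) this interface
    // is named EmbeddingClient; later releases renamed it to EmbeddingModel.
    @Autowired
    private EmbeddingClient embeddingClient;

    // Embeds a single text.
    public List<Double> getEmbedding(String text) {
        return embeddingClient.embed(text);
    }

    // Embeds several texts in one call.
    public List<List<Double>> batchEmbedding(List<String> texts) {
        return embeddingClient.embed(texts);
    }
}
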
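RAG Retrieval Implementation
Semantic Search Service

import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class SemanticSearchService {
    @Autowired
    private VectorStore vectorStore;

    public List<Document> searchRelevantDocuments(String query, int topK) {
        // The vector store embeds the query text itself, so no explicit embedding
        // call is needed; SearchRequest has no withQueryEmbedding(...) method.
        return vectorStore.similaritySearch(
                SearchRequest.query(query).withTopK(topK));
    }
}
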
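Prompt Engineering

import org.springframework.stereotype.Component;

@Component
public class PromptTemplateManager {
    private static final String RAG_PROMPT_TEMPLATE = """
            Answer the user's question based on the context below.
            Context:
            {context}
            Question: {question}
            Requirements:
            1. Use only the provided context to answer the question
            2. If the context is not sufficient to answer, say so explicitly
            3. Keep the answer accurate and concise
            """;

    // Fills the two placeholders of the template with the retrieved context and the question.
    public String buildRagPrompt(String question, String context) {
        return RAG_PROMPT_TEMPLATE
                .replace("{context}", context)
                .replace("{question}", question);
    }
}
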
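
One subtlety of chained `replace` calls: the context is substituted first, so if a retrieved document happens to contain the literal text `{question}`, the second call rewrites it as well. The sketch below shows one possible alternative, a single-pass renderer that never rescans inserted values. `SinglePassTemplate` is a hypothetical helper written with the standard `java.util.regex` API, not part of Spring AI.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SinglePassTemplate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces each {name} in the template once; inserted values are never rescanned. */
    public static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            // Unknown placeholders are left exactly as written.
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With this helper, `render("C: {context} Q: {question}", Map.of("context", "see {question}", "question", "why?"))` yields `C: see {question} Q: why?`, whereas the chained version would also rewrite the placeholder text inside the context.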
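Intelligent Q&A Service Layer
Core Q&A Service

import java.util.List;
import java.util.stream.Collectors;
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.document.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class IntelligentQAService {
    @Autowired
    private SemanticSearchService searchService;
    @Autowired
    private ChatClient chatClient;
    @Autowired
    private PromptTemplateManager promptManager;

    public String answerQuestion(String question) {
        // 1. Retrieve relevant documents
        List<Document> relevantDocs = searchService.searchRelevantDocuments(question, 5);
        if (relevantDocs.isEmpty()) {
            return "Sorry, no relevant documents were found to answer your question.";
        }
        // 2. Build the context
        String context = buildContextFromDocuments(relevantDocs);
        // 3. Build the prompt
        String prompt = promptManager.buildRagPrompt(question, context);
        // 4. Call the model. In Spring AI 0.8.1, call(String) returns a plain String;
        //    wrapping the text in a Prompt is what yields a ChatResponse.
        ChatResponse response = chatClient.call(new Prompt(prompt));
        return response.getResult().getOutput().getContent();
    }

    // Joins the retrieved chunks, separated by blank lines.
    private String buildContextFromDocuments(List<Document> documents) {
        return documents.stream()
                .map(Document::getContent)
                .collect(Collectors.joining("\n\n"));
    }
}
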
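
The context built above grows with the size of the retrieved chunks, while the model's input window is limited. As one possible refinement, the sketch below caps the context at a character budget. `ContextBuilder` and `maxChars` are hypothetical names, and a character count is only a rough stand-in for a token count.

```java
import java.util.List;

public class ContextBuilder {

    /**
     * Joins chunks with blank lines, in order, stopping before the first chunk
     * that would push the result past maxChars.
     */
    public static String build(List<String> chunks, int maxChars) {
        StringBuilder context = new StringBuilder();
        for (String chunk : chunks) {
            int separator = context.isEmpty() ? 0 : 2; // length of "\n\n"
            if (context.length() + separator + chunk.length() > maxChars) {
                break;
            }
            if (separator > 0) {
                context.append("\n\n");
            }
            context.append(chunk);
        }
        return context.toString();
    }
}
```

Because similarity search returns the best matches first, stopping at the first chunk that does not fit keeps the most relevant material and drops the tail.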
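Web Controller Layer
RESTful API Design

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/qa")
public class QAController {
    @Autowired
    private IntelligentQAService qaService;

    @PostMapping("/ask")
    public ResponseEntity<QAResponse> askQuestion(@RequestBody QARequest request) {
        try {
            String answer = qaService.answerQuestion(request.getQuestion());
            return ResponseEntity.ok(new QAResponse(answer, "success"));
        } catch (Exception e) {
            // Hide internal details from the caller; log the exception in real code.
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body(new QAResponse(null, "The system is busy, please try again later"));
        }
    }

    @PostMapping("/documents")
    public ResponseEntity<String> uploadDocuments(@RequestBody List<DocumentUploadRequest> requests) {
        // Document upload handling goes here
        return ResponseEntity.ok("Documents uploaded successfully");
    }
}
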
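
The controller refers to `QARequest` and `QAResponse` without showing them. A minimal sketch of what they might look like follows, inferred only from how they are used (`getQuestion()`, `getAnswer()`, and a two-argument `QAResponse` constructor). The `QaDtos` holder class, the `status` field name, and the setters are assumptions; in a real project each type would more likely be a top-level class in its own file.

```java
public final class QaDtos {

    private QaDtos() {}

    /** Request body for POST /api/qa/ask. */
    public static class QARequest {
        private String question;

        public QARequest() {}  // no-arg constructor for JSON deserializers such as Jackson

        public QARequest(String question) {
            this.question = question;
        }

        public String getQuestion() { return question; }
        public void setQuestion(String question) { this.question = question; }
    }

    /** Response body: the answer (null on failure) plus a status or error message. */
    public static class QAResponse {
        private String answer;
        private String status;

        public QAResponse() {}

        public QAResponse(String answer, String status) {
            this.answer = answer;
            this.status = status;
        }

        public String getAnswer() { return answer; }
        public void setAnswer(String answer) { this.answer = answer; }
        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }
}
```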
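Performance Optimization and Monitoring
Caching Strategy

import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.TimeUnit;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        // Entries expire 10 minutes after being written; at most 1000 entries are kept.
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .maximumSize(1000));
        return cacheManager;
    }
}

// CachedQAService.java (imports: Cacheable, Autowired, Service)
@Service
public class CachedQAService {
    @Autowired
    private IntelligentQAService qaService;

    // Identical question strings are answered from the cache instead of calling the model again.
    @Cacheable(value = "qaCache", key = "#question")
    public String getCachedAnswer(String question) {
        return qaService.answerQuestion(question);
    }
}
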
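
Keying the cache on the raw question string means that "What is Spring AI?" and " what is  spring ai? " are cached separately. One possible way to raise the hit rate is to normalize the question before using it as a key. The helper below is a hypothetical sketch; whether lower-casing is appropriate depends on the documents and languages involved.

```java
import java.util.Locale;

public class QuestionKeys {

    /** Trims the question, collapses whitespace runs to one space, and lower-cases it. */
    public static String normalize(String question) {
        return question.trim()
                .replaceAll("\\s+", " ")
                .toLowerCase(Locale.ROOT);
    }
}
```

Assuming the class lived in a package such as `com.example` (a placeholder), the cache key expression could call it through SpEL's type operator, for instance `key = "T(com.example.QuestionKeys).normalize(#question)"`.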
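Monitoring Metrics

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.concurrent.TimeUnit;
import org.springframework.stereotype.Component;

@Component
public class QAMetrics {
    private final MeterRegistry meterRegistry;
    private final Counter successCounter;
    private final Counter failureCounter;
    private final Timer responseTimer;

    public QAMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        this.successCounter = meterRegistry.counter("qa.requests.success");
        this.failureCounter = meterRegistry.counter("qa.requests.failure");
        this.responseTimer = meterRegistry.timer("qa.response.time");
    }

    // Records one successful request and its duration in milliseconds.
    public void recordSuccess(long duration) {
        successCounter.increment();
        responseTimer.record(duration, TimeUnit.MILLISECONDS);
    }

    // Records one failed request.
    public void recordFailure() {
        failureCounter.increment();
    }
}
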
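Exception Handling and Fault Tolerance
Global Exception Handling

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;

@ControllerAdvice
public class GlobalExceptionHandler {
    // Failures of the AI backend map to 503 Service Unavailable.
    @ExceptionHandler(AIServiceException.class)
    public ResponseEntity<ErrorResponse> handleAIException(AIServiceException ex) {
        ErrorResponse error = new ErrorResponse("AI service error", ex.getMessage());
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE).body(error);
    }

    // Rate limiting maps to 429 Too Many Requests.
    @ExceptionHandler(RateLimitException.class)
    public ResponseEntity<ErrorResponse> handleRateLimitException(RateLimitException ex) {
        ErrorResponse error = new ErrorResponse("Rate limit exceeded", "Please try again later");
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).body(error);
    }
}
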
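
`AIServiceException`, `RateLimitException`, and `ErrorResponse` are application types that the handler uses but the article does not define. A minimal sketch of how they could be declared is shown below; the `QaErrors` holder class and the field names `error` and `message` are assumptions made for illustration, and each type would normally live in its own file.

```java
public final class QaErrors {

    private QaErrors() {}

    /** Thrown when the call to the AI backend fails. */
    public static class AIServiceException extends RuntimeException {
        public AIServiceException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Thrown when the caller or the upstream API exceeds a rate limit. */
    public static class RateLimitException extends RuntimeException {
        public RateLimitException(String message) {
            super(message);
        }
    }

    /** Error payload returned to the client: a short label plus a detail message. */
    public record ErrorResponse(String error, String message) {}
}
```

Making the exceptions unchecked lets service code throw them without changing method signatures, which also suits retry and `@ExceptionHandler` mechanisms that match on exception type.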
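Retry Mechanism

import org.springframework.ai.chat.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.annotation.EnableRetry;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Service;

@Configuration
@EnableRetry
public class RetryConfig {
    @Bean
    public RetryTemplate retryTemplate() {
        // Up to 3 attempts, 1 second apart, and only for AIServiceException.
        return RetryTemplate.builder()
                .maxAttempts(3)
                .fixedBackoff(1000)
                .retryOn(AIServiceException.class)
                .build();
    }
}

// RetryableAIService.java
@Service
public class RetryableAIService {
    @Autowired
    private RetryTemplate retryTemplate;
    @Autowired
    private ChatClient chatClient;

    public String callWithRetry(String prompt) {
        // call(String) already returns the generated text. Note that a retry happens only
        // if the failure surfaces as AIServiceException, so client errors must be wrapped.
        return retryTemplate.execute(context -> chatClient.call(prompt));
    }
}
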
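
The configuration above waits a fixed second between attempts. When a remote model API is overloaded, a growing delay is often gentler on it. The plain-Java sketch below illustrates exponential backoff, assuming the delay doubles after each failure; `BackoffRetry` is a hypothetical helper and is not how `RetryTemplate` is implemented.

```java
import java.util.function.Supplier;

public class BackoffRetry {

    /**
     * Runs the task up to maxAttempts times, sleeping initialDelayMs after the first
     * failure and doubling the delay after each later one. Rethrows the last failure.
     */
    public static <T> T call(Supplier<T> task, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting to retry", ie);
                    }
                    delay *= 2; // double the wait before the next attempt
                }
            }
        }
        throw last;
    }
}
```

A call such as `BackoffRetry.call(() -> chatClient.call(prompt), 3, 1000)` would then wait 1 second and 2 seconds between its three attempts. Real systems usually also cap the delay and add random jitter.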
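Security Considerations
API Security Configuration

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
@EnableWebSecurity
public class SecurityConfig {
    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(authz -> authz
                // Protect the whole Q&A API, so document upload is not left open
                .requestMatchers("/api/qa/**").authenticated()
                .anyRequest().permitAll())
            // Lambda DSL: the no-argument csrf() and jwt() forms are deprecated
            // in Spring Security 6.1 and later.
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(Customizer.withDefaults()))
            .csrf(csrf -> csrf.disable());
        return http.build();
    }
}
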
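Input Validation

import jakarta.validation.ValidationException;
import java.util.Arrays;
import java.util.Locale;
import org.springframework.stereotype.Component;

@Component
public class InputValidator {
    public void validateQuestion(String question) {
        if (question == null || question.trim().isEmpty()) {
            throw new ValidationException("The question must not be empty");
        }
        if (question.length() > 1000) {
            throw new ValidationException("The question exceeds the length limit");
        }
        // Reject obvious script-injection markers
        if (containsMaliciousContent(question)) {
            throw new SecurityException("Malicious input detected");
        }
    }

    private boolean containsMaliciousContent(String input) {
        // A simple denylist, compared case-insensitively; it is not a complete defense.
        String[] patterns = {"<script>", "javascript:", "onerror="};
        String lower = input.toLowerCase(Locale.ROOT);
        return Arrays.stream(patterns).anyMatch(lower::contains);
    }
}
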
Deployment and Operations
Docker Containerization
FROM openjdk:17-jdk-slim
WORKDIR /app
COPY target/qa-system.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
Kubernetes Deployment Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: qa-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: qa-system
  template:
    metadata:
      labels:
        app: qa-system
    spec:
      containers:
        - name: qa-app
          image: qa-system:latest
          ports:
            - containerPort: 8080
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "prod"
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: qa-service
spec:
  selector:
    app: qa-system
  ports:
    - port: 80
      targetPort: 8080

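Testing Strategy
Unit Tests

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.ArgumentMatchers.*;
import static org.mockito.Mockito.when;

import java.util.List;
import org.junit.jupiter.api.Test;
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.Generation;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.document.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.mock.mockito.MockBean;

@SpringBootTest
public class IntelligentQAServiceTest {
    @MockBean
    private SemanticSearchService searchService;
    @MockBean
    private ChatClient chatClient;
    @Autowired
    private IntelligentQAService qaService;

    @Test
    void testAnswerQuestionWithRelevantDocs() {
        // Mock document retrieval
        when(searchService.searchRelevantDocuments(anyString(), anyInt()))
                .thenReturn(List.of(new Document("Test document content")));
        // Mock the AI response: a ChatResponse wraps a list of Generation objects
        when(chatClient.call(any(Prompt.class)))
                .thenReturn(new ChatResponse(List.of(new Generation("Test answer"))));

        String answer = qaService.answerQuestion("Test question");
        assertEquals("Test answer", answer);
    }
}
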
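Integration Tests

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.SpringBootTest.WebEnvironment;
import org.springframework.boot.test.web.server.LocalServerPort;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

// As written, this test exercises the real endpoint: it needs a reachable vector store
// and model, and a valid JWT if the security configuration is active.
@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
public class QAControllerIntegrationTest {
    @LocalServerPort
    private int port;

    @Test
    void testAskEndpoint() {
        RestTemplate restTemplate = new RestTemplate();
        QARequest request = new QARequest("What is Spring AI?");
        ResponseEntity<QAResponse> response = restTemplate.postForEntity(
                "http://localhost:" + port + "/api/qa/ask",
                request,
                QAResponse.class
        );
        assertEquals(HttpStatus.OK, response.getStatusCode());
        assertNotNull(response.getBody().getAnswer());
    }
}
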
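Summary and Outlook
This article described how to build an enterprise-grade intelligent document Q&A system with Spring AI and RAG. Combining the convenience of Spring Boot with the capabilities of large language models makes it possible to stand up an accurate and efficient Q&A system quickly.
Key Advantages
- High accuracy: RAG grounds answers in the content of real documents
- Good scalability: the stateless service can be scaled horizontally, as in the Kubernetes deployment above
- Easy maintenance: the Spring ecosystem provides a complete toolchain
- Controllable cost: multiple AI models and vector databases are supported
Future Improvements
- Support multimodal document processing (images, tables, and so on)
- Implement real-time document updates and incremental indexing
- Add a learning mechanism based on user feedback
- Improve multilingual support
With the practices described here, developers can quickly grasp the core concepts and implementation techniques of Spring AI and RAG, and use them to support an enterprise's adoption of intelligent applications.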




