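This post walks through calling multiple models served by a locally deployed Ollama instance from a Spring Boot application, covering the dependencies, the configuration, and the calling code.
1. Add dependencies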
First, add the required dependencies to pom.xml:
<dependencies>
    <!-- Spring Boot Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- HTTP Client -->
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.5.13</version>
    </dependency>
    <!-- JSON Processing -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>
    <!-- Lombok (optional) -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>
2. Configure the Ollama connection
Add the Ollama settings to application.properties or application.yml:
# application.properties
ollama.base-url=http://localhost:11434
ollama.models=llama2,gemma,mistral
ollama.timeout=30000
Or in YAML format:
# application.yml
ollama:
  base-url: http://localhost:11434
  models: llama2,gemma,mistral
  timeout: 30000
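3. Create the configuration class
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

@Configuration
public class OllamaConfig {

    @Value("${ollama.base-url}")
    private String baseUrl;

    // The comma-separated property value is converted to an array by Spring
    @Value("${ollama.models}")
    private String[] models;

    // Timeout in milliseconds
    @Value("${ollama.timeout}")
    private int timeout;

    @Bean
    public CloseableHttpClient httpClient() {
        // Apply the configured timeout to both connecting and reading;
        // HttpClients.createDefault() would ignore ollama.timeout entirely
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectTimeout(timeout)
                .setSocketTimeout(timeout)
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(requestConfig)
                .build();
    }

    // Getter methods
    public String getBaseUrl() {
        return baseUrl;
    }

    public String[] getModels() {
        return models;
    }

    public int getTimeout() {
        return timeout;
    }
}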
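4. Create the request and response DTOs
// OllamaRequest.java
import lombok.Data;

@Data
public class OllamaRequest {
    private String model;
    private String prompt;
    // Ask for one complete JSON object instead of a stream of chunks
    private boolean stream = false;
    // Add other parameters as needed
}

// OllamaResponse.java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import lombok.Data;

@Data
@JsonIgnoreProperties(ignoreUnknown = true) // the API returns more fields than are mapped here
public class OllamaResponse {
    private String model;
    private String response;
    // Add more fields to match the actual response structure
}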
5. Create the service class that calls the Ollama API
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.util.EntityUtils;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.stereotype.Service;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

@Service
public class OllamaService {

    private final CloseableHttpClient httpClient;
    private final OllamaConfig ollamaConfig;
    private final ObjectMapper objectMapper;

    // A single constructor is injected automatically; @Autowired is not required
    public OllamaService(CloseableHttpClient httpClient, OllamaConfig ollamaConfig, ObjectMapper objectMapper) {
        this.httpClient = httpClient;
        this.ollamaConfig = ollamaConfig;
        this.objectMapper = objectMapper;
    }

    public String generateText(String model, String prompt) throws IOException {
        String url = ollamaConfig.getBaseUrl() + "/api/generate";
        OllamaRequest request = new OllamaRequest();
        request.setModel(model);
        request.setPrompt(prompt);

        HttpPost httpPost = new HttpPost(url);
        // ContentType.APPLICATION_JSON sets the header and encodes the body as UTF-8;
        // StringEntity(String) alone defaults to ISO-8859-1 and garbles non-ASCII prompts
        httpPost.setEntity(new StringEntity(objectMapper.writeValueAsString(request), ContentType.APPLICATION_JSON));

        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            String responseBody = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
            int status = response.getStatusLine().getStatusCode();
            if (status < 200 || status >= 300) {
                // e.g. the requested model is not installed
                throw new IOException("Ollama returned HTTP " + status + ": " + responseBody);
            }
            OllamaResponse ollamaResponse = objectMapper.readValue(responseBody, OllamaResponse.class);
            return ollamaResponse.getResponse();
        }
    }

    // Calls every configured model in turn and collects the answers by model name
    public Map<String, String> generateWithAllModels(String prompt) {
        Map<String, String> responses = new HashMap<>();
        for (String model : ollamaConfig.getModels()) {
            try {
                String response = generateText(model, prompt);
                responses.put(model, response);
            } catch (IOException e) {
                responses.put(model, "Error calling model: " + e.getMessage());
            }
        }
        return responses;
    }
}
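generateWithAllModels calls the models one after another, so the total latency is the sum of all calls. A minimal sketch of issuing the calls concurrently follows. ParallelModelCaller and its caller parameter are hypothetical names, not part of the service above; caller stands in for a wrapper around ollamaService.generateText that rethrows IOException as an unchecked exception. Whether this is actually faster depends on how the local Ollama server handles concurrent requests.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

// Hypothetical helper: fans one prompt out to several models concurrently.
class ParallelModelCaller {

    // caller is an assumption: (model, prompt) -> generated text, throwing unchecked on failure.
    static Map<String, String> callAll(List<String> models, String prompt,
                                       BiFunction<String, String, String> caller) {
        Map<String, String> results = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> futures = models.stream()
                .map(model -> CompletableFuture
                        // Runs on the common ForkJoinPool; a dedicated executor may suit blocking I/O better
                        .supplyAsync(() -> caller.apply(model, prompt))
                        // Record a failure for this model instead of failing the whole batch
                        .exceptionally(ex -> "Error calling model: " + ex.getMessage())
                        // ConcurrentHashMap rejects null values, so map null to an empty string
                        .thenAccept(text -> results.put(model, text == null ? "" : text)))
                .collect(Collectors.toList());
        // Block until every model has answered or failed
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return results;
    }
}
```

The result has the same shape as the map returned by generateWithAllModels, so the controller would not need to change.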
6. Create the controller
import org.springframework.web.bind.annotation.*;
import java.util.Map;

@RestController
@RequestMapping("/api/ollama")
public class OllamaController {

    private final OllamaService ollamaService;

    public OllamaController(OllamaService ollamaService) {
        this.ollamaService = ollamaService;
    }

    // Generate text with one named model
    @PostMapping("/generate")
    public String generateText(@RequestParam String model, @RequestParam String prompt) {
        try {
            return ollamaService.generateText(model, prompt);
        } catch (Exception e) {
            return "Error: " + e.getMessage();
        }
    }

    // Generate text with every model listed in ollama.models
    @PostMapping("/generate-all")
    public Map<String, String> generateWithAllModels(@RequestParam String prompt) {
        return ollamaService.generateWithAllModels(prompt);
    }
}
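The /generate endpoint forwards whatever model name the caller supplies. If only the models listed in ollama.models should be reachable, a small allow-list check could run before the service call. The sketch below is one possible shape; ModelAllowList is a hypothetical class, not something the original code defines, and it would be built from ollamaConfig.getModels().

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: restricts requests to the configured model names.
class ModelAllowList {

    private final Set<String> allowed;

    ModelAllowList(String[] configuredModels) {
        // Tolerate stray whitespace and empty entries from "llama2, gemma,,mistral"
        this.allowed = Arrays.stream(configuredModels)
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    // True only for names that appear in the configuration
    boolean isAllowed(String model) {
        return model != null && allowed.contains(model.trim());
    }
}
```

In the controller, a request for a model outside the list could then be rejected with an error message before any call reaches Ollama.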
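7. Usage examples
Once the Spring Boot application is running, you can call it as follows (the prompt must be URL-encoded):
(1) Call a single model:
POST /api/ollama/generate?model=llama2&prompt=Tell%20me%20about%20AI
(2) Call all configured models:
POST /api/ollama/generate-all?prompt=Explain%20quantum%20computing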
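Because the endpoints take the prompt as a query parameter, a client has to escape spaces and characters such as & or = before sending it. A minimal sketch of building the single-model request path in Java follows. OllamaUrlBuilder is a hypothetical helper name, and URLEncoder.encode(String, Charset) assumes Java 10 or later. URLEncoder writes spaces as +, which servlet containers decode to a space in query strings.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the request path for the /generate endpoint.
class OllamaUrlBuilder {

    static String generateUrl(String model, String prompt) {
        // Encode each parameter value separately so '&' and '=' inside the prompt stay literal
        return "/api/ollama/generate"
                + "?model=" + URLEncoder.encode(model, StandardCharsets.UTF_8)
                + "&prompt=" + URLEncoder.encode(prompt, StandardCharsets.UTF_8);
    }
}
```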
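8. Advanced extensions
If you need more than the basics, consider:
(1) Adding a model-switching strategy
(2) Implementing streaming responses
(3) Adding a model health check
(4) Monitoring model performance
For example, a model health check can be added to OllamaService: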
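// Extra imports: org.apache.http.client.methods.HttpGet,
// com.fasterxml.jackson.databind.JsonNode, java.nio.charset.StandardCharsets
public boolean isModelAvailable(String model) throws IOException {
    String url = ollamaConfig.getBaseUrl() + "/api/tags";
    HttpGet httpGet = new HttpGet(url);
    try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
        String responseBody = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
        // /api/tags returns {"models":[{"name":"llama2:latest", ...}]}; compare names exactly
        // (or ignoring the tag) rather than substring-matching the raw body
        for (JsonNode entry : objectMapper.readTree(responseBody).path("models")) {
            String name = entry.path("name").asText();
            if (name.equals(model) || name.startsWith(model + ":")) {
                return true;
            }
        }
        return false;
    }
}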
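The model-switching strategy from item (1) can be as simple as trying the models in a preferred order and taking the first usable answer. The sketch below shows that idea. FallbackModelCaller and its caller parameter are hypothetical names; caller stands in for a wrapper around generateText that throws an unchecked exception on failure, and isBlank assumes Java 11 or later.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

// Hypothetical helper: tries models in order until one produces a usable answer.
class FallbackModelCaller {

    // caller is an assumption: (model, prompt) -> generated text, throwing unchecked on failure.
    static Optional<String> firstSuccessful(List<String> models, String prompt,
                                            BiFunction<String, String, String> caller) {
        for (String model : models) {
            try {
                String text = caller.apply(model, prompt);
                if (text != null && !text.isBlank()) {
                    return Optional.of(text);
                }
                // An empty answer counts as a failure: move on to the next model
            } catch (RuntimeException e) {
                // This model failed: move on to the next one
            }
        }
        // Every model failed or answered with nothing
        return Optional.empty();
    }
}
```

Combining this with isModelAvailable would let the service skip models that are not installed before attempting a call.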
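Notes
1. Make sure the local Ollama service is running and listening on the default port (11434)
2. Adjust the request and response structures to match your Ollama API version
3. Add proper error handling and logging for production use
4. Consider adding request timeouts and a retry mechanism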
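For the retry mechanism mentioned in note 4, one common approach is to retry a failed call a few times with a growing delay. The sketch below illustrates it with plain Java. RetryingCaller, maxAttempts, and initialDelayMs are hypothetical names chosen for illustration; in practice the action would wrap a call such as ollamaService.generateText(model, prompt).

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries an action with exponential backoff.
class RetryingCaller {

    static <T> T withRetry(Callable<T> action, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Wait before the next attempt, doubling the delay each time
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        // All attempts failed: surface the last error to the caller
        throw last;
    }
}
```

Retrying only makes sense for transient failures such as a connection refused while Ollama is starting; a request for a model that is not installed will fail the same way every time.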
The code above provides a complete approach to integrating multiple Ollama models with Spring Boot; adjust and extend it to suit your own requirements.