Spring AI in Practice: Integrating Ollama Large Language Models

Latest recommended article published on 2025-03-10 23:09:45

purple.taro

Article tags: AI ollama

Original link: https://mp.weixin.qq.com/s?__biz=Mzg5MzgxMTIyOQ==&mid=2247512701&idx=1&sn=4c212decca1fd8fedec4f563bc5b081b&chksm=c02bdd0ff75c5419396efc34ec57f461b6a9a8f2fe93c7a1221e01b77fc9ce02068f9b524845&scene=21#wechat_redirect

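Introduction

Spring AI has arrived, bringing large language model support to the Java ecosystem. This article walks through a hands-on integration with Ollama, a tool for running large language models locally.

Official documentation: https://spring.io/projects/spring-ai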

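Spring AI is an application framework for AI engineering. Its goal is to apply Spring ecosystem design principles, such as portability and modular design, to the AI domain, and to promote the use of POJOs as the building blocks of AI applications.

Spring AI provides a portable API across multiple AI providers. It supports chat, text-to-image generation, and embedding models, and offers both synchronous and streaming API options to suit different scenarios. A specific model is selected through configuration parameters, which makes integration flexible and convenient.

For chat models, Spring AI supports mainstream providers such as OpenAI, Azure OpenAI, and Amazon Bedrock. For text-to-image generation, it supports models such as OpenAI's DALL-E and Stability AI.

Better still, Spring AI also works with AI models that run locally, even on machines without a GPU, for example models served through Ollama. This gives users more options.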

Quick Start

Official documentation: https://docs.spring.io/spring-ai/reference/getting-started.html

Ollama

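Model library

Ollama is a tool for getting large language models up and running locally. It downloads models automatically and works out of the box, and it supports a large library of models. Note: you should have at least 8 GB of RAM available to run 7B models, 16 GB to run 13B models, and 32 GB to run 33B models.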

Installing and running

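From the Ollama home page, download Ollama for Windows (preview), run OllamaSetup.exe, and click Install to complete the one-click installation.
Then run the following locally: ollama run gemma:2b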
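
Before wiring up Spring, it can help to confirm that the local Ollama server is reachable. The following is a minimal sketch using only the JDK's built-in HttpClient. It assumes Ollama is listening on its default address http://localhost:11434 and queries the /api/tags endpoint, which lists locally available models. The class name OllamaHealthCheck and the method listModelsRequest are hypothetical names chosen for this illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class OllamaHealthCheck {

    // Assumption: Ollama runs on its default local address.
    static final String DEFAULT_BASE_URL = "http://localhost:11434";

    // Builds a GET request for /api/tags, which lists the models pulled locally.
    static HttpRequest listModelsRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    listModelsRequest(DEFAULT_BASE_URL),
                    HttpResponse.BodyHandlers.ofString());
            // A 200 status with a JSON body suggests the server is up.
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body());
        } catch (Exception e) {
            // Connection refused usually means Ollama is not running.
            System.out.println("Ollama does not appear to be reachable: " + e);
        }
    }
}
```

If gemma:2b has been pulled successfully, it should appear in the JSON list printed by this check.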

Integrating Ollama with Spring Boot

The examples below use ollama-embeddings (turning data into vectors) and ollama-chat:

Adding the dependencies

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>spring-ai-demo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.3</version>
    </parent>


    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <!--spring web-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>org.yaml</groupId>
            <artifactId>snakeyaml</artifactId>
            <version>2.2</version>
        </dependency>

    </dependencies>


    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>0.8.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>spring-snapshots</id>
            <name>Spring Snapshots</name>
            <url>https://repo.spring.io/snapshot</url>
            <releases>
                <enabled>false</enabled>
            </releases>
        </repository>
    </repositories>
</project>

Configuration

spring:
  ai:
    ollama:
      # This is the default address, so the setting can be omitted
      base-url: http://localhost:11434
      embedding:
        model: gemma:2b
      chat:
        model: gemma:2b

Controllers

Approach 1 (embeddings):

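// Imports assume the Spring AI 0.8.0 package layout.
import java.util.List;
import java.util.Map;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class EmbeddingController {

    private final EmbeddingClient embeddingClient;

    @Autowired
    public EmbeddingController(EmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }

    // Converts the given message into an embedding vector via the configured model.
    @GetMapping("/ai/embedding")
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingClient.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}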
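
Embedding vectors are typically compared with one another, for example to rank texts by semantic similarity. As a minimal sketch of that downstream step, the following computes the cosine similarity of two vectors. It assumes the vectors have already been extracted from the response as lists of doubles. The class EmbeddingSimilarity is a hypothetical helper and not part of Spring AI.

```java
import java.util.List;

final class EmbeddingSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
    static double cosine(List<Double> a, List<Double> b) {
        if (a.isEmpty() || a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must be non-empty and of equal length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            double x = a.get(i);
            double y = b.get(i);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            // A zero vector has no direction; treat the similarity as 0.
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors standing in for real embeddings.
        List<Double> first = List.of(0.1, 0.3, 0.5);
        List<Double> second = List.of(0.2, 0.1, 0.4);
        System.out.println(cosine(first, second));
    }
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for embeddings is usually read as similar meaning.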

Approach 2 (chat):

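// Imports assume the Spring AI 0.8.0 package layout.
import java.util.Map;

import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    private final OllamaChatClient chatClient;

    @Autowired
    public ChatController(OllamaChatClient chatClient) {
        this.chatClient = chatClient;
    }

    // Synchronous call: returns the complete answer in one response.
    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", chatClient.call(message));
    }

    // Streaming call: emits partial responses as the model produces them.
    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return chatClient.stream(prompt);
    }
}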

Testing:
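
One way to exercise the endpoints defined above is a small client built on the JDK's HttpClient. This is a sketch under a few assumptions: the application runs on Spring Boot's default port 8080 (the configuration shown does not set a port), and Ollama is running with the gemma:2b model available. The class DemoEndpointClient and its endpoint method are hypothetical names.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class DemoEndpointClient {

    // Assumption: the Spring Boot application listens on the default port.
    static final String APP_BASE_URL = "http://localhost:8080";

    // Builds the request URI, URL-encoding the message query parameter.
    static URI endpoint(String path, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(APP_BASE_URL + path + "?message=" + encoded);
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        // Paths taken from the two controllers in this article.
        String[] paths = {"/ai/generate", "/ai/embedding"};
        for (String path : paths) {
            HttpRequest request = HttpRequest.newBuilder(endpoint(path, "Tell me a joke"))
                    .GET()
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(path + " -> HTTP " + response.statusCode());
                System.out.println(response.body());
            } catch (Exception e) {
                System.out.println(path + " failed: " + e);
            }
        }
    }
}
```

The /ai/generateStream endpoint can be called in the same way. The same URLs can also be opened directly in a browser.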