A First Look at Spring AI: Usable, but Only Just

Article link: https://blog.youkuaiyun.com/weixin_42277430/article/details/138345751

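Summary: Spring AI mainly wraps cloud providers' APIs and offers limited support for deploying local models; only Ollama and ONNX are supported. The author hoped Spring AI could deploy local models as easily as Python's transformers, but for now the model has to be converted first and the setup depends on ONNX Runtime.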


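The conclusion first: Spring AI mostly offers a unified wrapper for calling cloud providers' APIs. If you were hoping Spring would let you deploy and call local models, it offers only two options: a wrapper around Ollama commands, and Transformers (ONNX). The ONNX option is itself a wrapper around the ONNX Java Runtime.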

First, create a project the way the official Spring documentation describes.
See https://spring.io/projects/spring-ai

brew tap spring-cli-projects/spring-cli
brew install spring-cli
spring boot new --from ai --name myai

Once the project is set up, the core code looks like this:

package org.springframework.ai.openai.samples.helloworld.simple;

import org.springframework.ai.chat.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.Map;

@RestController
public class SimpleAiController {

	private final ChatClient chatClient;

	@Autowired
	public SimpleAiController(ChatClient chatClient) {
		this.chatClient = chatClient;
	}

	@GetMapping("/ai/simple")
	public Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
		return Map.of("generation", chatClient.call(message));
	}
}

ChatClient has an implementation for each AI cloud provider. A project created this way pulls in the OpenAI service by default. Open the OpenAiChatClient source and follow the main call chain: OpenAiChatClient::call -> OpenAiChatClient::callWithFunctionSupport(request) -> OpenAiChatClient::doChatCompletion -> AbstractFunctionCallSupport::callWithFunctionSupport -> OpenAiChatClient::doChatCompletion -> OpenAiApi::chatCompletionEntity

    public ResponseEntity<ChatCompletion> chatCompletionEntity(ChatCompletionRequest chatRequest) {
        Assert.notNull(chatRequest, "The request body can not be null.");
        Assert.isTrue(!chatRequest.stream(), "Request must set the steam property to false.");
        return ((RestClient.RequestBodySpec)this.restClient.post().uri("/v1/chat/completions", new Object[0])).body(chatRequest).retrieve().toEntity(ChatCompletion.class);
    }

As you can see, OpenAiChatClient is really just a bridge to the OpenAI HTTP API.
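To make the "bridge" point concrete, here is a minimal sketch of what such a client boils down to, using only the JDK's java.net.http package. It is not Spring AI's implementation. The class name ChatBridgeSketch, its helper methods, the placeholder API key, and the model name are all assumptions made for illustration. Only the /v1/chat/completions path and the stream=false requirement come from the source shown above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical illustration class; not part of Spring AI.
public class ChatBridgeSketch {

    // Escape characters that would break a JSON string literal.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c == '\r') sb.append("\\r");
            else if (c == '\t') sb.append("\\t");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    // Build a non-streaming chat completion body with a single user message.
    public static String chatBody(String model, String message) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
                + "\"stream\":false,"
                + "\"messages\":[{\"role\":\"user\",\"content\":\""
                + jsonEscape(message) + "\"}]}";
    }

    // Build (but do not send) the POST request that the bridge would issue.
    public static HttpRequest buildRequest(String baseUrl, String apiKey,
                                           String model, String message) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/chat/completions"))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(chatBody(model, message)))
                .build();
    }

    public static void main(String[] args) {
        // The key and model name are placeholders, not real credentials.
        HttpRequest request = buildRequest(
                "https://api.openai.com", "sk-placeholder", "gpt-3.5-turbo", "Tell me a joke");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(chatBody("gpt-3.5-turbo", "Tell me a joke"));
    }
}
```

The sketch only builds the request. Sending it would take one more call to java.net.http.HttpClient plus JSON parsing of the response, which is roughly the work a client like this saves you.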

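My hope for Spring AI was that, like Python's transformers library, it would make deploying and calling local models very convenient: drop a model from Hugging Face, or one trained in-house, into the resources directory and call it the way you would with transformers, rather than wrapping cloud providers' APIs or Ollama. After all, who can't call a cloud provider's API?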

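So after following the official documentation this far, I was a little disappointed. Skimming the docs, I found two ways to run a local large model: Ollama and Transformers (ONNX).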
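For the Ollama route, the model runs in a separate local Ollama server and the application talks to it over HTTP. The sketch below builds such a request with the JDK alone. It assumes Ollama's default local address http://localhost:11434 and its /api/generate endpoint. The class name OllamaRequestSketch and the model name llama2 are hypothetical, and this is not the code Spring AI uses.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical illustration class; not part of Spring AI or Ollama.
public class OllamaRequestSketch {

    // Assumed default address of a locally running Ollama server.
    public static final String OLLAMA_BASE_URL = "http://localhost:11434";

    // Turn a Java string into a quoted JSON string literal.
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c == '\r') sb.append("\\r");
            else if (c == '\t') sb.append("\\t");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Body for a single, non-streaming generation request.
    public static String generateBody(String model, String prompt) {
        return "{\"model\":" + quote(model)
                + ",\"prompt\":" + quote(prompt)
                + ",\"stream\":false}";
    }

    // Build (but do not send) the request to the local server.
    public static HttpRequest buildRequest(String model, String prompt) {
        return HttpRequest.newBuilder()
                .uri(URI.create(OLLAMA_BASE_URL + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(generateBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("llama2", "Tell me a joke");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(generateBody("llama2", "Tell me a joke"));
    }
}
```

The shape is the same as for a cloud provider: the application only issues HTTP calls, and the model has to be one that the Ollama server can already serve.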

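In other words, to run a local large model in Spring AI, you either use one from the Ollama library, which clearly does not cover every scenario, or you convert the model to ONNX format. Spring provides a sample for the conversion: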

python3 -m venv venv
source ./venv/bin/activate
(venv) pip install --upgrade pip
(venv) pip install optimum onnx onnxruntime
(venv) optimum-cli export onnx --generative sentence-transformers/all-MiniLM-L6-v2 onnx-output-folder

Having to convert a model just to deploy it locally is, frankly, a hassle.

For comparison, here is how Python calls an AI model.
The model is Helsinki-NLP/opus-mt-zh-en, which translates Chinese to English.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("models/opus-mt-zh-en")
model = AutoModelForSeq2SeqLM.from_pretrained("models/opus-mt-zh-en")
pl = pipeline("translation", model=model, tokenizer=tokenizer)

def translate(text: str):
    return pl(text)[0]['translation_text']

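As you can see, calling a locally deployed model is that easy. In terms of how convenient models are to use, Spring currently falls short of the Python ecosystem, in my opinion.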

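Enough talk. To use Spring AI, the model still has to be converted. The model to convert is the one just tried in Python, Helsinki-NLP/opus-mt-zh-en.
See the Transformers ONNX Embeddings documentation.
The ONNX export command needs a small change: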

optimum-cli export onnx --task text-generation --model opus-mt-zh-en opus-mt-zh-en-onnx/

After converting to ONNX format, follow the Spring AI documentation and call the model through Transformers (ONNX).