LangChain4j (7): Output Formatting

Original article. First published 2025-04-09 15:49:36; last modified 2025-04-09 15:51:10. License: CC 4.0 BY-SA.

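Many large language models can return their output in a structured format. The usual choice is JSON, because it is easy to convert into objects.

LangChain4j integrates structured-output support for some LLMs. According to the official documentation at the time of writing, this covers only Azure OpenAI, OpenAI, Google AI Gemini, and Ollama.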

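LangChain4j implements structured JSON output by setting a JSON Schema. The schema classes involved are:

JsonObjectSchema: for object types, rendered as {}
JsonStringSchema: for String and char/Character types
JsonIntegerSchema: for int, Integer, long, Long, BigInteger, and similar types
JsonNumberSchema: for float, Float, double, Double, BigDecimal, and similar types
JsonBooleanSchema: for boolean and Boolean
JsonEnumSchema: for enum types
JsonArraySchema: for arrays and collections, rendered as []
JsonReferenceSchema: for recursive structures, for example a Person class that has a List<Person> property
JsonAnyOfSchema: for polymorphic types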

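However, models differ in how much of JSON Schema they support. For example, OpenAI only accepts a JsonObjectSchema as the rootElement, whereas Gemini also accepts JsonArraySchema and JsonEnumSchema. When integrating a particular model, check that model's official documentation for its requirements on the response format parameter.

This article only tests Zhipu's GLM-4-Flash, the DeepSeek API accessed through the OpenAI integration, and a privately deployed DeepSeek called through Ollama.

First, my personal conclusion from the tests: models differ widely in what they require of the output-format parameter, so using JSON Schema directly is relatively complicated. Simply asking the model in the prompt to return data in a given JSON format is often easier.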
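As a rough illustration of the prompt-based approach, the sketch below assembles a prompt that embeds a syntactically valid JSON sample. The class and method names (`JsonPrompt`, `build`) are hypothetical helpers invented for this example and are not part of LangChain4j. The wording of the instruction is also only an assumption about what tends to work.

```java
import java.util.List;
import java.util.stream.Collectors;

public class JsonPrompt {

    // Hypothetical helper: appends a format instruction with a valid JSON sample
    // (quoted keys, placeholder values) to the task description.
    static String build(String task, List<String> fields) {
        String sample = fields.stream()
                .map(f -> "\"" + f + "\": \"...\"")
                .collect(Collectors.joining(", ", "[{", "}]"));
        return task + " Return only JSON, with no extra text, in this format: " + sample;
    }

    public static void main(String[] args) {
        // The resulting string can be passed to UserMessage.from(...)
        System.out.println(build(
                "Generate 2 random Java basics interview questions.",
                List.of("id", "title", "answer")));
    }
}
```

Quoting the keys in the sample keeps the example itself valid JSON, which gives the model less room to copy an invalid shape.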

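Testing output formatting with GLM-4-Flash

We ask the model for Java interview questions and format the output as JSON, where each item contains the question's id, title, and answer.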

package com.renr.langchain4jnew.app;

import com.renr.langchain4jnew.constant.CommonConstants;
import dev.langchain4j.community.model.zhipu.ZhipuAiChatModel;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.chat.request.json.*;
import dev.langchain4j.model.chat.response.ChatResponse;

import java.time.Duration;

/**
 * @Title: Return formatted output
 * @Author 老任与码
 * @Date 2025-04-09 11:05
 */
public class App11 {

    public static void main(String[] args) {
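        // GLM-4-Flash does not support setting the response format to JSON
        ZhipuAiChatModel chatModel = ZhipuAiChatModel.builder()
                // API key
                .apiKey(CommonConstants.API_KEY)
                // Sampling temperature: higher values make the output more random
                .temperature(0.9)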
                .model("GLM-4-Flash")
                .maxRetries(3)
                .callTimeout(Duration.ofSeconds(60))
                .connectTimeout(Duration.ofSeconds(60))
                .writeTimeout(Duration.ofSeconds(60))
                .readTimeout(Duration.ofSeconds(60))
                .build();

        // Describe each element
        JsonSchemaElement idSchema = JsonIntegerSchema.builder()
                .description("id of the interview question")
                .build();
        JsonSchemaElement titleSchema = JsonStringSchema.builder()
                .description("title of the interview question")
                .build();
        JsonSchemaElement answerSchema = JsonStringSchema.builder()
                .description("answer to the interview question")
                .build();

        // Root JSON object and the properties it contains
        JsonSchemaElement rootElement = JsonObjectSchema.builder()
                .addProperty("id", idSchema)
                .addProperty("title", titleSchema)
                .addProperty("answer", answerSchema)
                .required("id", "title", "answer")
                .build();
        // Array schema whose items are the object above (not used as the root in this listing)
        JsonSchemaElement rootElement2 = JsonArraySchema.builder()
                .description("interview question information")
                .items(rootElement)
                .build();


        // Set the response format
        ResponseFormat responseFormat = ResponseFormat.builder()
                .type(ResponseFormatType.JSON) // type can be either TEXT (default) or JSON
                .jsonSchema(JsonSchema.builder()
                        .name("Subject")
                        .rootElement(rootElement)
                        .build())
                .build();


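        UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions, and return each question's number, title, and answer");
        // Request object for the chat call
        ChatRequest chatRequest = ChatRequest.builder()
                // Set the response format
                .responseFormat(responseFormat)
                .messages(userMessage)
                .build();
        // Response object of the chat call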
        ChatResponse chatResponse = chatModel.chat(chatRequest);

        String output = chatResponse.aiMessage().text();
        System.out.println(output);
    }
}

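The output shows that GLM-4-Flash does not follow the JSON format specified by the JSON Schema. This is expected: as noted above, LangChain4j does not provide structured-output support for Zhipu's models.

We can, however, simply ask the model in the prompt to return JSON: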

package com.renr.langchain4jnew.app;

import com.renr.langchain4jnew.constant.CommonConstants;
import dev.langchain4j.community.model.zhipu.ZhipuAiChatModel;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.chat.request.json.*;
import dev.langchain4j.model.chat.response.ChatResponse;

import java.time.Duration;

/**
 * @Title: Return formatted output
 * @Author 老任与码
 * @Date 2025-04-09 11:05
 */
public class App11 {

    public static void main(String[] args) {
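        // GLM-4-Flash does not support setting the response format to JSON
        ZhipuAiChatModel chatModel = ZhipuAiChatModel.builder()
                // API key
                .apiKey(CommonConstants.API_KEY)
                // Sampling temperature: higher values make the output more random
                .temperature(0.9)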
                .model("GLM-4-Flash")
                .maxRetries(3)
                .callTimeout(Duration.ofSeconds(60))
                .connectTimeout(Duration.ofSeconds(60))
                .writeTimeout(Duration.ofSeconds(60))
                .readTimeout(Duration.ofSeconds(60))
                .build();

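        // UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions, and return each question's number, title, and answer");

        // A more detailed prompt that spells out the expected JSON format
        UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions and return the data as JSON, including each question's number (id), title (title), and answer (answer). Use the following JSON format as a reference: [{id:1, title:xxx, answer:xxx}]");
        // Request object for the chat call
        ChatRequest chatRequest = ChatRequest.builder()
                // Response format is deliberately not set here
                // .responseFormat(responseFormat)
                .messages(userMessage)
                .build();
        // Response object of the chat call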
        ChatResponse chatResponse = chatModel.chat(chatRequest);

        String output = chatResponse.aiMessage().text();
        System.out.println(output);
    }
}

With this approach the returned JSON needs some extra processing. For example, the leading ```json marker and the trailing ``` marker have to be stripped manually.
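The manual cleanup described above can be sketched roughly as follows. This is a minimal example using only the Java standard library. `FenceStripper` and `stripCodeFence` are hypothetical names introduced here, and the sketch assumes the model wraps its answer in a single Markdown code fence at most.

```java
public class FenceStripper {

    // Removes an opening fence line (with or without a language tag such as "json")
    // and a closing fence from the model output, if they are present.
    static String stripCodeFence(String output) {
        String text = output.strip();
        if (!text.startsWith("```")) {
            return text; // no fence, nothing to remove
        }
        int firstNewline = text.indexOf('\n');
        if (firstNewline < 0) {
            return text; // a single line is not treated as a fenced block
        }
        // Drop the whole opening fence line
        text = text.substring(firstNewline + 1);
        // Drop the closing fence if present
        if (text.endsWith("```")) {
            text = text.substring(0, text.length() - 3);
        }
        return text.strip();
    }

    public static void main(String[] args) {
        String raw = "```json\n[{\"id\": 1, \"title\": \"t\", \"answer\": \"a\"}]\n```";
        System.out.println(stripCodeFence(raw));
        // prints [{"id": 1, "title": "t", "answer": "a"}]
    }
}
```

The cleaned string can then be handed to whatever JSON library the project already uses.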

Testing the DeepSeek API through the OpenAI integration

package com.renr.langchain4jnew.app;

import com.renr.langchain4jnew.constant.CommonConstants;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.chat.request.json.*;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.openai.OpenAiChatModel;

/**
 * @Title: Return formatted output
 * @Author 老任与码
 * @Date 2025-04-09 11:05
 */
public class App11_2 {

    public static void main(String[] args) {
        // OpenAI only supports JsonObjectSchema as the root element
        ChatLanguageModel chatModel = OpenAiChatModel.builder()
                // Base URL of the DeepSeek API
                .baseUrl("https://api.deepseek.com")
                .apiKey(CommonConstants.DP_API_KEY)
                // Model name
                .modelName("deepseek-chat")
                // .modelName("deepseek-reasoner")
                // Log requests
                .logRequests(true)
                // Log responses
                .logResponses(true)
                .build();

        JsonSchemaElement idSchema = JsonIntegerSchema.builder()
                .description("id of the interview question")
                .build();
        JsonSchemaElement titleSchema = JsonStringSchema.builder()
                .description("title of the interview question")
                .build();
        JsonSchemaElement answerSchema = JsonStringSchema.builder()
                .description("answer to the interview question")
                .build();

        JsonSchemaElement rootElement = JsonObjectSchema.builder()
                .addProperty("id", idSchema)
                .addProperty("title", titleSchema)
                .addProperty("answer", answerSchema)
                .required("id", "title", "answer")
                .build();

        JsonSchemaElement rootElement2 = JsonArraySchema.builder()
                .description("interview question information")
                .items(rootElement)
                .build();

        // Wrap the array in an object, because the root must be a JsonObjectSchema
        JsonSchemaElement rootElement3 = JsonObjectSchema.builder()
                .description("interview question information")
                .addProperty("info", rootElement2)
                .required("info")
                .build();


        ResponseFormat responseFormat = ResponseFormat.builder()
                .type(ResponseFormatType.JSON) // type can be either TEXT (default) or JSON
//                .jsonSchema(JsonSchema.builder()
//                        .name("Subject")
//                        .rootElement(rootElement3)
//                        .build())
                .build();



        UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions, and return each question's number, title, and answer");
        ChatRequest chatRequest = ChatRequest.builder()
                .responseFormat(responseFormat)
                .messages(userMessage)
                .build();

        ChatResponse chatResponse = chatModel.chat(chatRequest);

        String output = chatResponse.aiMessage().text();
        System.out.println(output);
    }
}

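OpenAI only supports a JsonObjectSchema as the root element, so we added the extra rootElement3 object to wrap the array. DeepSeek, however, does not support the jsonSchema setting (the request fails with an error), so the jsonSchema(...) call on the ResponseFormat builder is commented out, as in the listing above.

Even with that change, running the code above still fails with an error.

DeepSeek's documentation on JSON output states that the prompt itself must contain the word "json" and should give an example of the expected JSON format.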

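So we change the prompt as follows:

UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions and return the data as JSON, including each question's number (id), title (title), and answer (answer). Use the following JSON format as a reference: [{id:1, title:xxx, answer:xxx}]");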

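The request now returns output, but its format is not the one we asked for.

We then make one more change: we comment out the code that sets the response format on the request (the .responseFormat(responseFormat) line) and test again.

By comparison, once the request no longer specifies JSON as the output format, the returned data is actually easier to process.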

Testing a locally deployed DeepSeek through Ollama

For the local deployment of DeepSeek, see: DeepSeek Local Deployment (CSDN blog)

package com.renr.langchain4jnew.app;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.chat.request.json.*;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.ollama.OllamaChatModel;

import java.time.Duration;

/**
 * @Title: Return formatted output
 * @Author 老任与码
 * @Date 2025-04-09 11:05
 */
public class App11_3 {

    public static void main(String[] args) {


        OllamaChatModel chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("deepseek-r1:1.5b")
                .timeout(Duration.ofSeconds(100))
                .temperature(1.3)
                .logRequests(true)
                .logResponses(true)
                .build();


        JsonSchemaElement idSchema = JsonIntegerSchema.builder()
                .description("id of the interview question")
                .build();
        JsonSchemaElement titleSchema = JsonStringSchema.builder()
                .description("title of the interview question")
                .build();
        JsonSchemaElement answerSchema = JsonStringSchema.builder()
                .description("answer to the interview question")
                .build();

        JsonSchemaElement rootElement = JsonObjectSchema.builder()
                .addProperty("id", idSchema)
                .addProperty("title", titleSchema)
                .addProperty("answer", answerSchema)
                .required("id", "title", "answer")
                .build();

        JsonSchemaElement rootElement2 = JsonArraySchema.builder()
                .description("interview question information")
                .items(rootElement)
                .build();

        // Wrap the array in an object and use that object as the root element
        JsonSchemaElement rootElement3 = JsonObjectSchema.builder()
                .description("interview question information")
                .addProperty("info", rootElement2)
                .required("info")
                .build();


        ResponseFormat responseFormat = ResponseFormat.builder()
                .type(ResponseFormatType.JSON)
                .jsonSchema(JsonSchema.builder()
                        .name("Subject")
                        .rootElement(rootElement3)
                        .build())
                .build();


        UserMessage userMessage = UserMessage.from("Please randomly generate 2 Java basics interview questions, and return each question's number, title, and answer");
        ChatRequest chatRequest = ChatRequest.builder()
                .responseFormat(responseFormat)
                .messages(userMessage)
                .build();

        ChatResponse chatResponse = chatModel.chat(chatRequest);

        String output = chatResponse.aiMessage().text();
        System.out.println(output);
    }
}

Output:

{
	"info": [{
		"id": 50,
		"title": "What is Java? Please write a short answer.",
		"answer": "Java is an object-oriented programming language that allows objects to be used in place of data types like String and int. It's known for its support of generics and extensibility."
	}, {
		"id": 48,
		"title": "Implementing a Linked List in Java...",
		"answer": "To implement a linked list in Java, we can create a class with nodes that hold integer data values. Each node will have pointers to the previous and next nodes."
	}]
}