A Hands-On Guide to ChatGPT-Style Streaming AI Output

Maolei_NyaRu_

Published 2025-06-30 10:35:45

License: CC 4.0 BY-SA

Tags: chatgpt, artificial intelligence

Article link: https://blog.youkuaiyun.com/Maolei_NyaRu_/article/details/149017663

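As a tech blogger, I keep a close eye on developments in AI. In this post I share how to build the kind of streaming output ChatGPT uses, so that your AI application feels more lively and engaging.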

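Preface

Hi everyone. Today I want to share a really cool technique: streaming AI output. Anyone who has used ChatGPT has probably been drawn in by the way its answers appear to be "typed out" a few characters at a time. Streaming output improves the user experience and also makes the AI's answers feel more natural.

This article explains how to achieve the effect on both the backend and the frontend, with complete code examples. The tutorial draws on the Alibaba Cloud documentation "Alibaba Cloud ModelStudio DeepSeek API" and has been adjusted based on a real project.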

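What is streaming output?

Streaming output means the server does not wait until the full response has been generated. Instead, it sends data to the client piece by piece while generation is still in progress. This has several advantages:

- Better user experience: users do not have to wait for the full response and see partial results right away
- Lower perceived latency: even when the full response takes a long time, users can tell the system is still working
- A typing-like effect: the AI's answer looks more natural and feels more human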
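
To make the contrast concrete, here is a minimal JavaScript sketch that is not taken from the project. A hypothetical `generateTokens` generator stands in for the model, `respondBuffered` returns only after everything has been generated, and `respondStreaming` hands each piece to a callback as soon as it exists. All three names and the sample tokens are assumptions for illustration.

```javascript
// Hypothetical stand-in for a model that produces its answer piece by piece.
function* generateTokens() {
  yield 'Stream';
  yield 'ing ';
  yield 'works';
}

// Buffered: the caller sees nothing until the whole answer is ready.
function respondBuffered() {
  let full = '';
  for (const token of generateTokens()) {
    full += token;
  }
  return full;
}

// Streaming: every piece is handed to onChunk as soon as it is produced.
function respondStreaming(onChunk) {
  for (const token of generateTokens()) {
    onChunk(token);
  }
}

// Record what a UI would display after each chunk arrives.
const snapshots = [];
let shown = '';
respondStreaming((chunk) => {
  shown += chunk;
  snapshots.push(shown);
});
console.log(snapshots); // ['Stream', 'Streaming ', 'Streaming works']
```

The buffered version produces the same final text, but the user would only ever see the last snapshot.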

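Technical implementation

Backend implementation (Spring Boot + Alibaba Cloud Bailian API)

On the backend, we use Spring WebFlux together with the Alibaba Cloud Bailian API to stream real AI responses. Spring WebFlux is the reactive web framework in Spring Framework. It is built on the Reactor library and supports non-blocking, reactive programming.

The complete core code follows: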

package com.aso.aistreamoutput;

import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.adapter.rxjava.RxJava2Adapter;
import reactor.core.publisher.Flux;

import java.util.Arrays;

@RestController
@RequestMapping("/callStream")
public class Controller {
    public GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
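                // "sk-xxx" is a placeholder: replace it with your Alibaba Cloud Bailian API key,
                // or read the key from the environment with .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .apiKey("sk-xxx")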
                .model("deepseek-r1")
                .messages(Arrays.asList(userMsg))
                // Do not set this to "text"
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .incrementalOutput(true)
                .build();
    }
    public Flowable<GenerationResult> streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        return gen.streamCall(param);
    }

    @PostMapping
    public Flux<String> getReply(@RequestBody String prompt) throws NoApiKeyException, InputRequiredException {
        Generation gen = new Generation();
        Message userMsg = Message.builder().role(Role.USER.getValue()).content(prompt).build();
        Flowable<GenerationResult> message = streamCallWithMessage(gen, userMsg);


        // Convert the RxJava Flowable into a Reactor Flux
        return RxJava2Adapter.flowableToFlux(message)
                .map(generationResult -> {
                    // Extract the text content of this chunk
                    return generationResult.getOutput()
                            .getChoices()
                            .get(0)
                            .getMessage()
                            .getContent();
                })
                .filter(content -> !content.isEmpty()); // Drop empty chunks
    }
}

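Key points of this code:

- Alibaba Cloud Bailian API integration: the Alibaba Cloud Bailian API (DashScope) performs the AI text generation
- Parameter configuration: buildGenerationParam sets the call parameters, including the API key, the model, and incremental output
- Streaming call: streamCall performs a streaming API call and returns an RxJava Flowable
- Reactive conversion: the RxJava Flowable is converted into a Reactor Flux, which Spring WebFlux can return directly
- Content extraction: the actual text content is pulled out of each API response
- Filtering: empty content is dropped so the frontend only receives meaningful data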
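
The extraction and filtering steps can also be sketched in plain JavaScript, purely as an illustration of the data flow. The object shape below mirrors the getters used in the controller (output, choices, message, content); the sample data is made up, and `extractContent` and `visibleChunks` are hypothetical helper names. Unlike the Java chain, this sketch also tolerates a missing level instead of throwing.

```javascript
// Mirrors getOutput().getChoices().get(0).getMessage().getContent(),
// but returns '' instead of throwing when any level is missing.
function extractContent(result) {
  return result?.output?.choices?.[0]?.message?.content ?? '';
}

// Keep only the chunks that carry visible text.
function visibleChunks(results) {
  return results.map(extractContent).filter((content) => content.length > 0);
}

// Made-up sample results shaped like the fields the controller reads.
const sampleResults = [
  { output: { choices: [{ message: { content: '' } }] } },
  { output: { choices: [{ message: { content: 'Hel' } }] } },
  { output: { choices: [] } },
  { output: { choices: [{ message: { content: 'lo' } }] } },
];
console.log(visibleChunks(sampleResults)); // ['Hel', 'lo']
```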

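The advantages of this approach are:

- It uses a real large language model (deepseek-r1) to generate high-quality replies
- incrementalOutput(true) enables incremental output, so each chunk carries only the newly generated text and the response is streamed for real
- Combined with Spring WebFlux's reactive programming model, it handles concurrent requests efficiently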
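
Why incremental output matters here can be shown with a small JavaScript sketch. The controller writes every chunk straight into the HTTP response body, so the chunks are effectively concatenated. If each chunk repeated the full text so far (cumulative mode), the concatenation would contain duplicates. The sample chunk arrays and the helper names `concatenate` and `toDeltas` are assumptions for illustration.

```javascript
// Concatenation is what the response body amounts to when chunks are
// written one after another.
function concatenate(chunks) {
  let text = '';
  for (const chunk of chunks) {
    text += chunk;
  }
  return text;
}

// Convert cumulative snapshots into deltas by cutting off the part
// that was already present in the previous snapshot.
function toDeltas(snapshots) {
  const deltas = [];
  let previous = '';
  for (const snapshot of snapshots) {
    deltas.push(snapshot.slice(previous.length));
    previous = snapshot;
  }
  return deltas;
}

const incrementalChunks = ['He', 'llo', '!'];         // each chunk is new text only
const cumulativeChunks = ['He', 'Hello', 'Hello!'];   // each chunk repeats earlier text

console.log(concatenate(incrementalChunks));          // Hello!
console.log(concatenate(cumulativeChunks));           // HeHelloHello!
console.log(concatenate(toDeltas(cumulativeChunks))); // Hello!
```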

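Configuring the Alibaba Cloud Bailian API

Before using the code above, complete the following preparation steps:

Get an API key
- Visit Alibaba Cloud Bailian
- Register or log in to an Alibaba Cloud account
- Create an API key in the console
- Replace sk-xxx in the code with the API key you obtained

Add the Maven dependencies
Add the following dependencies to pom.xml: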

<!-- Alibaba Cloud Bailian (DashScope) SDK -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>dashscope-sdk-java</artifactId>
    <version>2.8.5</version>
</dependency>

<!-- Spring WebFlux -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

<!-- RxJava 2 adapter (provides RxJava2Adapter) -->
<dependency>
    <groupId>io.projectreactor.addons</groupId>
    <artifactId>reactor-adapter</artifactId>
</dependency>

Project source code

https://gitee.com/etilic/AiStreamOutput.git

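Troubleshooting common problems

API key configuration
- Configuring the API key through an environment variable is recommended
- Add DASHSCOPE_API_KEY to the system environment variables
- Alternatively, set it directly in code: .apiKey("your API key")

Cross-origin (CORS) issues
If you run into cross-origin errors, you can add the following configuration class: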

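import org.springframework.context.annotation.Configuration;
import org.springframework.web.reactive.config.CorsRegistry;
import org.springframework.web.reactive.config.WebFluxConfigurer;

// The project runs on Spring WebFlux, so CORS is configured through WebFluxConfigurer.
// (WebMvcConfigurer only takes effect in a Spring MVC application.)
@Configuration
public class WebConfig implements WebFluxConfigurer {
    @Override
    public void addCorsMappings(CorsRegistry registry) {
        registry.addMapping("/**")
                // Origin of the Vite dev server
                .allowedOrigins("http://localhost:5173")
                .allowedMethods("GET", "POST", "PUT", "DELETE", "OPTIONS")
                .allowedHeaders("*")
                .allowCredentials(true);
    }
}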

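Streaming response timeouts
If responses time out, be aware that spring.webflux.timeout does not appear to be a documented Spring Boot property, so a line such as spring.webflux.timeout=30000 in application.properties would most likely have no effect. The two settings below are valid, but they only keep the default base path and static resource pattern. A time limit on the stream itself is usually applied in code, for example with Reactor's Flux.timeout(Duration) operator on the returned Flux.

spring.webflux.base-path=/
spring.webflux.static-path-pattern=/**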

Frontend implementation (Vue 3)

The frontend uses Vue 3 and Axios to handle the streamed response. The core code follows:

<!-- ChatDialog.vue -->
<script setup>
import { ref } from 'vue';
import axios from 'axios';

const inputMessage = ref('');
const messages = ref([]);
const loading = ref(false);

const sendMessage = async () => {
  if (!inputMessage.value.trim()) return;
  
  // Add the user's message
  const userMessage = inputMessage.value;
  messages.value.push({
    content: userMessage,
    isUser: true
  });
  
  // Add an empty placeholder for the AI's message
  const aiMessageIndex = messages.value.length;
  messages.value.push({
    content: '',
    isUser: false
  });
  
  inputMessage.value = '';
  loading.value = true;
  
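  try {
    // Send the request and handle the streamed response.
    // The backend reads the raw request body as the prompt (@RequestBody String),
    // so the message is sent as plain text rather than wrapped in a JSON object.
    const response = await axios.post('/callStream', userMessage, {
      headers: { 'Content-Type': 'text/plain' },
      responseType: 'text',
      onDownloadProgress: (progressEvent) => {
        // axios 1.x exposes the native event as progressEvent.event;
        // older versions pass the native ProgressEvent directly.
        const xhr = (progressEvent.event ?? progressEvent).target;
        // xhr.response holds all text received so far, so replace rather than append
        messages.value[aiMessageIndex].content = xhr.response;
      }
    });
    
    // Make sure the final content is applied
    messages.value[aiMessageIndex].content = response.data;
  } catch (error) {
    console.error('Error:', error);
    messages.value[aiMessageIndex].content = 'Sorry, something went wrong. Please try again later.';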
  } finally {
    loading.value = false;
  }
};
</script>

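Key points of this code:

- ref() creates the reactive variables that hold the messages and the loading state
- When a message is sent, the user's message and an empty AI placeholder are added first
- The POST request is sent with Axios, with responseType: 'text'
- The key piece is the onDownloadProgress callback, which fires whenever new data arrives
- The callback updates the AI message content, which produces the streaming display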
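
onDownloadProgress hands back the whole response text received so far on every call. An alternative that some projects prefer is to read the body chunk by chunk with the browser's fetch API and decode it incrementally. The sketch below covers only the decoding step, which is easy to get wrong with Chinese text: a multi-byte UTF-8 character can be split across two network chunks, and TextDecoder with { stream: true } holds the partial bytes until the rest arrives. `createChunkAppender` and the sample bytes are hypothetical, not part of the original project.

```javascript
// Returns a function that accepts raw byte chunks and returns the full
// text decoded so far. `createChunkAppender` is a hypothetical helper name.
function createChunkAppender() {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  return function append(bytes) {
    // stream: true keeps an incomplete multi-byte character buffered
    // inside the decoder instead of emitting a replacement character.
    text += decoder.decode(bytes, { stream: true });
    return text;
  };
}

// Simulate "你好" (6 bytes in UTF-8) arriving split in the middle of a character.
const encoded = new TextEncoder().encode('你好');
const append = createChunkAppender();
const afterFirst = append(encoded.slice(0, 4)); // "你" plus the first byte of "好"
const afterSecond = append(encoded.slice(4));   // the remaining two bytes of "好"
console.log(afterFirst);  // 你
console.log(afterSecond); // 你好
```

In a browser, this would typically be driven by response.body.getReader(): call append(value) for each { value, done } result from reader.read() and assign the returned text to the message's content.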

Frontend template

<template>
  <div class="chat-container">
    <div class="messages-container" ref="messagesContainer">
      <div v-for="(msg, index) in messages" :key="index" 
           :class="['message', msg.isUser ? 'user-message' : 'ai-message']">
        <div class="message-content">{{ msg.content }}</div>
      </div>
    </div>
    
    <div class="input-container">
      <el-input 
        v-model="inputMessage" 
        placeholder="Type your question..." 
        :disabled="loading"
        @keyup.enter="sendMessage"
      />
      <el-button 
        type="primary" 
        @click="sendMessage" 
        :loading="loading"
      >
        Send
      </el-button>
    </div>
  </div>
</template>

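Configuring a proxy to avoid cross-origin issues

Cross-origin problems are common in frontend development. When Vite is the build tool, they can be avoided during development by configuring a proxy: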

// vite.config.js
import { defineConfig } from 'vite';

export default defineConfig({
  // ...other configuration
  server: {
    proxy: {
      '/callStream': {
        target: 'http://localhost:8080',
        changeOrigin: true,
        secure: false
      }
    }
  }
})