FunASR Real-Time Speech Transcription: A Hands-On Vue Component Guide
Introduction: Pain Points and Solutions in Real-Time Speech Transcription
Are you still struggling to build a low-latency speech transcription system? Traditional approaches face three recurring challenges: complex audio stream processing, unstable WebSocket connections, and the latency of coordinating multiple models. Based on the open-source FunASR toolkit, this article walks through a complete Vue component example and builds an industrial-grade real-time transcription feature from scratch. By the end you will be able to:
- Capture and preprocess audio streams in the browser
- Implement full-duplex WebSocket communication
- Integrate FunASR's 2Pass real-time transcription architecture
- Optimize front-end noise suppression and audio chunking
Technical Architecture: How FunASR Real-Time Transcription Works
Core Technology Stack
| Module | Choice | Why |
|---|---|---|
| Audio capture | MediaRecorder API | Built-in browser API for chunked audio capture (note: it emits compressed Opus/WebM, not raw PCM) |
| Real-time transport | WebSocket | Low-latency bidirectional data transfer |
| Speech recognition | FunASR WebSocket service | Bundles VAD + ASR + punctuation restoration |
| Front-end framework | Vue 3 + Composition API | Component-based state management |
2Pass Transcription Flow
(Flow in outline: microphone audio → VAD segmentation → the streaming Paraformer model emits partial results in real time → at each VAD-detected endpoint the offline Paraformer model re-decodes the whole segment → punctuation restoration → final result.)
Component Development: Building the Transcription Component from Scratch
1. Project Setup and Dependencies
# Create the Vue project
vue create funasr-vue-demo
cd funasr-vue-demo
# Install core dependencies
npm install webrtc-adapter # cross-browser shims for audio device APIs
npm install crypto-js # optional: encrypting audio payloads
npm install ant-design-vue # UI component library
2. The Audio Capture Component
<template>
<div class="audio-recorder">
<a-button
:loading="isRecording"
@click="toggleRecording"
type="primary"
icon="sound"
>
{{ isRecording ? 'Stop Recording' : 'Start Recording' }}
</a-button>
<a-slider
v-model="volume"
:disabled="!isRecording"
class="volume-control"
/>
</div>
</template>
<script>
import { ref, onMounted, onUnmounted } from 'vue'
import { message } from 'ant-design-vue'
import 'webrtc-adapter' // side-effect import: patches browser inconsistencies
export default {
name: 'AudioRecorder',
emits: ['audioChunk'],
setup(_, { emit }) {
const isRecording = ref(false)
const volume = ref(0)
const mediaRecorder = ref(null)
const audioContext = ref(null)
const analyser = ref(null)
const stream = ref(null)
// Initialize the audio context
onMounted(() => {
audioContext.value = new (window.AudioContext || window.webkitAudioContext)()
analyser.value = audioContext.value.createAnalyser()
analyser.value.fftSize = 256
})
// Start recording
const startRecording = async () => {
try {
stream.value = await navigator.mediaDevices.getUserMedia({
audio: {
sampleRate: 16000,
channelCount: 1,
echoCancellation: true
}
})
// Feed the stream into the volume analyser
const source = audioContext.value.createMediaStreamSource(stream.value)
source.connect(analyser.value)
// Configure MediaRecorder
const options = {
mimeType: 'audio/webm;codecs=opus',
audioBitsPerSecond: 16000
}
mediaRecorder.value = new MediaRecorder(stream.value, options)
// Emit an audio chunk every 600 ms
mediaRecorder.value.start(600)
// Handle each chunk as it becomes available
mediaRecorder.value.ondataavailable = (e) => {
if (e.data.size > 0) {
processAudioChunk(e.data)
updateVolume()
}
}
isRecording.value = true
} catch (err) {
console.error('Failed to start recording:', err)
message.error('Please grant microphone access')
}
}
// Process an audio chunk
const processAudioChunk = async (chunk) => {
try {
// Decode the compressed chunk back to PCM.
// Caveat: MediaRecorder emits WebM/Opus fragments, and decodeAudioData
// may reject fragments that lack a container header; production code
// usually captures raw PCM with an AudioWorklet instead.
const arrayBuffer = await chunk.arrayBuffer()
const audioBuffer = await audioContext.value.decodeAudioData(arrayBuffer)
// Downsample to 16 kHz mono
const downsampled = downsampleBuffer(
audioBuffer.getChannelData(0),
audioBuffer.sampleRate,
16000
)
// Hand the chunk to the parent component
emit('audioChunk', {
data: downsampled,
timestamp: Date.now()
})
} catch (err) {
console.warn('Skipping an undecodable audio fragment:', err)
}
}
// Volume metering
const updateVolume = () => {
const dataArray = new Uint8Array(analyser.value.frequencyBinCount)
analyser.value.getByteFrequencyData(dataArray)
const avg = dataArray.reduce((a, b) => a + b, 0) / dataArray.length
volume.value = Math.min(100, Math.max(0, avg * 2))
}
// Stop recording
const stopRecording = () => {
if (mediaRecorder.value && mediaRecorder.value.state !== 'inactive') {
mediaRecorder.value.stop()
stream.value.getTracks().forEach(track => track.stop())
isRecording.value = false
}
}
// Helper: downsample by averaging neighbouring samples
const downsampleBuffer = (buffer, sampleRate, targetSampleRate) => {
if (sampleRate === targetSampleRate) return buffer
const ratio = sampleRate / targetSampleRate
const newLength = Math.round(buffer.length / ratio)
const result = new Float32Array(newLength)
let offsetResult = 0
let offsetBuffer = 0
while (offsetResult < result.length) {
const nextOffsetBuffer = Math.round((offsetResult + 1) * ratio)
let sum = 0, count = 0
for (let i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {
sum += buffer[i]
count++
}
result[offsetResult] = count > 0 ? sum / count : 0
offsetResult++
offsetBuffer = nextOffsetBuffer
}
return result
}
// Release audio resources when the component unmounts
onUnmounted(() => {
stopRecording()
if (audioContext.value) audioContext.value.close()
})
return {
isRecording,
volume,
toggleRecording: () => isRecording.value ? stopRecording() : startRecording()
}
}
}
</script>
3. The WebSocket Communication Module
<template>
<div class="funasr-transcriber">
<audio-recorder @audioChunk="handleAudioChunk" />
<div class="transcriptBox">
<div class="temp-result" v-if="tempResult">
{{ tempResult }}
</div>
<div class="final-result" v-for="(item, index) in finalResults" :key="index">
{{ item.text }}
</div>
</div>
</div>
</template>
<script>
import { ref, onMounted, onUnmounted } from 'vue'
import AudioRecorder from './AudioRecorder.vue'
export default {
components: { AudioRecorder },
setup() {
const ws = ref(null)
const tempResult = ref('')
const finalResults = ref([])
const isConnected = ref(false)
// Connect to the FunASR WebSocket service
const connectWebSocket = () => {
// Server address (the FunASR runtime image serves wss:// by default)
const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
const host = '127.0.0.1'
const port = '10095'
ws.value = new WebSocket(`${protocol}//${host}:${port}/`)
ws.value.onopen = () => {
console.log('WebSocket connected')
isConnected.value = true
// Per the FunASR protocol, send a JSON config frame before any audio
ws.value.send(JSON.stringify({
mode: '2pass',
chunk_size: [5, 10, 5],
chunk_interval: 10,
wav_name: 'funasr-vue-demo',
is_speaking: true
}))
}
ws.value.onmessage = (event) => {
const data = JSON.parse(event.data)
handleServerMessage(data)
}
ws.value.onerror = (error) => {
// onclose fires after onerror, so reconnection is handled there
console.error('WebSocket error:', error)
}
ws.value.onclose = () => {
console.log('WebSocket closed')
isConnected.value = false
reconnect()
}
}
// Reconnection logic
const reconnect = () => {
if (!isConnected.value) {
setTimeout(connectWebSocket, 3000)
}
}
// Handle server messages.
// FunASR replies with JSON like { mode, text, wav_name, is_final }:
// mode '2pass-online' carries streaming partial text,
// mode '2pass-offline' carries the corrected sentence-level result.
const handleServerMessage = (data) => {
if (data.mode === '2pass-online') {
// Update the provisional result
tempResult.value = data.text
} else if (data.mode === '2pass-offline') {
// Commit the corrected final result
finalResults.value.push({
text: data.text,
timestamp: Date.now()
})
tempResult.value = ''
}
}
// Send an audio chunk. FunASR expects raw 16-bit PCM as binary
// WebSocket frames, so convert the Float32 samples before sending.
const handleAudioChunk = (chunk) => {
if (ws.value && ws.value.readyState === WebSocket.OPEN) {
const float32 = chunk.data
const int16 = new Int16Array(float32.length)
for (let i = 0; i < float32.length; i++) {
const s = Math.max(-1, Math.min(1, float32[i]))
int16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF
}
ws.value.send(int16.buffer)
}
}
onMounted(() => connectWebSocket())
onUnmounted(() => {
if (ws.value) ws.value.close()
})
return {
tempResult,
finalResults,
handleAudioChunk
}
}
}
</script>
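One detail the component above omits: when the user stops recording, the FunASR protocol expects a closing {"is_speaking": false} frame so the server can flush the last 2pass-offline result. A minimal sketch you could call from the recorder's stop handler (the endAudioStream name is ours):
// Tell the server the utterance has ended so it emits the final result
const endAudioStream = () => {
if (ws.value && ws.value.readyState === WebSocket.OPEN) {
ws.value.send(JSON.stringify({ is_speaking: false }))
}
}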
Server Deployment and Integration
Quick Deployment with Docker
# Pull the image
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.10
# Start the container
mkdir -p ./models
sudo docker run -p 10095:10095 -it --privileged=true \
-v $PWD/models:/workspace/models \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.10
# Launch the 2Pass service (run inside the container)
cd FunASR/runtime
nohup bash run_server_2pass.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
--punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
--itn-dir thuduj12/fst_itn_zh > log.txt 2>&1 &
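Once the service is up, you can sanity-check connectivity from the browser console before wiring in the component (host and port follow the deployment above; the runtime image serves wss:// with a self-signed certificate by default, which the browser must be told to trust first, e.g. by visiting https://127.0.0.1:10095 once and accepting it):
const ws = new WebSocket('wss://127.0.0.1:10095/')
ws.onopen = () => console.log('FunASR server reachable')
ws.onerror = (e) => console.error('Connection failed:', e)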
Performance Tuning Parameters
| Parameter | Suggested value | Notes |
|---|---|---|
| chunk_size | "5,10,5" | Streaming chunk layout: left context / current chunk / right context |
| sample_rate | 16000 | Sampling rate in Hz (must match the model) |
| audio_buffer | 0.2 | Audio buffer length (seconds) |
| vad_silence_time | 600 | VAD silence threshold (ms) |
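To unpack what "5,10,5" means for latency: in the streaming Paraformer each unit corresponds to one 60 ms frame (our reading of the FunASR docs), so the arithmetic works out like this:
// chunk_size = [left context, current chunk, right context] in 60 ms frames
const chunkSize = [5, 10, 5]
const frameMs = 60
const chunkMs = chunkSize[1] * frameMs // 600 ms of audio per streaming chunk
const lookaheadMs = chunkSize[2] * frameMs // 300 ms of future context
console.log(`chunk ${chunkMs} ms, extra latency from lookahead ${lookaheadMs} ms`)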
Common Problems and Fixes
Q1: Captured audio has echo or noise
A: Enable the browser's built-in audio processing
// Capture constraints for getUserMedia (these are not MediaRecorder options)
const constraints = {
audio: {
echoCancellation: true,
noiseSuppression: true,
autoGainControl: true
}
}
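Pass them when requesting the stream:
const stream = await navigator.mediaDevices.getUserMedia(constraints)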
Q2: The WebSocket connection drops frequently
A: Add an application-level heartbeat
// Heartbeat: any lightweight frame keeps idle connections alive through NATs and proxies (check that your server ignores or answers this frame)
const heartbeatInterval = setInterval(() => {
if (ws.value && ws.value.readyState === WebSocket.OPEN) {
ws.value.send(JSON.stringify({ type: 'ping' }))
}
}, 30000)
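Clear the timer when the component unmounts, alongside the existing ws.close() call:
onUnmounted(() => {
clearInterval(heartbeatInterval)
if (ws.value) ws.value.close()
})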
Q3: Transcription latency exceeds 300 ms
A: Combine several optimizations
- Reduce the audio chunk size to 300 ms
- Move audio encoding into a Web Worker (a minimal sketch follows this list)
- Use GPU acceleration on the server (requires the GPU Docker image)
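A minimal sketch of the Web Worker approach, assuming the main thread posts Float32Array chunks and sends the resulting 16-bit PCM over the socket (the pcm-encoder.worker.js file name is ours):
// pcm-encoder.worker.js — converts Float32 samples to Int16 PCM off the main thread
self.onmessage = (e) => {
const float32 = e.data
const int16 = new Int16Array(float32.length)
for (let i = 0; i < float32.length; i++) {
const s = Math.max(-1, Math.min(1, float32[i]))
int16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF
}
// Transfer ownership of the buffer back without copying
self.postMessage(int16.buffer, [int16.buffer])
}

// Main thread
const worker = new Worker('pcm-encoder.worker.js')
worker.onmessage = (e) => ws.value.send(e.data)
// Inside handleAudioChunk: worker.postMessage(chunk.data)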
Summary and Extensions
This article walked through building a front-end speech transcription component on top of FunASR. The highlights:
- Low-latency architecture: the 2Pass mode balances real-time feedback against accuracy
- Component-based design: audio capture is decoupled from WebSocket communication
- Broad compatibility: works in Chrome, Firefox, Safari and other modern browsers
Directions to extend it:
- Custom hotwords (sent via the hotwords field in the WebSocket config frame; see the sketch after this list)
- Multi-language recognition (swap the model-dir parameters)
- Speech emotion analysis (integrate a FunASR emotion model)
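For example, hotwords ride along in the initial config frame as a JSON string that maps each phrase to a boost weight (field name per the FunASR WebSocket protocol; the words and weights here are placeholders):
ws.value.send(JSON.stringify({
mode: '2pass',
chunk_size: [5, 10, 5],
is_speaking: true,
// each hotword maps to a boost weight
hotwords: JSON.stringify({ 'FunASR': 20, 'Paraformer': 20 })
}))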
If this tutorial helped, a like or bookmark is appreciated; watch for the follow-up series, FunASR Model Fine-Tuning in Practice!
Disclosure: parts of this article were produced with AI assistance (AIGC) and are provided for reference only.