FRCRN speech enhancement model (speech denoising) fails with "Error opening <_io.BytesIO object at 0x7b6d31be4db0>: Format not recognised."

weixin_51923349

Last modified 2024-11-27 13:06:31

Tags: python, linux, programming languages

First published 2024-11-26 17:37:18

Article link: https://blog.youkuaiyun.com/weixin_51923349/article/details/144064304

Background: running speech denoising with the FRCRN model raises an error.

Code:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks


# FRCRN noise-suppression pipeline
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')
result = ans(
    '/home/qdm/.cache/1732609719_audio.mp3',  # MP3 input, as in the run that produced the error below
    output_path='./output.mp3')

Error output:

Downloading Model to directory: /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,566 - modelscope - WARNING - Model revision not specified, use revision: v1.0.2
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from location /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k.
2024-11-26 16:35:58,940 - modelscope - INFO - initialize model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor field found in cfg.
2024-11-26 16:35:59,469 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-11-26 16:35:59,469 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k'}. trying to build by task and model information.
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor key ('speech_frcrn_ans_cirm_16k', 'acoustic-noise-suppression') found in PREPROCESSOR_MAP, skip building preprocessor.
Traceback (most recent call last):
  File "/home/qdm/emo_rec/../frcrn/main.py", line 8, in <module>
    result = ans(
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 220, in __call__
    output = self._process_single(input, *args, **kwargs)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 248, in _process_single
    out = self.preprocess(input, **preprocess_params)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/audio/ans_pipeline.py", line 48, in preprocess
    data1, fs = sf.read(io.BytesIO(file_bytes))
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 285, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x7b6d31be4db0>: Format not recognised.

Cause and fix:

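From what I found searching Baidu, the problem is that the soundfile library cannot recognise the compressed MP3 file. As the traceback shows, the pipeline reads the input with sf.read(io.BytesIO(file_bytes)). soundfile mainly supports uncompressed or lossless audio formats (such as WAV and FLAC); its support for lossy compressed formats (such as MP3) is limited and depends on the underlying libsndfile build.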
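Before converting anything, it can help to confirm what kind of file is actually being handed to the pipeline, since a file extension is not always reliable. The following is a minimal sketch using only the standard library; `sniff_audio_format` is a hypothetical helper (not part of modelscope or soundfile), and it checks just a few well-known header signatures, so treat its answer as a hint rather than a guarantee.

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess the audio container from the first bytes of a file (heuristic)."""
    # WAV: RIFF header with the form type "WAVE" at offset 8
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # FLAC streams start with the marker "fLaC"
    if data[:4] == b"fLaC":
        return "flac"
    # Ogg containers (Vorbis, Opus, ...) start with "OggS"
    if data[:4] == b"OggS":
        return "ogg"
    # An ID3v2 tag usually sits in front of MP3 data
    if data[:3] == b"ID3":
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set (0xFF plus top 3 bits of next byte).
    # Other MPEG-style streams can match too, so this is only a likely MP3.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"
```

To use it on a real file, read the first few bytes in binary mode, for example `sniff_audio_format(open(path, "rb").read(16))`, where `path` is whatever input you intend to pass to the pipeline. A result of "mp3" would be consistent with the error above.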

The fix is to convert the file to an uncompressed WAV with ffmpeg first, then run the model on the converted file.

Install ffmpeg:

sudo apt install ffmpeg

ffmpeg example:

ffmpeg -i input.mp3 -acodec pcm_s16le -ar 16000 output.wav
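After a conversion like this, it may be worth checking that the resulting file really is 16-bit PCM at 16 kHz, which is what the `-acodec pcm_s16le -ar 16000` options request and what the `_16k` suffix in the model name suggests it expects. Below is a small sketch using Python's standard `wave` module; `describe_wav` and `is_16k_pcm16` are hypothetical helper names introduced here for illustration.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the basic PCM parameters of a WAV file with the stdlib wave module."""
    with wave.open(path, "rb") as wf:
        return {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "bits_per_sample": wf.getsampwidth() * 8,
            "duration_s": wf.getnframes() / wf.getframerate(),
        }


def is_16k_pcm16(path: str) -> bool:
    """True if the file is 16-bit PCM sampled at 16 kHz (any channel count)."""
    info = describe_wav(path)
    return info["sample_rate"] == 16000 and info["bits_per_sample"] == 16
```

Calling `describe_wav("output.wav")` on the converted file should report a sample rate of 16000 and 16 bits per sample if the ffmpeg command ran as intended. `wave.open` raises `wave.Error` for files it cannot parse, which is itself a useful signal that the conversion did not produce plain PCM.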

The code can then be changed to:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import subprocess
import sys

input_file = "/home/.cache/1732609719_audio.mp3"
output_file = "/home/.cache/1732609719_audio2.wav"  # converted WAV file

# Build the ffmpeg command: 16-bit PCM WAV, 16 kHz sample rate, mono
command = [
    "ffmpeg",
    "-y",                    # overwrite output_file if it exists, so ffmpeg never waits for a prompt
    "-i", input_file,
    "-acodec", "pcm_s16le",  # uncompressed 16-bit PCM
    "-ar", "16000",          # 16 kHz sample rate
    "-ac", "1",              # mono
    output_file
]

# Run the conversion
conversion = subprocess.run(command, capture_output=True, text=True)

# Report the outcome of the ffmpeg run
if conversion.returncode == 0:
    print(f"Audio successfully converted to: {output_file}")
else:
    print(f"Error occurred during conversion: {conversion.stderr}")
    # Stop here if the ffmpeg conversion failed
    sys.exit(1)

# Run speech enhancement with the FRCRN model
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')

# Denoise the converted WAV and write the result as WAV
result = ans(
    output_file,
    output_path='./output.wav')  # output format changed to WAV

print("Noise suppression complete, output saved to './output.wav'")
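If inputs arrive in mixed formats, the conversion step could be wrapped in a small helper that only calls ffmpeg when the file is not already a WAV. This is a sketch under stated assumptions: `wav_path_for`, `ffmpeg_command`, and `ensure_wav` are hypothetical names, the "same stem, `.wav` suffix" naming rule is my own choice, and the command mirrors the options in the script above plus `-y` (overwrite an existing output), which is an addition here.

```python
import shutil
import subprocess
from pathlib import Path


def wav_path_for(src: str) -> Path:
    """Hypothetical naming rule: place the WAV next to the source, same stem."""
    return Path(src).with_suffix(".wav")


def ffmpeg_command(src: str, dst: str) -> list:
    """Build the conversion command: 16-bit PCM, 16 kHz, mono, overwrite output."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", str(dst)]


def ensure_wav(src: str) -> str:
    """Return a WAV path for src, converting with ffmpeg only when needed."""
    if Path(src).suffix.lower() == ".wav":
        # Assumed usable as-is; this sketch does not resample existing WAVs
        return str(src)
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    dst = wav_path_for(src)
    # check=True raises CalledProcessError if ffmpeg exits with a non-zero code
    subprocess.run(ffmpeg_command(src, dst), check=True,
                   capture_output=True, text=True)
    return str(dst)
```

With this in place, the pipeline call could become `ans(ensure_wav(input_file), output_path='./output.wav')`. Raising an exception on failure, rather than printing and exiting, leaves it to the caller to decide how to handle a bad input.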

Reference link:

ModelScope community (魔搭社区)