Background: speech denoising with the FRCRN model fails with an error.
Code:
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# frcrn
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')
result = ans(
    "/home/.cache/1732609719_audio.mp3",
    output_path='./output.mp3')
Error output:
Downloading Model to directory: /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,566 - modelscope - WARNING - Model revision not specified, use revision: v1.0.2
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from location /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k.
2024-11-26 16:35:58,940 - modelscope - INFO - initialize model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor field found in cfg.
2024-11-26 16:35:59,469 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-11-26 16:35:59,469 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k'}. trying to build by task and model information.
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor key ('speech_frcrn_ans_cirm_16k', 'acoustic-noise-suppression') found in PREPROCESSOR_MAP, skip building preprocessor.
Traceback (most recent call last):
  File "/home/qdm/emo_rec/../frcrn/main.py", line 8, in <module>
    result = ans(
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 220, in __call__
    output = self._process_single(input, *args, **kwargs)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 248, in _process_single
    out = self.preprocess(input, **preprocess_params)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/audio/ans_pipeline.py", line 48, in preprocess
    data1, fs = sf.read(io.BytesIO(file_bytes))
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 285, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x7b6d31be4db0>: Format not recognised.
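Cause and fix:
From what I could find by searching, the problem is that the soundfile library could not recognise the format of the compressed MP3 file. soundfile mainly supports uncompressed or lossless audio formats (such as WAV and FLAC), and its support for lossy compressed formats (such as MP3) is limited and depends on the underlying libsndfile build.
The fix is to convert the file to an uncompressed WAV file with ffmpeg first, and then run the pipeline on the converted file.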
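Before converting anything, it can help to confirm what kind of file is actually being passed in, since a file extension can be misleading. The sketch below is a heuristic only: `sniff_audio_format` is a hypothetical helper name (not part of soundfile or modelscope), and it only looks at the well-known leading bytes of WAV, FLAC, and MP3 files.

```python
def sniff_audio_format(path):
    """Guess the audio container from the first bytes of a file (heuristic)."""
    with open(path, "rb") as f:
        header = f.read(12)

    # WAV: "RIFF" <4-byte size> "WAVE"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # FLAC streams start with the marker "fLaC"
    if header[:4] == b"fLaC":
        return "flac"
    # MP3 files commonly start with an ID3v2 tag ...
    if header[:3] == b"ID3":
        return "mp3"
    # ... or directly with an MPEG audio frame sync (11 set bits)
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"
```

If this returns "mp3" for the input file, converting it to WAV before calling the pipeline is the safer path.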
Install ffmpeg:
sudo apt install ffmpeg
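If the conversion is going to be driven from Python, a small check that the ffmpeg binary is reachable gives a clearer message than a failed subprocess call. This is a minimal sketch using only the standard library; `require_tool` is a hypothetical helper name.

```python
import shutil


def require_tool(name="ffmpeg"):
    """Return the full path of an executable on PATH, or raise if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} was not found on PATH; install it first")
    return path
```

Calling `require_tool()` once at the start of a script is enough to fail early when ffmpeg is not installed.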
ffmpeg example:
ffmpeg -i input.mp3 -acodec pcm_s16le -ar 16000 output.wav
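After a conversion like this, the result can be checked with Python's standard `wave` module, without needing soundfile. The sketch below assumes the target is 16 kHz, mono, 16-bit PCM (the model name suggests 16 kHz input; mono is an assumption here). `describe_wav` and `is_16k_mono_pcm16` are hypothetical helper names.

```python
import wave


def describe_wav(path):
    """Return (sample_rate, channels, bits_per_sample) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8


def is_16k_mono_pcm16(path):
    """True if the file is 16 kHz, single-channel, 16-bit PCM."""
    return describe_wav(path) == (16000, 1, 16)
```

For example, `is_16k_mono_pcm16("output.wav")` can be used as a sanity check before handing the file to the denoising pipeline.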
The code can then be changed to:
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import subprocess
import sys

input_file = "/home/.cache/1732609719_audio.mp3"
output_file = "/home/.cache/1732609719_audio2.wav"  # converted WAV file

# Build the ffmpeg command: 16-bit PCM WAV, 16 kHz sample rate, mono
command = [
    "ffmpeg",
    "-y",  # overwrite output_file if it already exists instead of prompting
    "-i", input_file,
    "-acodec", "pcm_s16le",  # uncompressed 16-bit PCM
    "-ar", "16000",  # 16 kHz sample rate
    "-ac", "1",  # mono
    output_file
]

# Run the command
result = subprocess.run(command, capture_output=True, text=True)

# Report the outcome of the ffmpeg run
if result.returncode == 0:
    print(f"Audio successfully converted to: {output_file}")
else:
    print(f"Error occurred during conversion: {result.stderr}")
    # Stop here if the ffmpeg conversion failed
    sys.exit(1)

# Run speech enhancement with the FRCRN model
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')

# Pass the converted WAV file and write the result as WAV
result = ans(
    output_file,
    output_path='./output.wav')  # output format changed to WAV
print("Noise suppression complete, output saved to './output.wav'")
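When more than one file needs this treatment, the conversion step can be factored into small helpers. The following is only a sketch under stated assumptions: `needs_conversion`, `wav_target_for`, and `build_ffmpeg_command` are hypothetical names, the "same folder, `.wav` suffix" naming rule is an assumption, and the `-y` flag (overwrite without prompting) is included as a convenience.

```python
from pathlib import Path


def needs_conversion(input_file):
    """Treat anything without a .wav suffix as needing conversion (assumption)."""
    return Path(input_file).suffix.lower() != ".wav"


def wav_target_for(input_file):
    """Hypothetical naming rule: same folder and stem, with a .wav suffix."""
    return Path(input_file).with_suffix(".wav")


def build_ffmpeg_command(input_file, output_file, sample_rate=16000):
    """Build the ffmpeg argument list for 16-bit PCM, mono WAV output."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output without prompting
        "-i", str(input_file),
        "-acodec", "pcm_s16le",    # uncompressed 16-bit PCM
        "-ar", str(sample_rate),   # target sample rate in Hz
        "-ac", "1",                # mono
        str(output_file),
    ]
```

The returned list can be passed straight to `subprocess.run(...)`; files for which `needs_conversion` returns False can be handed to the pipeline directly.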
Reference links: