FRCRN speech enhancement model (speech denoising) fails with Error opening <_io.BytesIO object at 0x7b6d31be4db0>: Format not recognised.

Background: speech denoising with the FRCRN model raises an error.

Code:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks


# FRCRN: this call fails because the input is an MP3 file
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')
result = ans(
    "/home/.cache/1732609719_audio.mp3",
    output_path='./output.mp3')

Error output:

Input file passed to ans(): '/home/qdm/.cache/1732609719_audio.mp3'

Downloading Model to directory: /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,566 - modelscope - WARNING - Model revision not specified, use revision: v1.0.2
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:58,940 - modelscope - INFO - initiate model from location /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k.
2024-11-26 16:35:58,940 - modelscope - INFO - initialize model from /root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor field found in cfg.
2024-11-26 16:35:59,469 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-11-26 16:35:59,469 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/root/.cache/modelscope/hub/damo/speech_frcrn_ans_cirm_16k'}. trying to build by task and model information.
2024-11-26 16:35:59,469 - modelscope - WARNING - No preprocessor key ('speech_frcrn_ans_cirm_16k', 'acoustic-noise-suppression') found in PREPROCESSOR_MAP, skip building preprocessor.
Traceback (most recent call last):
  File "/home/qdm/emo_rec/../frcrn/main.py", line 8, in <module>
    result = ans(
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 220, in __call__
    output = self._process_single(input, *args, **kwargs)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/base.py", line 248, in _process_single
    out = self.preprocess(input, **preprocess_params)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/modelscope/pipelines/audio/ans_pipeline.py", line 48, in preprocess
    data1, fs = sf.read(io.BytesIO(file_bytes))
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 285, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/root/anaconda3/envs/speech/lib/python3.10/site-packages/soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x7b6d31be4db0>: Format not recognised.

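Cause and fix:

From what I found by searching on Baidu, the problem is that the soundfile library cannot recognise the format of the compressed MP3 file. soundfile is a wrapper around libsndfile, so what it can decode depends on the installed libsndfile build. Uncompressed and lossless formats such as WAV and FLAC are well supported, while support for lossy formats such as MP3 is limited: MP3 decoding was only added in libsndfile 1.1.0, so older builds reject MP3 data with "Format not recognised".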
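The traceback shows that the pipeline passes raw bytes to soundfile (sf.read(io.BytesIO(file_bytes))), so the file extension plays no role and only the content matters. As a rough sanity check before calling the pipeline, the sketch below guesses the container from the first bytes of a file. The names sniff_audio_format and sniff_file are made up for illustration, and the MP3 test is only a heuristic (an ID3 tag or an MPEG frame sync), not a full parser.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file (heuristic)."""
    # WAV: "RIFF" <4-byte size> "WAVE"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # FLAC streams start with the marker "fLaC"
    if header[:4] == b"fLaC":
        return "flac"
    # Ogg pages start with "OggS"
    if header[:4] == b"OggS":
        return "ogg"
    # MP3 files often begin with an ID3v2 tag ...
    if header[:3] == b"ID3":
        return "mp3"
    # ... or directly with an MPEG frame sync (11 set bits)
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of `path` and guess its format."""
    with open(path, "rb") as f:
        return sniff_audio_format(f.read(12))
```

If sniff_file returns "mp3" (or "unknown") for the input, converting it to WAV first, as described next, is the safer route.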

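The fix is to convert the file to an uncompressed format (WAV) with ffmpeg first, and then pass the converted file to the model.

Install ffmpeg: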

sudo apt install ffmpeg

ffmpeg example:

ffmpeg -i input.mp3 -acodec pcm_s16le -ar 16000 output.wav
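After converting, it can be worth confirming that the WAV file really has the expected parameters. The sketch below uses Python's standard wave module; check_wav is a hypothetical helper name, and the mono expectation is an assumption taken from the "-ac 1" option used in the script later in this post (the one-line command above does not change the channel count).

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of mismatches for a PCM WAV file; an empty list means OK."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != expected_rate:
            problems.append(f"sample rate is {wf.getframerate()}, expected {expected_rate}")
        if wf.getnchannels() != expected_channels:
            problems.append(f"channel count is {wf.getnchannels()}, expected {expected_channels}")
        # pcm_s16le means 16-bit samples, i.e. 2 bytes per sample
        if wf.getsampwidth() != 2:
            problems.append(f"sample width is {wf.getsampwidth()} bytes, expected 2")
    return problems
```

For example, check_wav("output.wav") returning an empty list suggests the conversion produced 16 kHz, mono, 16-bit PCM audio.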

The code can then be changed to:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import subprocess
import sys

input_file = "/home/.cache/1732609719_audio.mp3"
output_file = "/home/.cache/1732609719_audio2.wav"  # converted WAV file

# Build the ffmpeg command: uncompressed 16-bit PCM WAV, 16 kHz, mono
command = [
    "ffmpeg",
    "-y",                    # overwrite output_file instead of waiting for a prompt
    "-i", input_file,
    "-acodec", "pcm_s16le",  # 16-bit PCM (uncompressed)
    "-ar", "16000",          # 16 kHz sample rate
    "-ac", "1",              # mono
    output_file
]

# Run the conversion
result = subprocess.run(command, capture_output=True, text=True)

# Report the outcome of the ffmpeg run
if result.returncode == 0:
    print(f"Audio successfully converted to: {output_file}")
else:
    print(f"Error occurred during conversion: {result.stderr}")
    # Stop here if the ffmpeg conversion failed
    sys.exit(1)

# Run speech enhancement with the FRCRN model
ans = pipeline(
    Tasks.acoustic_noise_suppression,
    model='damo/speech_frcrn_ans_cirm_16k')

# Denoise the converted file and write the result as WAV
result = ans(
    output_file,
    output_path='./output.wav')  # output format changed to WAV

print("Noise suppression complete, output saved to './output.wav'")
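If several files need the same treatment, the conversion step can be pulled into a small helper. This is only a sketch under assumptions: build_ffmpeg_command and to_wav are hypothetical names, ffmpeg is expected to be on PATH, and the "-y" flag (overwrite the output without prompting) is included so the call cannot stall on an existing file.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, rate=16000, channels=1):
    """Build the ffmpeg argument list for a 16-bit PCM WAV conversion."""
    return [
        "ffmpeg", "-y",
        "-i", str(src),
        "-acodec", "pcm_s16le",
        "-ar", str(rate),
        "-ac", str(channels),
        str(dst),
    ]


def to_wav(src, dst=None):
    """Convert `src` to WAV with ffmpeg and return the output path."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix(".wav")
    if dst == src:
        # ffmpeg cannot write its output over the file it is reading
        raise ValueError("output path must differ from input path")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst),
                   check=True, capture_output=True, text=True)
    return dst
```

The returned path can then be passed to ans(...) in place of output_file.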

Reference:

ModelScope community (魔搭社区)
