pyAudioAnalysis-audioFeatureExtraction 错误纠正

这篇博客主要介绍了在使用pyAudioAnalysis库时遇到的三个TypeError错误:mfccInitFilterBanks函数传参过多、浮点数不能作为整数解释、numpy浮点数无法用作索引。解决方案包括删除多余参数、将浮点数转换为整数以及类型转换。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

1. TypeError: mfccInitFilterBanks() takes 2 positional arguments but 7 were given

The issue


In the function stFeatureSpeed()from audioFeatureExtraction.py make a call: [fbank, freqs] = mfccInitFilterBanks(Fs, nfft, lowfreq, linsc, logsc, nlinfil, nlogfil), but this function is only able to receive two arguments. It has the following signature: def mfccInitFilterBanks(fs, nfft)

The "extra" arguments are: lowfreq, linsc, logsc, nlinfil, nlogfil

These are all defined in the mfccInitFilterBanks function itself wit exactly the same values.

Note that there is a comment in the source code (audioFeatureExtraction.py) that says that the stFeatureSpeed() function is work in progress. so there could be many (non obvious) bug

# -*- coding: utf-8 -*- import numpy as np import librosa import random def extract_power(y, sr, size=3): """ extract log mel spectrogram feature :param y: the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: log-mel spectrogram feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y)) y = y * normalization_factor # random crop start = random.randint(0, len(y) - size * sr) y = y[start: start + size * sr] # extract log mel spectrogram ##### powerspec = np.abs(librosa.stft(y,n_fft=128, hop_length=1024)) ** 2 #logmelspec = librosa.power_to_db(melspectrogram) return powerspec def extract_logmel(y, sr, size=3): """ extract log mel spectrogram feature :param y: the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: log-mel spectrogram feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y)) y = y * normalization_factor # random crop start = random.randint(0, len(y) - size * sr) y = y[start: start + size * sr] # extract log mel spectrogram ##### melspectrogram = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048, hop_length=1024, n_mels=90) logmelspec = librosa.power_to_db(melspectrogram) return logmelspec def extract_mfcc(y, sr, size=3): """ extract MFCC feature :param y: np.ndarray [shape=(n,)], real-valued the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: MFCC feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y))
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值