An introduction to some pyworld APIs

nia_wish

Published 2020-08-11 10:54:26

Category: Audio processing

Article link: https://blog.youkuaiyun.com/m0_43395719/article/details/107930075

Reading the audio

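import librosa
import pyworld
# wav_path is the path to the input audio file; librosa resamples it to 16 kHz
sound, _ = librosa.load(wav_path, sr=16000)
print(f'sound.shape = {sound.shape}')  # sound.shape = (80000,), i.e. 5 seconds at 16 kHz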

Extracting the fundamental frequency (F0)

import numpy as np

sr = 16000
# pyworld expects the input as double (float64); the waveform returned by librosa.load is float32
print(f'sound.dtype = {sound.dtype}')  # sound.dtype = float32
sound = sound.astype(np.double)

# Option 1: DIO followed by StoneMask
_f0, t = pyworld.dio(sound, sr)    # raw pitch extractor
f0 = pyworld.stonemask(sound, _f0, t, sr)  # pitch refinement
# Option 2: Harvest
f0, timeaxis = pyworld.harvest(sound, sr)

print(f'f0.shape = {f0.shape}')  # f0.shape = (1001,)

How the F0 length is computed
Source code

# Python side (pyworld wrapper)
f0_length = GetSamplesForHarvest(fs, x_length, option.frame_period)
// C++ side (WORLD library)
int GetSamplesForHarvest(int fs, int x_length, double frame_period) {
  return static_cast<int>(1000.0 * x_length / fs / frame_period) + 1;
}
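
The formula above can be checked against the shape printed earlier. The following is a minimal sketch in plain Python that mirrors the C++ function; the name expected_f0_length is made up for illustration and is not part of pyworld, and it assumes the default frame_period of 5.0 ms that pyworld uses when none is passed.

```python
def expected_f0_length(fs: int, x_length: int, frame_period: float = 5.0) -> int:
    """Hypothetical helper mirroring GetSamplesForHarvest.

    fs           -- sampling rate in Hz
    x_length     -- number of samples in the waveform
    frame_period -- frame shift in milliseconds (assumed default: 5.0)
    """
    # 1000.0 * x_length / fs is the signal duration in milliseconds;
    # dividing by frame_period gives the number of whole frame shifts,
    # and the +1 accounts for the frame at time 0.
    return int(1000.0 * x_length / fs / frame_period) + 1


# 80000 samples at 16 kHz is 5000 ms; 5000 / 5 + 1 = 1001 frames
print(expected_f0_length(16000, 80000))        # 1001
# A 10 ms frame period roughly halves the number of frames
print(expected_f0_length(16000, 80000, 10.0))  # 501
```

This matches f0.shape = (1001,) for the 80000-sample input, so a larger frame_period gives a shorter F0 sequence.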