FFmpeg API: AVSampleFormat

H&A

Column: FFmpeg API in Detail (FFmpeg API 详解)

Article link: https://blog.youkuaiyun.com/qq_34305316/article/details/106355824

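The data described by an audio sample format is always in native byte order, so sample values can be represented with native C types. This is why FFmpeg has no signed 24-bit sample format, even though signed 24-bit is a very common raw audio format: C has no corresponding native type.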
The floating-point sample formats assume that full volume corresponds to the range [-1.0, 1.0]. Any value outside this range is beyond full volume.
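To make the [-1.0, 1.0] convention concrete, here is a minimal sketch of mapping a signed 16-bit sample onto the float range and of checking whether a float sample exceeds full scale. The function names s16_to_float and float_sample_is_over_full_scale are made up for this illustration and are not part of FFmpeg; dividing by 32768 is one common scaling choice, assumed here.

```c
#include <stdint.h>

/* Hypothetical helper: map a signed 16-bit sample onto the nominal
 * float range. Dividing by 32768 maps INT16_MIN to exactly -1.0f and
 * INT16_MAX to just under 1.0f. */
float s16_to_float(int16_t s)
{
    return (float)s / 32768.0f;
}

/* Hypothetical helper: report whether a float sample lies outside the
 * full-volume range [-1.0, 1.0]. */
int float_sample_is_over_full_scale(float v)
{
    return v < -1.0f || v > 1.0f;
}
```

With this scaling, a half-scale input such as 16384 becomes 0.5f, and a value like 1.5f is reported as beyond full volume.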
Layout of audio sample data

For planar sample formats, each channel has its own data buffer, usually called a plane, together with a linesize variable that gives the size of that buffer in bytes. All planes must be the same size.

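For packed sample formats, there is only one buffer, with a linesize variable giving its size. For each sample instant, the sample values of all channels are stored in the buffer one after another in channel order (interleaved).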
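The two layouts can be sketched in plain C, without FFmpeg, as follows. The names packed_index and deinterleave_s16 are hypothetical and exist only for this illustration: the first shows where a sample lives in a packed buffer, the second splits a packed 16-bit buffer into one plane per channel.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed layout: sample i of channel c sits at index i * channels + c. */
size_t packed_index(size_t sample, size_t channel, size_t channels)
{
    return sample * channels + channel;
}

/* Split one interleaved (packed) buffer into per-channel planes.
 * planes[c] must point to room for nb_samples values. */
void deinterleave_s16(const int16_t *packed, int16_t *const *planes,
                      size_t nb_samples, size_t channels)
{
    for (size_t i = 0; i < nb_samples; i++)
        for (size_t c = 0; c < channels; c++)
            planes[c][i] = packed[packed_index(i, c, channels)];
}
```

For a packed stereo buffer L0 R0 L1 R1 L2 R2, this yields a left plane L0 L1 L2 and a right plane R0 R1 R2.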

AVSampleFormat represents the sample format. As noted above, there is no 24-bit integer sample format.

enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,         ///< signed 16 bits
    AV_SAMPLE_FMT_S32,         ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,         ///< float
    AV_SAMPLE_FMT_DBL,         ///< double

    AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,        ///< float, planar
    AV_SAMPLE_FMT_DBLP,        ///< double, planar
    AV_SAMPLE_FMT_S64,         ///< signed 64 bits
    AV_SAMPLE_FMT_S64P,        ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
};

The following are some commonly used functions for working with sample formats:

// Get the string name of a sample format
const char *av_get_sample_fmt_name(enum AVSampleFormat sample_fmt);

// Get the sample format that corresponds to a string name
enum AVSampleFormat av_get_sample_fmt(const char *name);

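// Get the alternative form of the given sample format, i.e. switch between planar and packed
// The planar argument selects the form to return (non-zero for planar, 0 for packed)
// If sample_fmt is already in the requested form, it is returned unchanged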
enum AVSampleFormat av_get_alt_sample_fmt(enum AVSampleFormat sample_fmt, int planar);
enum AVSampleFormat av_get_packed_sample_fmt(enum AVSampleFormat sample_fmt);
enum AVSampleFormat av_get_planar_sample_fmt(enum AVSampleFormat sample_fmt);

// Write a description of the sample format into buf, in the form "name depth"
// If sample_fmt is negative, a header line is written instead: "name depth"
char *av_get_sample_fmt_string(char *buf, int buf_size, enum AVSampleFormat sample_fmt);

// Return the number of bytes per sample for the sample format (0 if the format is unknown)
int av_get_bytes_per_sample(enum AVSampleFormat sample_fmt);

// Check whether the sample format is planar (returns 1) or packed (returns 0)
int av_sample_fmt_is_planar(enum AVSampleFormat sample_fmt);

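// Return the number of bytes needed to store the audio data
// linesize receives the size in bytes of a single plane; it may be NULL
// Returns the required buffer size on success, or a negative error code on failure
int av_samples_get_buffer_size(int *linesize, int nb_channels, int nb_samples,
                               enum AVSampleFormat sample_fmt, int align);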
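The size calculation can be illustrated with a simplified model in plain C. This is a sketch of the arithmetic as I understand it from libavutil, not FFmpeg's source: model_buffer_size and ALIGN_UP are hypothetical names, the bytes_per_sample and planar parameters stand in for the results of av_get_bytes_per_sample() and av_sample_fmt_is_planar(), and -1 stands in for a negative error code. Check the source of your FFmpeg version for the exact behavior.

```c
#include <limits.h>
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Simplified model: returns the total number of bytes and stores the
 * size of one plane in *linesize (if not NULL). Returns -1 on invalid
 * arguments or overflow. */
int model_buffer_size(int *linesize, int nb_channels, int nb_samples,
                      int bytes_per_sample, int planar, int align)
{
    int line_size;

    if (bytes_per_sample <= 0 || nb_channels <= 0 || nb_samples <= 0 || align < 0)
        return -1;

    /* Assumed default (align == 0): pad the sample count to a multiple of 32. */
    if (align == 0) {
        if (nb_samples > INT_MAX - 31)
            return -1;
        nb_samples = ALIGN_UP(nb_samples, 32);
        align = 1;
    }

    /* Reject sizes that would not fit in an int. */
    if (nb_channels > INT_MAX / align ||
        (long long)nb_channels * nb_samples >
            (INT_MAX - align * nb_channels) / bytes_per_sample)
        return -1;

    /* Planar: one plane per channel. Packed: one plane holding all channels. */
    line_size = planar ? ALIGN_UP(nb_samples * bytes_per_sample, align)
                       : ALIGN_UP(nb_samples * bytes_per_sample * nb_channels, align);
    if (linesize != NULL)
        *linesize = line_size;
    return planar ? line_size * nb_channels : line_size;
}
```

Under this model, 1024 samples of planar stereo float (4 bytes per sample) with align 1 give a linesize of 4096 and a total of 8192 bytes, while 1000 samples of packed stereo 16-bit audio with align 1 give 4000 for both values.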

As we know, AVFrame can hold either video data or audio data.

FFmpeg also provides other data structures for holding raw audio and video data.

One example is AVPicture, which represents a single picture (note that it is marked deprecated):

typedef struct AVPicture {
    attribute_deprecated
    uint8_t *data[AV_NUM_DATA_POINTERS];    ///< pointers to the image data planes
    attribute_deprecated
    int linesize[AV_NUM_DATA_POINTERS];     ///< number of bytes per line
} AVPicture;

As another example, the objects that the av_image family of functions operate on are uint8_t *data[4] and int linesize[4].

The structure used here to hold audio is similar: several buffers, plus a single int that gives the size of those buffers.

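The differences are as follows. Video never needs more than 4 buffers, whereas audio may need more. In addition, the buffers of a video frame, that is, its data planes, may differ in size, whereas all data planes of an audio frame are always the same size, so a single int is enough to describe the plane size.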

These are the similarities and differences among AVFrame, the buffers used to hold video, and the buffers used to hold audio.

With this background, the following functions are easy to understand.

// Set up the audio_data plane pointers and linesize so that they refer to the sample data in buf
// No sample data is copied; buf must hold at least av_samples_get_buffer_size() bytes
int av_samples_fill_arrays(uint8_t **audio_data, int *linesize,
                           const uint8_t *buf,
                           int nb_channels, int nb_samples,
                           enum AVSampleFormat sample_fmt, 
                           int  align);
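The pointer setup described in the comments above can be modeled in a few lines of plain C. This is only a sketch under the assumption that the planes are laid out back to back in one contiguous buffer; point_planes_into_buffer is a hypothetical name, not an FFmpeg function.

```c
#include <stddef.h>
#include <stdint.h>

/* Point plane_ptrs[] into one contiguous buffer; no sample data is
 * copied. For packed data pass nb_planes = 1, for planar data pass
 * nb_planes = number of channels. linesize is the size of one plane
 * in bytes. */
void point_planes_into_buffer(uint8_t **plane_ptrs, uint8_t *buf,
                              int nb_planes, int linesize)
{
    for (int p = 0; p < nb_planes; p++)
        plane_ptrs[p] = buf + (size_t)p * (size_t)linesize;
}
```

After the call, plane_ptrs[0] is buf itself and each following plane starts linesize bytes after the previous one.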

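// Allocate a buffer for audio sample data
// The array that holds the plane pointers (audio_data) is allocated by the caller
// audio_data and linesize are outputs; free the buffer with av_freep(&audio_data[0])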
int av_samples_alloc(uint8_t **audio_data, int *linesize,
                     int nb_channels,
                     int nb_samples, 
                     enum AVSampleFormat sample_fmt, int align);

// Allocate a buffer for sample data
// Unlike av_samples_alloc(), this function also allocates the array of plane pointers
int av_samples_alloc_array_and_samples(uint8_t ***audio_data, 
                                       int *linesize, 
                                       int nb_channels,
                                       int nb_samples, 
                                       enum AVSampleFormat sample_fmt,
                                       int align);

// Copy sample data; dst_offset, src_offset and nb_samples are counted in samples
int av_samples_copy(uint8_t **dst, uint8_t * const *src,
                    int dst_offset,
                    int src_offset, int nb_samples, 
                    int nb_channels,
                    enum AVSampleFormat sample_fmt);
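Because the offsets are counted in samples rather than bytes, a copy within a single plane has to scale them by the sample size. The following plain C sketch illustrates that arithmetic for one plane; copy_plane_samples is a hypothetical name, and using memmove (which also tolerates overlapping ranges) is a choice made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy nb_samples samples from one plane to another. The offsets are
 * given in samples, so they are multiplied by the sample size to get
 * byte positions. */
void copy_plane_samples(uint8_t *dst, const uint8_t *src,
                        size_t dst_offset, size_t src_offset,
                        size_t nb_samples, size_t bytes_per_sample)
{
    memmove(dst + dst_offset * bytes_per_sample,
            src + src_offset * bytes_per_sample,
            nb_samples * bytes_per_sample);
}
```

For a packed buffer the same idea applies with the sample size multiplied by the channel count, since one sample instant then covers all channels.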

// Fill the buffer with silence; offset and nb_samples are counted in samples
int av_samples_set_silence(uint8_t **audio_data, int offset,
                           int nb_samples,
                           int nb_channels,
                           enum AVSampleFormat sample_fmt);
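What counts as silence depends on the sample format. Unsigned 8-bit audio is centered on 128, so its silent value is 0x80, while signed integer and floating-point formats are silent at zero. The sketch below shows this for a single plane in plain C; fill_plane_with_silence is a hypothetical name, and the is_unsigned_8bit flag stands in for checking whether the format is AV_SAMPLE_FMT_U8 or AV_SAMPLE_FMT_U8P.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill one plane with the silent value for its format: 0x80 for
 * unsigned 8-bit samples, all-zero bytes for signed integer and
 * floating-point samples. */
void fill_plane_with_silence(uint8_t *plane, size_t nb_bytes, int is_unsigned_8bit)
{
    memset(plane, is_unsigned_8bit ? 0x80 : 0x00, nb_bytes);
}
```

Zero-filling works for float and double planes because an all-zero bit pattern is 0.0 in IEEE 754.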