FFmpeg Notes: Using swr_convert to Convert AV_SAMPLE_FMT_S16 to AV_SAMPLE_FMT_FLTP

代号95+27 (pinned post)

License: CC 4.0 BY-SA

Categories: FFmpeg Android. Tags: FFmpeg AV_SAMPLE_FMT_S16 AV_SAMPLE_FMT_FLTP

Article link: https://blog.youkuaiyun.com/XIAIBIANCHENG/article/details/72810495

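On Android, AudioRecord captures audio as ENCODING_PCM_16BIT, and the captured data is then encoded into an AAC file with ffmpeg. The AAC encoder in recent ffmpeg versions requires sample_fmt to be AV_SAMPLE_FMT_FLTP, whereas ENCODING_PCM_16BIT corresponds to AV_SAMPLE_FMT_S16. For mono audio the two differ little: both keep all samples in AVFrame->data[0], except that AV_SAMPLE_FMT_FLTP stores each sample as a floating-point number and AV_SAMPLE_FMT_S16 stores it as a signed 16-bit integer. For stereo audio the difference is much larger.

[Figure: memory layout of AV_SAMPLE_FMT_S16]

[Figure: memory layout of AV_SAMPLE_FMT_FLTP]

AV_SAMPLE_FMT_S16 is a packed (non-planar) format: the left and right channels are interleaved as LRLR.... in one contiguous array. AV_SAMPLE_FMT_FLTP is a planar format: the left and right channels are stored separately, in two arrays.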
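
To make the two layouts concrete, here is a minimal C sketch that splits interleaved S16 stereo samples into two float planes by hand. It only illustrates the memory layout. The function name `s16_to_fltp_stereo` is hypothetical, and the 1/32768 scale factor is an assumption about how the 16-bit range maps to [-1.0, 1.0), not a description of what libswresample does internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split packed S16 stereo (L R L R ...) into two
 * float planes, one holding all left samples and one all right samples.
 * nb_samples is the number of samples per channel. */
static void s16_to_fltp_stereo(const int16_t *in, size_t nb_samples,
                               float *left, float *right)
{
    const float scale = 1.0f / 32768.0f; /* assumed mapping to [-1.0, 1.0) */
    for (size_t i = 0; i < nb_samples; i++) {
        left[i]  = (float)in[2 * i]     * scale; /* even index: left  */
        right[i] = (float)in[2 * i + 1] * scale; /* odd index:  right */
    }
}
```

In the planar result, `left` plays the role of AVFrame->data[0] and `right` the role of AVFrame->data[1].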

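How, then, can audio that Android records as CHANNEL_IN_STEREO, ENCODING_PCM_16BIT (AV_SAMPLE_FMT_S16) be encoded by ffmpeg into an AAC file as AV_SAMPLE_FMT_FLTP? ffmpeg already provides an interface for this: the swr_convert function. The conversion goes as follows: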

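Initialize AudioRecord on the Android side:

minBufferSize = 4096;
audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize);

minBufferSize is set to 4096 because the frame_size of the ffmpeg AAC encoder is 1024 by default, so each read has to deliver at least 1024 samples per channel. Each sampling step captures two channels, and each channel's sample takes two bytes, so 1024 steps amount to 1024*2*2=4096 bytes. I overlooked this at first, and the output stayed wrong until I fixed it.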
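
The byte arithmetic above can be written down as a small C sketch. The helper name `pcm_bytes_per_read` is hypothetical; it assumes packed PCM in which every sample of every channel occupies `bytes_per_sample` bytes.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed to hold nb_samples sample frames of
 * packed PCM. One sample frame holds one sample for every channel. */
static size_t pcm_bytes_per_read(size_t nb_samples, size_t channels,
                                 size_t bytes_per_sample)
{
    return nb_samples * channels * bytes_per_sample;
}
```

For 1024 samples of 16-bit stereo this gives 4096 bytes, and for 16-bit mono it gives 2048 bytes, which are the two buffer sizes used in this article.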

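swr_convert is declared as follows:

int swr_convert(struct SwrContext *s,
                uint8_t **out,
                int out_count,
                const uint8_t **in,
                int in_count);

The first parameter is the SwrContext. The second, out, is where the output is stored (one buffer per channel for planar formats). The third, out_count, is the space available in out, counted in samples per channel rather than bytes. The fourth, in, is the input data to be converted. The fifth, in_count, is the number of input samples available per channel. The return value is the number of samples output per channel, or a negative value on error.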
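
Because both counts are expressed in samples per channel, a byte length has to be converted before it is passed to swr_convert. The following is a minimal sketch of that conversion; `samples_per_channel` is a hypothetical helper, not part of FFmpeg, and it assumes packed input PCM.

```c
/* Hypothetical helper: number of samples per channel contained in a
 * packed PCM buffer of nb_bytes bytes. Returns -1 if the arguments are
 * invalid or the buffer does not hold a whole number of sample frames. */
static int samples_per_channel(int nb_bytes, int channels, int bytes_per_sample)
{
    int frame_bytes = channels * bytes_per_sample; /* bytes per sample frame */
    if (channels <= 0 || bytes_per_sample <= 0 || nb_bytes < 0)
        return -1;
    if (nb_bytes % frame_bytes != 0)
        return -1;
    return nb_bytes / frame_bytes;
}
```

A 4096-byte stereo S16 buffer and a 2048-byte mono S16 buffer both yield 1024, which is the value to use for in_count in the cases discussed here.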

Initializing the SwrContext:

SwrContext *swr;

swr = swr_alloc();
// Option names as used with the FFmpeg version of this article; newer releases use "in_chlayout"/"out_chlayout".
av_opt_set_int(swr, "in_channel_layout",  AV_CH_LAYOUT_STEREO, 0);
av_opt_set_int(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
av_opt_set_int(swr, "in_sample_rate",     16000, 0);
av_opt_set_int(swr, "out_sample_rate",    16000, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_S16,  0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
if (swr_init(swr) < 0) {
    swr_free(&swr); // initialization failed; swr must not be used afterwards
}

Define the arrays that hold the output data:

uint8_t *outs[2];
outs[0] = (uint8_t *)malloc(len); // len is 4096: 1024 float samples * 4 bytes per channel
outs[1] = (uint8_t *)malloc(len);
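
A minimal sketch of sizing the output planes from the sample count instead of a hard-coded byte length, with the allocation checked. It assumes float output with one plane per channel. `alloc_float_planes` and `free_planes` are hypothetical helpers, not FFmpeg functions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one float plane per channel, each large
 * enough for nb_samples samples. Returns 0 on success, -1 on failure
 * (in which case nothing stays allocated). */
static int alloc_float_planes(uint8_t **planes, int channels, size_t nb_samples)
{
    for (int ch = 0; ch < channels; ch++) {
        planes[ch] = malloc(nb_samples * sizeof(float));
        if (!planes[ch]) {
            while (ch-- > 0) {      /* roll back planes allocated so far */
                free(planes[ch]);
                planes[ch] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Hypothetical helper: release planes obtained from alloc_float_planes. */
static void free_planes(uint8_t **planes, int channels)
{
    for (int ch = 0; ch < channels; ch++) {
        free(planes[ch]);
        planes[ch] = NULL;
    }
}
```

With 2 channels and 1024 samples, each plane comes out at 4096 bytes on platforms where a float is 4 bytes, matching the len used above.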

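Perform the conversion:

jbyte *pcm = (*env)->GetByteArrayElements(env, pcmData, NULL); // pcmData: PCM passed down from Android, 4096 bytes = 1024 stereo samples
const uint8_t *srcBuf = (const uint8_t *)pcm;
int count = swr_convert(swr, outs, len / 4, &srcBuf, len / 4); // len is 4096; both counts are samples per channel, not bytes
(*env)->ReleaseByteArrayElements(env, pcmData, pcm, JNI_ABORT); // input was only read, nothing to copy back
audioFrame->data[0] = outs[0]; // audioFrame is an AVFrame; data[0] holds the left channel
audioFrame->data[1] = outs[1]; // data[1] holds the right channel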

For mono conversion:

AudioRecord is initialized as follows:

minBufferSize = 2048; // 1024*2
audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize);

Initializing the SwrContext:

SwrContext *swr;

swr = swr_alloc();
av_opt_set_int(swr, "in_channel_layout",  AV_CH_LAYOUT_MONO, 0);
av_opt_set_int(swr, "out_channel_layout", AV_CH_LAYOUT_MONO, 0);
av_opt_set_int(swr, "in_sample_rate",     16000, 0);
av_opt_set_int(swr, "out_sample_rate",    16000, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_S16,  0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
if (swr_init(swr) < 0) {
    swr_free(&swr); // initialization failed; swr must not be used afterwards
}

Define the array that holds the output data:

uint8_t *outs[1];
outs[0] = (uint8_t *)malloc(len * 2); // len is 2048: 1024 float samples * 4 bytes = 4096 bytes

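Perform the conversion:

jbyte *pcm = (*env)->GetByteArrayElements(env, pcmData, NULL); // pcmData: PCM passed down from Android, 2048 bytes = 1024 mono samples
const uint8_t *srcBuf = (const uint8_t *)pcm;
int count = swr_convert(swr, outs, len / 2, &srcBuf, len / 2); // len is 2048; both counts are samples per channel, not bytes
(*env)->ReleaseByteArrayElements(env, pcmData, pcm, JNI_ABORT); // input was only read, nothing to copy back
audioFrame->data[0] = outs[0]; // audioFrame is an AVFrame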
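
For mono, the packed and planar layouts coincide, so the conversion only changes the sample type, and the output needs twice as many bytes as the input. The sketch below shows this by hand. `s16_to_flt_mono` is a hypothetical name, and the 1/32768 scale factor is an assumption about the value mapping rather than a description of libswresample internals.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert mono S16 samples to float in a single
 * plane. Returns the number of bytes written to out. */
static size_t s16_to_flt_mono(const int16_t *in, size_t nb_samples, float *out)
{
    const float scale = 1.0f / 32768.0f; /* assumed mapping to [-1.0, 1.0) */
    for (size_t i = 0; i < nb_samples; i++)
        out[i] = (float)in[i] * scale;
    return nb_samples * sizeof(float);
}
```

On platforms where a float is 4 bytes, 1024 input samples (2048 bytes) produce 4096 output bytes, which is why the mono output buffer is allocated as len*2.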

Conversion example:

https://github.com/XIAIBIANCHENG/OggRecord/tree/master