Audio/Video Development 17: FFmpeg Audio Decoding - Decoding AAC to PCM

hunandede

Last modified 2024-07-16 11:43:42

License: CC 4.0 BY-SA

Category: FFmpeg basics. Tags: audio/video, ffmpeg, aac

First published 2024-06-03 19:48:40

Article link: https://blog.youkuaiyun.com/hunandede/article/details/139410813

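This section follows on from "Audio/Video Development 12: FFmpeg Demuxing in Detail". There we demuxed an MP4, FLV, or TS file and obtained H.264 video and AAC audio. The next step is to process the H.264 and AAC data; this section deals with AAC.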

Audio decoding process

FFmpeg decoding flow

Key functions

1. Find the desired decoder

const AVCodec *avcodec_find_decoder(enum AVCodecID id);

This function looks up the AVCodec that matches a given AVCodecID.

/**
 * Find a registered decoder with a matching codec ID.
 *
 * @param id AVCodecID of the requested decoder
 * @return A decoder if one was found, NULL otherwise.
 */
const AVCodec *avcodec_find_decoder(enum AVCodecID id);


An AVCodecID identifies a codec, whether it is used as a decoder or as an encoder.

/* Abridged excerpt; elided entries are marked with a comment. */
enum AVCodecID {
    AV_CODEC_ID_NONE,
    /* ... */

    /* video codecs */
    AV_CODEC_ID_H264,
    /* ... */

    /* audio codecs */
    AV_CODEC_ID_MP3,
    AV_CODEC_ID_AAC,
    /* ... */

    /* pcm codecs */
    AV_CODEC_ID_PCM_U16LE,

    /* subtitle codecs */
    AV_CODEC_ID_DVD_SUBTITLE,
};

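There is a catch, though: when we parse a file we usually do not know in advance which codecs its audio and video use, so we do not know which decoder to ask for. There are two reasonable ways to handle this.

The first way: during demuxing we already called avformat_find_stream_info to obtain detailed information about the file. From the AVFormatContext we can get each AVStream, from the AVStream we can get the codec ID, and from the codec ID we can get the AVCodec.

The codec ID alone does not tell us which stream is audio and which is video, so one more check is needed (for example on stream->codecpar->codec_type):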


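    for (unsigned int i = 0; i < ifmt_ctx->nb_streams; i++) {
        AVStream *stream = ifmt_ctx->streams[i];
        /* codec_type tells audio streams apart from video streams */
        if (stream->codecpar->codec_type != AVMEDIA_TYPE_AUDIO)
            continue;
        const AVCodec *dec = avcodec_find_decoder(stream->codecpar->codec_id);
        if (!dec)
            continue; /* this build has no decoder for the codec ID */

        /* ... */
    }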

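The second way: call av_find_best_stream to pick the best stream of a given type from the AVFormatContext. If you pass in the address of an AVCodec pointer, the function fills it with the matching decoder when it succeeds.

Note that the decoder (AVCodec) is returned through a pointer argument, not as the return value.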

int av_find_best_stream(AVFormatContext *ic,
                        enum AVMediaType type,
                        int wanted_stream_nb,
                        int related_stream,
                        const struct AVCodec **decoder_ret,
                        int flags);

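Parameters:
ic: pointer to the AVFormatContext of the input media file.
type: the media type to look for, such as audio, video, or subtitle (for example AVMEDIA_TYPE_AUDIO).
wanted_stream_nb: the stream index requested by the caller, or -1 for automatic selection.
related_stream: index of a stream the result should be related to (for example, in the same program), or -1 if there is none.
decoder_ret: if not NULL, receives the decoder for the selected stream.
flags: flags for the search; pass 0, since none are currently defined.
Return value:
The index of the best matching stream on success. On failure, AVERROR_STREAM_NOT_FOUND if no stream of the requested type exists, or AVERROR_DECODER_NOT_FOUND if such streams exist but no decoder was found for them.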


 * @return  the non-negative stream number in case of success,
 *          AVERROR_STREAM_NOT_FOUND if no stream with the requested type
 *          could be found,
 *          AVERROR_DECODER_NOT_FOUND if streams were found but no decoder
 *
 * @note  If av_find_best_stream returns successfully and decoder_ret is not
 *        NULL, then *decoder_ret is guaranteed to be set to a valid AVCodec.

• avcodec_find_decoder_by_name(): looks up a decoder by its name. This raises a question: where does the name come from?

On Windows, run ffmpeg -h in cmd and you will see:

Print help / information / capabilities:
-L                  show license
-h <topic>          show help
-version            show version
-muxers             show available muxers
-demuxers           show available demuxers
-devices            show available devices
-decoders           show available decoders
-encoders           show available encoders
-filters            show available filters
-pix_fmts           show available pixel formats
-layouts            show standard channel layouts
-sample_fmts        show available audio sample formats

We are looking for decoders, so ffmpeg -decoders lists all of them. To make searching easier, the output can be saved to a text file:

ffmpeg -decoders > a.txt

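a.txt now shows the names of the decoders supported by the current FFmpeg build. In the listing below, 012v and 4xm are names of video decoders. Searching for a keyword such as aac or h264 is quicker than reading the whole list.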

Decoders:
 V..... = Video
 A..... = Audio
 S..... = Subtitle
 .F.... = Frame-level multithreading
 ..S... = Slice-level multithreading
 ...X.. = Codec is experimental
 ....B. = Supports draw_horiz_band
 .....D = Supports direct rendering method 1
 ------
 V....D 012v                 Uncompressed 4:2:2 10-bit
 V....D 4xm                  4X Movie
 V....D 8bps                 QuickTime 8BPS video
...................
 A....D aac                  AAC (Advanced Audio Coding)
 A....D aac_fixed            AAC (Advanced Audio Coding) (codec aac)
 A....D libfdk_aac           Fraunhofer FDK AAC (codec aac)
 A....D aac_latm             AAC LATM (Advanced Audio Coding LATM syntax)
.................

 V....D h261                 H.261
 V...BD h263                 H.263 / H.263-1996, H.263+ / H.263-1998 / H.263 version 2
 V...BD h263i                Intel H.263
 V...BD h263p                H.263 / H.263-1996, H.263+ / H.263-1998 / H.263 version 2
 VFS..D h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
 VFS..D hap                  Vidvox Hap
 VF...D hdr                  HDR (Radiance RGBE format) image

/**
 * Find a registered decoder with the specified name.
 *
 * @param name name of the requested decoder
 * @return A decoder if one was found, NULL otherwise.
 */
const AVCodec *avcodec_find_decoder_by_name(const char *name);

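At this point we have a decoder (AVCodec). A decoder alone is not enough, though: we also need a decoder context. Here is why.

Suppose a file contains three video streams and three audio streams, and two of the video streams are H.264. If the per-stream data were stored in the decoder itself, decoding several streams at once would cause conflicts. That is why a separate AVCodecContext exists.

In other words, in FFmpeg an AVCodec generally holds the methods, while an AVCodecContext holds the concrete data for one use of that AVCodec. FFmpeg follows this style throughout: an xxxContext stores the data that belongs to xxx, and xxx itself holds the methods and other static information.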
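The split between shared methods and per-stream data can be illustrated without FFmpeg at all. The toy types below (ToyCodec, ToyContext, toy_decode) are hypothetical and far simpler than the real structs; they only show why two streams that use the same codec need two contexts.

```c
#include <stdio.h>

typedef struct ToyContext ToyContext;

/* Shared, read-only description: a name plus behavior (like AVCodec). */
typedef struct ToyCodec {
    const char *name;
    int (*decode)(ToyContext *ctx, int packet_size);
} ToyCodec;

/* Mutable state for one stream (like AVCodecContext). */
struct ToyContext {
    const ToyCodec *codec;
    int  frames_decoded;
    long bytes_consumed;
};

/* Touches only the context passed in, never the shared codec. */
static int toy_decode(ToyContext *ctx, int packet_size)
{
    ctx->frames_decoded += 1;
    ctx->bytes_consumed += packet_size;
    return 0;
}

static const ToyCodec toy_h264 = { "toy_h264", toy_decode };
```

Two ToyContext values can point at the same toy_h264 and be decoded in any order without disturbing each other's counters, which mirrors the case of two H.264 streams in one file.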

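Let's look at struct AVCodec in detail.

Its fields mostly describe what the codec supports:
the supported_framerates array of frame rates the codec supports;
the pix_fmts array of enum AVPixelFormat values the codec supports;
the supported_samplerates array of sample rates the codec supports;
the sample_fmts array of enum AVSampleFormat values the codec supports.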



typedef struct AVCodec {
    /**
     * Name of the codec implementation.
     * The name is globally unique among encoders and among decoders (but an
     * encoder and a decoder can share the same name).
     * This is the primary way to find a codec from the user perspective.
     */
    const char *name;
    /**
     * Descriptive name for the codec, meant to be more human readable than name.
     * You should use the NULL_IF_CONFIG_SMALL() macro to define it.
     */
    const char *long_name;
    enum AVMediaType type;
    enum AVCodecID id;
    /**
     * Codec capabilities.
     * see AV_CODEC_CAP_*
     */
    int capabilities;
    uint8_t max_lowres;                     ///< maximum value for lowres supported by the decoder
    const AVRational *supported_framerates; ///< array of supported framerates, or NULL if any, array is terminated by {0,0}
    const enum AVPixelFormat *pix_fmts;     ///< array of supported pixel formats, or NULL if unknown, array is terminated by -1
    const int *supported_samplerates;       ///< array of supported audio samplerates, or NULL if unknown, array is terminated by 0
    const enum AVSampleFormat *sample_fmts; ///< array of supported sample formats, or NULL if unknown, array is terminated by -1
    const AVClass *priv_class;              ///< AVClass for the private context
    const AVProfile *profiles;              ///< array of recognized profiles, or NULL if unknown, array is terminated by {AV_PROFILE_UNKNOWN}

    /**
     * Group name of the codec implementation.
     * This is a short symbolic name of the wrapper backing this codec. A
     * wrapper uses some kind of external implementation for the codec, such
     * as an external library, or a codec implementation provided by the OS or
     * the hardware.
     * If this field is NULL, this is a builtin, libavcodec native codec.
     * If non-NULL, this will be the suffix in AVCodec.name in most cases
     * (usually AVCodec.name will be of the form "<codec_name>_<wrapper_name>").
     */
    const char *wrapper_name;

    /**
     * Array of supported channel layouts, terminated with a zeroed layout.
     */
    const AVChannelLayout *ch_layouts;
} AVCodec;

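Now look at AVCodecContext. It stores the concrete data for the AVCodec currently in use. The struct has far too many fields to list here, so it is best read directly in the source.

2. Allocate a decoder context for the decoder and initialize it with default values. Note that at this point the context does not yet hold the stream's actual parameters.

Let's debug here and see what the AVCodecContext contains at this point.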

AVCodecContext *avcodec_alloc_context3(const AVCodec *codec);

/**
 * Allocate an AVCodecContext and set its fields to default values. The
 * resulting struct should be freed with avcodec_free_context().
 *
 * @param codec if non-NULL, allocate private data and