MP4 File Format Explained (ISO-14496-12/14)
Author:Pirate Leo
Email:codeevoship@gmail.com
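I. Basic Concepts
1. File: made up of many Boxes and FullBoxes.
2. Box: every Box consists of a Header and Data.
3. FullBox: an extension of Box that adds an 8-bit version and 24-bit flags to the Header.
4. Header: holds the total length of the Box (size, which counts the Header itself) and its type (type). size == 0 means this is the last Box in the file and it extends to the end of the file. size == 1 means the length needs more bits than the 32-bit field provides, so a 64-bit largesize field follows and carries the real length. type == 'uuid' means the Box carries a user-defined extended type, identified by a 16-byte usertype.
5. Data: the actual payload of the Box. It can be raw data or a sequence of child Boxes.
6. A Box whose Data is a sequence of child Boxes is called a Container Box.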
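As a small illustration of item 3, the sketch below splits the four extra FullBox bytes into version and flags. The function name `parse_fullbox_prefix` is made up for this example, and it assumes the caller has already skipped the ordinary Box header.

```python
import struct

def parse_fullbox_prefix(payload: bytes):
    """Split the first 4 bytes after the Box header into (version, flags)."""
    if len(payload) < 4:
        raise ValueError("a FullBox needs at least 4 bytes after the Box header")
    (word,) = struct.unpack(">I", payload[:4])  # big-endian 32-bit word
    version = word >> 24           # top 8 bits
    flags = word & 0x00FFFFFF      # low 24 bits
    return version, flags

# version = 1, flags = 0x000001
print(parse_fullbox_prefix(bytes([1, 0, 0, 1])))  # (1, 1)
```

Reading the four bytes as one big-endian word and masking is only one way to do it; indexing `payload[0]` and `int.from_bytes(payload[1:4], "big")` gives the same result.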
The structure of a Box, expressed in pseudocode:
aligned(8) class Box (unsigned int(32) boxtype, optional unsigned int(8)[16] extended_type)
{
    unsigned int(32) size;
    unsigned int(32) type = boxtype;
    if (size==1)
    {
        unsigned int(64) largesize;
    }
    else if (size==0)
    {
        // box extends to end of file
    }
    if (boxtype=='uuid')
    {
        unsigned int(8)[16] usertype = extended_type;
    }
}
The structure is shown in the figure below:

Figure: basic file structure
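The header rules in the pseudocode can be sketched in Python as follows. This is a minimal reading of the size, largesize, and uuid cases only; the function name `read_box_header` and its return shape are assumptions made for the example, not part of the standard.

```python
import struct

def read_box_header(buf: bytes, offset: int = 0):
    """Return (box_type, header_len, box_len) for the box starting at offset.

    box_len is None when size == 0, i.e. the box runs to the end of the file.
    """
    size, raw_type = struct.unpack_from(">I4s", buf, offset)
    header_len = 8
    if size == 1:
        # The 32-bit size is a marker; the real length is the 64-bit largesize.
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_len += 8
    elif size == 0:
        size = None
    if raw_type == b"uuid":
        header_len += 16  # a 16-byte usertype follows the size fields
    return raw_type.decode("latin-1"), header_len, size

# A 16-byte 'free' box: 8-byte header plus 8 bytes of padding.
sample = struct.pack(">I4s", 16, b"free") + bytes(8)
print(read_box_header(sample))
```

Box types are decoded as Latin-1 here so that any four bytes map to a string without raising; real type codes are normally printable ASCII.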
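II. MP4 File Format (ISO-14496-12/14)
MP4 File Overview
An MP4 file is built from many kinds of Boxes. The table below lists all mandatory and optional Box types; √ marks a Box that is mandatory (for a nested Box, mandatory within its parent container).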

Full list:
| Box | Mandatory | Description |
| --- | --- | --- |
| ftyp | √ | file type and compatibility |
| pdin | | progressive download information |
| moov | √ | container for all the metadata |
| mvhd | √ | movie header, overall declarations |
| trak | √ | container for an individual track or stream |
| tkhd | √ | track header, overall information about the track |
| tref | | track reference container |
| edts | | edit list container |
| elst | | an edit list |
| mdia | √ | container for the media information in a track |
| mdhd | √ | media header, overall information about the media |
| hdlr | √ | handler, declares the media (handler) type |
| minf | √ | media information container |
| vmhd | | video media header, overall information (video track only) |
| smhd | | sound media header, overall information (sound track only) |
| hmhd | | hint media header, overall information (hint track only) |
| nmhd | | Null media header, overall information (some tracks only) |
| dinf | √ | data information box, container |
| dref | √ | data reference box, declares source(s) of media data in track |
| stbl | √ | sample table box, container for the time/space map |
| stsd | √ | sample descriptions (codec types, initialization etc.) |
| stts | √ | (decoding) time-to-sample |
| ctts | | (composition) time to sample |
| stsc | √ | sample-to-chunk, partial data-offset information |
| stsz | | sample sizes (framing) |
| stz2 | | compact sample sizes (framing) |
| stco | √ | chunk offset, partial data-offset information |
| co64 | | 64-bit chunk offset |
| stss | | sync sample table (random access points) |
| stsh | | shadow sync sample table |
| padb | | sample padding bits |
| stdp | | sample degradation priority |
| sdtp | | independent and disposable samples |
| sbgp | | sample-to-group |
| sgpd | | sample group description |
| subs | | sub-sample information |
| mvex | | movie extends box |
| mehd | | movie extends header box |
| trex | √ | track extends defaults |
| ipmc | | IPMP Control Box |
| moof | | movie fragment |
| mfhd | √ | movie fragment header |
| traf | | track fragment |
| tfhd | √ | track fragment header |
| trun | | track fragment run |
| sdtp | | independent and disposable samples |
| sbgp | | sample-to-group |
| subs | | sub-sample information |
| mfra | | movie fragment random access |
| tfra | | track fragment random access |
| mfro | √ | movie fragment random access offset |
| mdat | | media data container |
| free | | free space |
| skip | | free space |
| udta | | user-data |
| cprt | | copyright etc. |
| meta | | metadata |
| hdlr | √ | handler, declares the metadata (handler) type |
| dinf | | data information box, container |
| dref | | data reference box, declares source(s) of metadata items |
| ipmc | | IPMP Control Box |
| iloc | | item location |
| ipro | | item protection |
| sinf | | protection scheme information box |
| frma | | original format box |
| imif | | IPMP Information box |
| schm | | scheme type box |
| schi | | scheme information box |
| iinf | | item information |
| xml | | XML container |
| bxml | | binary XML container |
| pitm | | primary item reference |
| fiin | | file delivery item information |
| paen | | partition entry |
| fpar | | file partition |
| fecr | | FEC reservoir |
| segr | | file delivery session group |
| gitn | | group id to name |
| tsel | | track selection |
| meco | | additional metadata container |
| mere | | metabox relation |
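To see how these Boxes nest in a real file, one option is to walk the Box tree recursively. The sketch below is only an illustration: the `CONTAINERS` set is a hand-picked subset of the container types in the list, `walk_boxes` and `box` are names invented for this example, and the demo bytes are synthetic rather than a playable file.

```python
import struct

# Containers whose payload is a plain sequence of child boxes.
# 'meta' is left out on purpose: it is a FullBox, so its children
# start 4 bytes later and would need extra handling.
CONTAINERS = {"moov", "trak", "edts", "mdia", "minf", "dinf", "stbl",
              "mvex", "moof", "traf", "mfra", "udta"}

def walk_boxes(buf, start=0, end=None, depth=0):
    """Yield (depth, type, offset, size) for every box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            if pos + 16 > end:
                break  # truncated largesize field
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            size = end - pos  # box extends to the end of the enclosing range
        if size < header or pos + size > end:
            break  # malformed or truncated box; stop rather than guess
        box_type = raw_type.decode("latin-1")
        yield depth, box_type, pos, size
        if box_type in CONTAINERS:
            yield from walk_boxes(buf, pos + header, pos + size, depth + 1)
        pos += size

def box(box_type, payload=b""):
    """Hypothetical helper: build a box with a 32-bit size for the demo."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic demo data: the payloads are zero-filled placeholders.
demo = (box(b"ftyp", b"isom" + bytes(4))
        + box(b"moov", box(b"mvhd", bytes(100))
              + box(b"trak", box(b"tkhd", bytes(84)))))

for depth, box_type, offset, size in walk_boxes(demo):
    print("  " * depth + f"{box_type} offset={offset} size={size}")
```

For a file on disk, reading it with `open(path, "rb").read()` and passing the bytes to `walk_boxes` works for small files; large files would call for seeking instead of loading everything into memory.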
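Before going into detail, here is a high-level look at the most important parts of the file, so that you have a map in mind for what follows:
1. ftyp box: located at the start of the file; describes the file's version, the specifications it is compatible with, and so on.
2. moov box: contains no media data itself, but holds the overall description of all the media data in the file. Under moov are the mvhd and trak boxes.
>> mvhd records the creation time, modification time, timescale, playable duration, and similar information.
>> The child boxes of trak describe each media track in detail.
3. moof box: the description of a movie fragment. It is not a mandatory part of an MP4 file, but in the MP4 files commonly used for online playback (for example, ismv files in Silverlight Smooth Streaming) it is the most important part.
4. mdat box: the actual media data. Everything we eventually decode and play lives here.
5. mfra box: usually at the end of the file; it is the index of the media, and looking it up lets a player jump directly to the media data for a given point in time.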
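The ftyp and mvhd fields mentioned in items 1 and 2 can be read as sketched below. The field layout follows the FullBox version rules (32-bit times and duration for version 0, 64-bit for version 1), with times counted in seconds from 1904-01-01 UTC. The function names and the returned dictionaries are assumptions made for this example, and each function expects the bytes that come after the 8-byte Box header.

```python
import struct
from datetime import datetime, timedelta, timezone

MP4_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)

def parse_ftyp(payload: bytes):
    """payload: the bytes after the 8-byte Box header of an ftyp box."""
    major, minor = struct.unpack_from(">4sI", payload, 0)
    brands = [payload[i:i + 4].decode("latin-1")
              for i in range(8, len(payload) - 3, 4)]
    return {"major_brand": major.decode("latin-1"),
            "minor_version": minor,
            "compatible_brands": brands}

def parse_mvhd(payload: bytes):
    """payload: the bytes after the 8-byte Box header of an mvhd box."""
    version = payload[0]  # FullBox: 1 byte version, then 3 bytes of flags
    if version == 1:
        created, modified, timescale, duration = struct.unpack_from(">QQIQ", payload, 4)
    else:
        created, modified, timescale, duration = struct.unpack_from(">IIII", payload, 4)
    return {
        "created": MP4_EPOCH + timedelta(seconds=created),
        "modified": MP4_EPOCH + timedelta(seconds=modified),
        "timescale": timescale,  # time units per second
        "duration_seconds": duration / timescale if timescale else None,
    }

# Synthetic mvhd payload: version 0, timescale 1000, duration 5000 units.
mvhd = bytes(4) + struct.pack(">IIII", 0, 0, 1000, 5000)
print(parse_mvhd(mvhd)["duration_seconds"])  # 5.0
print(parse_ftyp(b"isom" + struct.pack(">I", 512) + b"isomiso2"))
```

Dividing duration by timescale is what turns the stored value into seconds: a duration of 5000 with a timescale of 1000 is 5 seconds.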

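Appendix: the structure of an ismv file in Smooth Streaming. The file is divided into multiple Fragments, and each Fragment contains a moof and an mdat. This layout suits progressive playback: each mdat is transmitted together with its description, and as soon as a whole Fragment has arrived, the mdat in it can be played.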
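The Fragment layout described here can be sketched by scanning the top-level Boxes and pairing each moof with the mdat that follows it. This is a simplification that assumes every Fragment is exactly one moof followed by one mdat; `top_level_boxes`, `list_fragments`, and `box` are names invented for the example, and the demo bytes are synthetic.

```python
import struct

def top_level_boxes(buf):
    """Yield (type, offset, size) for each top-level box in buf."""
    pos = 0
    while pos + 8 <= len(buf):
        size, raw_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            if pos + 16 > len(buf):
                break  # truncated largesize field
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            size = len(buf) - pos  # last box, runs to end of file
        if size < header or pos + size > len(buf):
            break  # malformed or truncated box
        yield raw_type.decode("latin-1"), pos, size
        pos += size

def list_fragments(buf):
    """Pair each moof with the next mdat; returns (offset, size) tuples."""
    fragments, pending_moof = [], None
    for box_type, offset, size in top_level_boxes(buf):
        if box_type == "moof":
            pending_moof = (offset, size)
        elif box_type == "mdat" and pending_moof is not None:
            fragments.append({"moof": pending_moof, "mdat": (offset, size)})
            pending_moof = None
    return fragments

def box(box_type, payload=b""):
    """Hypothetical helper: build a box with a 32-bit size for the demo."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic layout: ftyp, moov, two fragments, mfra.
demo = (box(b"ftyp", b"isom" + bytes(4)) + box(b"moov")
        + box(b"moof", box(b"mfhd", bytes(8))) + box(b"mdat", b"abcd")
        + box(b"moof", box(b"mfhd", bytes(8))) + box(b"mdat", b"efghij")
        + box(b"mfra"))

for fragment in list_fragments(demo):
    print(fragment)
```

A player working progressively would do the same pairing on the fly: once the bytes for a moof and its mdat have both arrived, that Fragment is ready to decode.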



