FLV
Viewed as a whole, an FLV file consists of an flv header followed by an flv file body.
| flv header |
|---|
| PreviousTagSize |
| Tag 1 |
| PreviousTagSize |
| … |
| Tag n |
| PreviousTagSize |
flv header
| Field | Type | Comment |
|---|---|---|
| Signature | UI8 | Signature byte always ‘F’ (0x46) |
| Signature | UI8 | Signature byte always ‘L’ (0x4C) |
| Signature | UI8 | Signature byte always ‘V’ (0x56) |
| Version | UI8 | File version (for example, 0x01 for FLV version 1) |
| TypeFlagsReserved | UB[5] | Shall be 0 |
| TypeFlagsAudio | UB[1] | 1 = Audio tags are present |
| TypeFlagsReserved | UB[1] | Shall be 0 |
| TypeFlagsVideo | UB[1] | 1 = Video tags are present |
| DataOffset | UI32 | The length of this header in bytes |
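The flv header is normally 9 bytes long. The first three bytes are the file signature: 'F', 'L', 'V'. The Version field holds the version number. In the fifth byte (TypeFlags), bit 0 and bit 2, counted from the least significant bit, indicate whether the file contains video and audio respectively. The last four bytes give the length of the flv header; for version 1 this is always 9, and other versions may carry extension data.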
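The header layout above can be sketched as a small parser. This is a minimal illustration, not a reference implementation; the function name `parse_flv_header` and the dictionary keys are hypothetical names chosen for this example.

```python
import struct


def parse_flv_header(buf: bytes) -> dict:
    """Parse the 9-byte FLV header (hypothetical helper for illustration)."""
    if len(buf) < 9:
        raise ValueError("an FLV header needs at least 9 bytes")
    # 3-byte signature, UI8 version, UI8 type flags, UI32 DataOffset (big-endian)
    signature, version, flags, data_offset = struct.unpack(">3sBBI", buf[:9])
    if signature != b"FLV":
        raise ValueError("missing 'FLV' signature")
    return {
        "version": version,
        "has_video": bool(flags & 0x01),  # bit 0: TypeFlagsVideo
        "has_audio": bool(flags & 0x04),  # bit 2: TypeFlagsAudio
        "data_offset": data_offset,       # 9 for FLV version 1
    }


# A version 1 header with both audio and video flags set (0x05).
header = parse_flv_header(b"FLV\x01\x05\x00\x00\x00\x09")
print(header)
```

The file body starts at `data_offset`, so a reader should seek there rather than assume 9 bytes.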
flv file body
| Field | Type | Comment |
|---|---|---|
| PreviousTagSize0 | UI32 | Always 0 |
| Tag1 | FLVTAG | First tag |
| PreviousTagSize1 | UI32 | Size of previous tag, including its header, in bytes. |
| Tag2 | FLVTAG | Second tag |
| … | … | … |
| PreviousTagSizeN | UI32 | Size of last tag, including its header, in bytes. |
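The flv file body is a sequence of tags interleaved with previous_tag_size fields. Each previous_tag_size is a fixed 4-byte field holding the length of the preceding tag. In FLV version 1 its value should equal the fixed 11-byte tag header length plus the value of that tag's DataSize field. The previous_tag_size that immediately follows the flv header is always 0.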
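The relationship between a tag and the previous_tag_size that follows it can be checked while walking the body. The sketch below assumes the whole file is already in memory; `iter_tags` and `make_tag` are hypothetical helpers, and `make_tag` exists only to build a tiny synthetic sample.

```python
import struct

TAG_HEADER_SIZE = 11  # fixed tag header length in FLV version 1


def iter_tags(data: bytes, data_offset: int = 9):
    """Yield (tag_type, payload) pairs and verify each PreviousTagSize."""
    pos = data_offset
    (prev_size,) = struct.unpack_from(">I", data, pos)
    if prev_size != 0:
        raise ValueError("PreviousTagSize0 should be 0")
    pos += 4
    while pos + TAG_HEADER_SIZE <= len(data):
        tag_type = data[pos] & 0x1F
        data_size = int.from_bytes(data[pos + 1:pos + 4], "big")
        tag_len = TAG_HEADER_SIZE + data_size
        # The 4 bytes after the tag must repeat the tag's total length.
        (prev_size,) = struct.unpack_from(">I", data, pos + tag_len)
        if prev_size != tag_len:
            raise ValueError(f"PreviousTagSize {prev_size} != tag length {tag_len}")
        yield tag_type, data[pos + TAG_HEADER_SIZE:pos + tag_len]
        pos += tag_len + 4


def make_tag(tag_type: int, payload: bytes) -> bytes:
    """Build a tag plus its trailing PreviousTagSize (timestamp and StreamID zeroed)."""
    header = bytes([tag_type]) + len(payload).to_bytes(3, "big") + bytes(7)
    tag = header + payload
    return tag + len(tag).to_bytes(4, "big")


sample = (b"FLV\x01\x05" + (9).to_bytes(4, "big") + bytes(4)
          + make_tag(8, b"\xaf\x01\x21") + make_tag(9, b"\x17\x01"))
tags = list(iter_tags(sample))
print(tags)
```

Because every tag is followed by its own length, the same field also lets a reader step backwards through the file from the end.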
flv tag
universal flv tag
| Field | Type | Comment |
|---|---|---|
| Reserved | UB[2] | Reserved for FMS, should be 0 |
| Filter | UB[1] | Indicates if packets are filtered. Shall be 0 in unencrypted files, and 1 for encrypted tags. |
| TagType | UB[5] | Type of contents in this tag. The following types are defined: 8 = audio 9 = video 18 = script data |
| DataSize | UI24 | Length of the message. Number of bytes after StreamID to end of tag |
| Timestamp | UI24 | Time in milliseconds at which the data in this tag applies. |
| TimestampExtended | UI8 | Extension of the Timestamp field to form an SI32 value. This field holds the upper 8 bits; Timestamp holds the lower 24 bits. |
| StreamID | UI24 | Always 0. |
| Data | Varies by TagType | Data specific for each media type. |
FMS: Adobe Flash Media Server
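This is the general tag definition; TagType divides tags into three kinds (audio, video, and script data). All timing information in an FLV file should come from the Timestamp and TimestampExtended fields of the tag header, and any timing mechanism inside the payload is ignored. FLV normally stores relative timestamps, that is, the time offset from the first tag.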
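The 11-byte tag header, including how Timestamp and TimestampExtended combine into one 32-bit value, can be sketched as follows. `parse_tag_header` and its dictionary keys are hypothetical names used only for this example.

```python
def parse_tag_header(buf: bytes) -> dict:
    """Decode the 11-byte FLV tag header (illustrative sketch)."""
    if len(buf) < 11:
        raise ValueError("a tag header needs 11 bytes")
    first = buf[0]
    # TimestampExtended (byte 7) supplies the upper 8 bits,
    # Timestamp (bytes 4..6) supplies the lower 24 bits.
    timestamp = (buf[7] << 24) | int.from_bytes(buf[4:7], "big")
    if timestamp & 0x80000000:  # interpret the result as a signed SI32
        timestamp -= 1 << 32
    return {
        "filter": (first >> 5) & 0x01,
        "tag_type": first & 0x1F,  # 8 = audio, 9 = video, 18 = script data
        "data_size": int.from_bytes(buf[1:4], "big"),
        "timestamp_ms": timestamp,
        "stream_id": int.from_bytes(buf[8:11], "big"),  # always 0
    }


# Video tag, DataSize 5, Timestamp 0x000001, TimestampExtended 0x01.
tag_header = parse_tag_header(bytes([9, 0, 0, 5, 0, 0, 1, 1, 0, 0, 0]))
print(tag_header["timestamp_ms"])  # 0x01000001
```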
audio data
| Field | Type | Comment |
|---|---|---|
| SoundFormat | UB[4] | Format of SoundData. |
| SoundRate | UB[2] | Sampling rate. |
| SoundSize | UB[1] | Size of each audio sample. |
| SoundType | UB[1] | Mono or stereo sound |
| *AACPacketType | IF SoundFormat == 10 UI8 | 0 = AAC sequence header 1 = AAC raw |
| AudioPayLoad | Varies by format | Audio payload; its layout depends on SoundFormat |
SoundFormat type: 0 = Linear PCM, platform endian
1 = ADPCM
2 = MP3
3 = Linear PCM, little endian
4 = Nellymoser 16 kHz mono
5 = Nellymoser 8 kHz mono
6 = Nellymoser
7 = G.711 A-law logarithmic PCM
8 = G.711 mu-law logarithmic PCM
9 = reserved
10 = AAC
11 = Speex
14 = MP3 8 kHz
15 = Device-specific sound
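SoundFormat is the audio codec. The most common value is 10 = AAC, in which case the second byte of the tag data is AACPacketType, which distinguishes an AAC sequence header from AAC raw data. When AACPacketType == 0, AudioPayLoad is an AudioSpecificConfig; when AACPacketType == 1, AudioPayLoad is raw AAC frame data stored as UI8 bytes, and a single tag may contain more than one AAC frame.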
SoundRate type : 0 = 5.5 kHz
1 = 11 kHz
2 = 22 kHz
3 = 44 kHz
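SoundRate is the audio sampling rate. To preserve quality, the sampling rate should generally be at least twice the highest frequency present in the audio. In real files this field sometimes does not reflect the payload's true sampling rate; for AAC, for instance, the field is set to 3 regardless of the actual rate.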
SoundSize type : 0 = 8-bit samples 1 = 16-bit samples
SoundType type : 0 = Mono sound 1 = Stereo sound
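SoundSize is the size of each audio sample, that is, the quantization resolution of the audio signal. 16-bit samples capture finer detail than 8-bit samples and preserve more of the sound.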
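The four bit fields packed into the first byte of an audio tag, plus the optional AACPacketType byte, can be unpacked as in this sketch. `parse_audio_data` is a hypothetical helper, and the rate table uses the nominal kHz values listed above rather than exact sampling rates.

```python
SOUND_RATES_KHZ = (5.5, 11, 22, 44)  # nominal values from the SoundRate list


def parse_audio_data(data: bytes) -> dict:
    """Split the audio tag data into its header fields and payload."""
    flags = data[0]
    info = {
        "sound_format": flags >> 4,                        # UB[4]
        "sound_rate_khz": SOUND_RATES_KHZ[(flags >> 2) & 0x03],  # UB[2]
        "sample_bits": 16 if flags & 0x02 else 8,          # UB[1]
        "stereo": bool(flags & 0x01),                      # UB[1]
    }
    offset = 1
    if info["sound_format"] == 10:  # AAC carries one extra byte
        info["aac_packet_type"] = data[1]  # 0 = sequence header, 1 = raw
        offset = 2
    info["payload"] = data[offset:]
    return info


# 0xAF = AAC, 44 kHz, 16-bit, stereo; packet type 0 precedes an AudioSpecificConfig.
audio = parse_audio_data(b"\xaf\x00\x12\x10")
print(audio)
```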
video data
| Field | Type | Comment |
|---|---|---|
| Frame Type | UB [4] | Type of video frame. |
| CodecID | UB [4] | Codec Identifier. |
| *AVCPacketType | IF CodecID == 7 UI8 | 0 = AVC sequence header 1 = AVC NALU 2 = AVC end of sequence |
| *CompositionTime | IF CodecID == 7 SI24 | IF AVCPacketType ==1 Composition time offset , else 0 |
| VideoPayLoad | Varies by format | Video payload; its layout depends on CodecID |
Frame Type : 1 = key frame (for AVC, a seekable frame)
2 = inter frame (for AVC, a non-seekable frame)
3 = disposable inter frame (H.263 only)
4 = generated key frame (reserved for server use only)
5 = video info/command frame
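Frame Type identifies the kind of frame: 1 is a key frame, 2 is an inter (non-key) frame, 3 is a disposable inter frame (H.263 only), 4 is a server-generated key frame, and 5 is a video info/command frame.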
CodecID: 1 = JPEG
2 = Sorenson H.263
3 = Screen video
4 = On2 VP6
5 = On2 VP6 with alpha channel
6 = Screen video version 2
7 = AVC
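CodecID is the frame's codec. The most common value is 7 = AVC, in which case the second byte of the tag data is AVCPacketType, which distinguishes an AVC sequence header from AVC NALUs, and the third through fifth bytes hold CompositionTime, the offset of PTS relative to DTS. When B-frames are present, the PTS used for playback is computed from the DTS (the flv video tag timestamp) as PTS = DTS + CompositionTime. When AVCPacketType = 0, VideoPayLoad is an AVCDecoderConfigurationRecord; when AVCPacketType = 1, VideoPayLoad is one or more NALUs (full frames are required).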
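The video header fields and the PTS calculation can be sketched as below. `parse_video_data` is a hypothetical helper; it takes the tag timestamp as `dts_ms` and treats CompositionTime as a signed 24-bit big-endian value.

```python
def parse_video_data(data: bytes, dts_ms: int) -> dict:
    """Split video tag data into header fields and payload, and derive PTS."""
    info = {
        "frame_type": data[0] >> 4,   # 1 = key frame, 2 = inter frame, ...
        "codec_id": data[0] & 0x0F,   # 7 = AVC
        "dts_ms": dts_ms,
        "pts_ms": dts_ms,             # without a composition offset, PTS == DTS
    }
    offset = 1
    if info["codec_id"] == 7:
        info["avc_packet_type"] = data[1]
        # SI24 composition time offset (bytes 2..4 of the tag data)
        cts = int.from_bytes(data[2:5], "big", signed=True)
        info["composition_time_ms"] = cts
        info["pts_ms"] = dts_ms + cts
        offset = 5
    info["payload"] = data[offset:]
    return info


# Key frame, AVC NALU packet, composition time 40 ms, on a tag stamped 1000 ms.
video = parse_video_data(b"\x17\x01\x00\x00\x28" + b"nalu", dts_ms=1000)
print(video["pts_ms"])
```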
script data
Encrypted mode is not discussed here. Script data normally contains two AMF packets: a name and a value.
| Field | Type | Comment |
|---|---|---|
| Name | SCRIPTDATAVALUE | Method or object name. Type = 2 (String) |
| Value | SCRIPTDATAVALUE | AMF arguments or object properties. Type = 8 (ECMA array) |
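A SCRIPTDATAVALUE starts with a one-byte Type, followed by a value whose length depends on that Type.
| Type | Value type | Comment |
|---|---|---|
| 0 = Number | DOUBLE | 8-byte double, stored big-endian. |
| 1 = Boolean | UI8 | 1 byte; 0 means false, 1 means true. |
| 2 = String | SCRIPTDATASTRING | A 2-byte length len, followed by len bytes of string data (ASCII). |
| 3 = Object | SCRIPTDATAOBJECT | Nested structure: a series of {SCRIPTDATASTRING, SCRIPTDATAVALUE} pairs, terminated by the end marker 0x00, 0x00, 0x09. |
| 7 = Reference | UI16 | 2-byte unsigned integer. |
| 8 = ECMA array | SCRIPTDATAECMAARRAY | Same as 3 = Object, but preceded by 4 bytes giving the number of array elements. |
| 10 = Strict array | SCRIPTDATASTRICTARRAY | 4 bytes giving the number of elements, followed by that many SCRIPTDATAVALUEs. |
| 11 = Date | SCRIPTDATADATE | A DOUBLE holding the time, followed by a 2-byte signed integer (SI16) holding the time zone offset. |
| 12 = Long string | SCRIPTDATALONGSTRING | A 4-byte length len, followed by len bytes of string data (ASCII). |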
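The type list above maps naturally onto a small recursive decoder. The following is a sketch under stated assumptions: `read_string`, `read_properties`, and `read_value` are hypothetical helpers, all multi-byte integers and doubles are read big-endian, strings are decoded as UTF-8 (a superset of ASCII), and types not listed above are rejected.

```python
import struct

END_MARKER = b"\x00\x00\x09"


def read_string(buf: bytes, pos: int):
    """Read a SCRIPTDATASTRING: UI16 length followed by the string bytes."""
    length = int.from_bytes(buf[pos:pos + 2], "big")
    pos += 2
    return buf[pos:pos + length].decode("utf-8"), pos + length


def read_properties(buf: bytes, pos: int):
    """Read {name, value} pairs until the 0x00 0x00 0x09 end marker."""
    props = {}
    while buf[pos:pos + 3] != END_MARKER:
        name, pos = read_string(buf, pos)
        props[name], pos = read_value(buf, pos)
    return props, pos + 3


def read_value(buf: bytes, pos: int):
    """Read one SCRIPTDATAVALUE and return (value, next_position)."""
    type_id = buf[pos]
    pos += 1
    if type_id == 0:   # Number
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if type_id == 1:   # Boolean
        return buf[pos] != 0, pos + 1
    if type_id == 2:   # String
        return read_string(buf, pos)
    if type_id == 3:   # Object
        return read_properties(buf, pos)
    if type_id == 7:   # Reference
        return int.from_bytes(buf[pos:pos + 2], "big"), pos + 2
    if type_id == 8:   # ECMA array: skip the 4-byte element count
        return read_properties(buf, pos + 4)
    if type_id == 10:  # Strict array
        count = int.from_bytes(buf[pos:pos + 4], "big")
        pos += 4
        items = []
        for _ in range(count):
            item, pos = read_value(buf, pos)
            items.append(item)
        return items, pos
    if type_id == 11:  # Date: DOUBLE time plus SI16 time zone offset
        when, tz_offset = struct.unpack_from(">dh", buf, pos)
        return (when, tz_offset), pos + 10
    if type_id == 12:  # Long string
        length = int.from_bytes(buf[pos:pos + 4], "big")
        pos += 4
        return buf[pos:pos + length].decode("utf-8"), pos + length
    raise ValueError(f"unsupported SCRIPTDATAVALUE type {type_id}")


# A synthetic script payload: a String name followed by an ECMA array value.
payload = (b"\x02\x00\x0aonMetaData"
           + b"\x08" + (2).to_bytes(4, "big")
           + b"\x00\x08duration\x00" + struct.pack(">d", 12.5)
           + b"\x00\x06stereo\x01\x01"
           + END_MARKER)
name, pos = read_value(payload, 0)
value, pos = read_value(payload, pos)
print(name, value)
```

The decoder ignores the ECMA array count and relies on the end marker instead, which is one possible design choice for files where the count is only approximate.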
The FLV metadata object should be carried in a SCRIPTDATA tag named onMetaData, whose properties are available to ActionScript programs.
onMetaData
| Name | Type | Comment |
|---|---|---|
| duration | Number | Video duration (s) |
| width | Number | Video width in pixels |
| height | Number | Video height in pixels |
| videodatarate | Number | Video bit rate (kbit/s) |
| framerate | Number | Frame rate (frames per second) |
| videocodecid | Number | Video codec ID |
| audiodatarate | Number | Audio bit rate (kbit/s) |
| audiosamplerate | Number | Sampling rate of the audio stream |
| audiosamplesize | Number | Audio sample size (quantization resolution) |
| stereo | Boolean | Indicates stereo audio |
| audiocodecid | Number | Audio codec ID |
| major_brand | String | Major brand |
| minor_version | String | Minor version |
| compatible_brands | String | Compatible brands |
| encoder | String | Encoder |
| filesize | Number | File size (bytes) |
| canSeekToEnd | Boolean | Whether the last video frame is a key frame |
| keyframes | Strict array | Key frame index recorded as {file offset, timestamp} |
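This article walks through the FLV file structure: the flv header (version and audio/video flags) and the composition of tags (audio data formats, video frame types, and script data contents), with particular attention to timestamp handling and the use of key frames.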