MPEG2 Code Analysis, Part 2: Pre-Encoding Initialization and the Sequence Header Format

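This article looks at rate control in video encoding and introduces the roles of the reaction parameter, average activity and global complexity. It then walks through how the sequence header is written, including the sequence start code and the horizontal and vertical sizes, followed by the extension part of the stream, the display extension attributes and the user data. It ends at the point where encoding of the source frames begins.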

Rate Control

rc_init_seq()

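About the reaction parameter:

The reaction parameter helps the encoder adjust the QP (quantiser scale) dynamically during encoding. For pictures of low complexity a larger reaction parameter can be used: the QP then comes out smaller, so the picture is coded in finer detail. If the reaction parameter is small, the QP is larger and the coding is coarser.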
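The relationship between the reaction parameter and the QP can be sketched as follows. This is a minimal illustration that assumes the encoder follows the MPEG-2 Test Model 5 (TM5) scheme, where the default reaction parameter is r = 2 * bit_rate / frame_rate and the reference quantiser is Q = d * 31 / r for a virtual buffer fullness d. The function names and the clipping to 1..31 are choices made for this sketch, not the encoder's own code.

```python
import math

def reaction_parameter(bit_rate, frame_rate):
    """TM5-style default: r = 2 * bit_rate / frame_rate, rounded to nearest."""
    return int(math.floor(2.0 * bit_rate / frame_rate + 0.5))

def reference_qp(fullness, r):
    """Map virtual buffer fullness d to a quantiser scale: Q = d * 31 / r.

    The result is clipped to 1..31 (an assumption of this sketch).
    """
    q = int(math.floor(fullness * 31.0 / r + 0.5))
    return max(1, min(31, q))

# Same buffer fullness, two different reaction parameters.
d = 80_000
r_small = reaction_parameter(2_000_000, 25)   # lower bit rate, smaller r
r_large = reaction_parameter(8_000_000, 25)   # higher bit rate, larger r

print("r_small =", r_small, "QP =", reference_qp(d, r_small))
print("r_large =", r_large, "QP =", reference_qp(d, r_large))
```

For the same buffer fullness, the larger reaction parameter gives the smaller QP, which matches the behaviour described above.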

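About average activity:

For a macroblock, the activity is the minimum of the activity values of its four 8x8 blocks. The average activity, taken over the most recently encoded picture, is used to normalize the activity of each macroblock.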
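A small sketch of that normalization, assuming the TM5 formulas: the macroblock activity is 1 plus the smallest block variance, and the normalized activity is N_act = (2 * act + avg_act) / (act + 2 * avg_act). The starting value avg_act = 400 is the TM5 initial value and is used here as an assumption; the variances are made-up illustration data.

```python
def macroblock_activity(block_variances):
    """TM5-style spatial activity: 1 + the smallest block variance."""
    return 1.0 + min(block_variances)

def normalized_activity(act, avg_act):
    """N_act = (2*act + avg_act) / (act + 2*avg_act); stays between 0.5 and 2."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)

avg_act = 400.0  # assumed initial average activity (TM5 default)

# Hypothetical variances of the four 8x8 luminance blocks.
flat = macroblock_activity([12.0, 30.0, 8.0, 50.0])
busy = macroblock_activity([2500.0, 1800.0, 3100.0, 2200.0])

n_flat = normalized_activity(flat, avg_act)
n_busy = normalized_activity(busy, avg_act)
print("flat macroblock:", flat, "->", round(n_flat, 3))
print("busy macroblock:", busy, "->", round(n_busy, 3))
```

A flat macroblock gets a factor below 1 (finer quantization) and a busy one gets a factor above 1 (coarser quantization), since the factor scales the reference QP.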

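About global complexity:

Global complexity measures overall complexity separately for each frame type (I, P, B) through a set of weights (Xi, Xp, Xb). I frames are given the largest weight and B frames the smallest.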
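As an illustration of the ordering Xi > Xp > Xb, here is a sketch assuming the TM5 initial values (160, 60 and 42 times bit_rate / 115) and the TM5 update rule X = S * Q, where S is the number of bits a picture produced and Q is its average quantiser. The function names are hypothetical.

```python
import math

def initial_complexities(bit_rate):
    """Assumed TM5 initial global complexity measures for I, P and B pictures."""
    xi = int(math.floor(160.0 * bit_rate / 115.0 + 0.5))
    xp = int(math.floor(60.0 * bit_rate / 115.0 + 0.5))
    xb = int(math.floor(42.0 * bit_rate / 115.0 + 0.5))
    return xi, xp, xb

def updated_complexity(bits_generated, avg_quantiser):
    """After a picture is coded, its type's complexity becomes X = S * Q."""
    return int(math.floor(bits_generated * avg_quantiser + 0.5))

xi, xp, xb = initial_complexities(1_150_000)
print("Xi, Xp, Xb =", xi, xp, xb)
print("after an I picture of 200000 bits at Q=8:", updated_complexity(200_000, 8.0))
```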

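After rate control has been initialized, the sequence header is written. Its fields, from top to bottom:

Field                        Length (bits)  Value
Sequence start code          32             0x1B3L
Horizontal size              12
Vertical size                12
Aspect ratio                 4
Frame rate code              4
Bit rate value               18
Marker bit                   1              1
VBV buffer size              10
Constrained parameters flag  1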

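After these fields, if an intra quantization table and a non-intra quantization table are present, the quantization matrices are written into the stream header as well.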
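The field layout above can be sketched with a small bit writer. This is an illustrative sketch, not the encoder's own code: the BitWriter class and sequence_header function are hypothetical names, the two 1-bit "load matrix" flags are written as 0 (no custom matrices, so the defaults apply), and the bit rate is assumed to be stored in units of 400 bit/s as in the MPEG-2 video syntax.

```python
import math

class BitWriter:
    """Accumulates fields MSB-first, like a putbits() routine."""

    def __init__(self):
        self.value = 0
        self.nbits = 0

    def put(self, value, width):
        # Keep only the low `width` bits of the value.
        self.value = (self.value << width) | (value & ((1 << width) - 1))
        self.nbits += width

    def to_bytes(self):
        pad = (-self.nbits) % 8  # zero bits needed to reach a byte boundary
        return (self.value << pad).to_bytes((self.nbits + pad) // 8, "big")

def sequence_header(width, height, aspect, frame_rate_code,
                    bit_rate, vbv_buffer_size, constrained=0):
    bw = BitWriter()
    bw.put(0x1B3, 32)                      # sequence start code
    bw.put(width, 12)                      # horizontal size (low 12 bits)
    bw.put(height, 12)                     # vertical size (low 12 bits)
    bw.put(aspect, 4)                      # aspect ratio
    bw.put(frame_rate_code, 4)             # frame rate code
    bw.put(math.ceil(bit_rate / 400), 18)  # bit rate value, 400 bit/s units
    bw.put(1, 1)                           # marker bit
    bw.put(vbv_buffer_size, 10)            # VBV buffer size (low 10 bits)
    bw.put(constrained, 1)                 # constrained parameters flag
    bw.put(0, 1)                           # no intra matrix follows
    bw.put(0, 1)                           # no non-intra matrix follows
    return bw.to_bytes()

header = sequence_header(720, 576, 2, 3, 5_000_000, 112)
print(header.hex())
```

With both matrix flags at 0 the header is 96 bits, so it ends on a byte boundary without padding.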

If the stream is not an MPEG-1 stream, the extension part of the stream is written:

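Field                        Length (bits)  Value
Extension start code         32             0x1B5L
SEQ_ID                       4              1
Profile and level            8
Progressive sequence flag    1
Chroma format                2
Horizontal size extension    2
Vertical size extension      2
Bit rate extension           12
Marker bit                   1
VBV buffer size extension    8
low_delay                    1
Frame rate extension_n       2
Frame rate extension_d       5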
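The "extension" fields carry the high-order bits of values whose low-order bits already went into the sequence header: 12 + 2 bits for each picture dimension, 18 + 12 bits for the bit rate and 10 + 8 bits for the VBV buffer size. A minimal sketch of that split, with hypothetical helper names:

```python
def split_field(value, low_bits, ext_bits):
    """Split a value into (low part for the header, high part for the extension)."""
    low = value & ((1 << low_bits) - 1)
    ext = (value >> low_bits) & ((1 << ext_bits) - 1)
    return low, ext

def join_field(low, ext, low_bits):
    """What a decoder does: recombine the two parts."""
    return (ext << low_bits) | low

# A 1920-wide picture fits in 12 bits; a 4096-wide one needs the extension.
print(split_field(1920, 12, 2))
print(split_field(4096, 12, 2))

# 120 Mbit/s in units of 400 bit/s is 300000, which exceeds 18 bits.
low, ext = split_field(300_000, 18, 12)
print(low, ext, join_field(low, ext, 18))
```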

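Writing the display extension attributes:

Field                        Length (bits)  Value
Extension start code         32             0x1B5L
DISP_ID                      4              2
Video format                 3
Colour description           1
Colour primaries             8
Transfer characteristics     8
Matrix coefficients          8
Display horizontal size      14
Marker bit                   1
Display vertical size        14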
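Unlike the two headers before it, this layout does not end on a byte boundary, so the stream has to be padded before the next start code. The sketch below simply adds up the widths from the table (with all three colour fields present, as listed) and computes the padding; the list and variable names are made up for the illustration.

```python
# (field name, width in bits), copied from the table above.
DISPLAY_EXTENSION_FIELDS = [
    ("extension_start_code", 32),
    ("DISP_ID", 4),
    ("video_format", 3),
    ("colour_description", 1),
    ("colour_primaries", 8),
    ("transfer_characteristics", 8),
    ("matrix_coefficients", 8),
    ("display_horizontal_size", 14),
    ("marker_bit", 1),
    ("display_vertical_size", 14),
]

total_bits = sum(width for _, width in DISPLAY_EXTENSION_FIELDS)
padding_bits = (-total_bits) % 8          # zero bits up to the next byte boundary
total_bytes = (total_bits + padding_bits) // 8

print("bits:", total_bits, "padding:", padding_bits, "bytes:", total_bytes)
```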

Writing the user data:

Field                        Length (bits)  Value
USER_START_CODE              32             0x1B2L

The user ID string is then written out.

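This completes the sequence header.

Encoding of all frames in the source data stream begins next.

The encoding stage first determines the lowest frame number in the current GOP. The par file specifies the number of frames in a GOP, N, and the distance between I and P frames, M. The lowest frame number in the current GOP is then N*((curr_frame_no+(M-1))/N)-(M-1), where the division is integer division.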
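A quick way to see what this formula does is to evaluate it for a few frame numbers. The sketch below uses Python's integer division for the C-style division; clamping a negative result to 0 for the first GOP is an assumption of this sketch, and N = 12, M = 3 are example values only.

```python
def gop_first_frame(curr_frame_no, n, m):
    """Lowest frame number of the GOP containing curr_frame_no."""
    f0 = n * ((curr_frame_no + (m - 1)) // n) - (m - 1)
    return max(f0, 0)  # assumed clamp: the first GOP starts at frame 0

N, M = 12, 3
starts = [gop_first_frame(i, N, M) for i in range(24)]
print(starts)

# Distinct GOP start frames among the first 24 frames.
print(sorted(set(starts)))
```

With these values the first GOP covers frames 0 to 9, that is N-(M-1) = 10 frames, and every later GOP covers N = 12 frames, starting at frames 10, 22 and so on.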

The rest of the process will be covered next time.
