LIVE555 H.264 Media Stream Transmission (4): SINK_SOURCE

镇住的新山

License: CC 4.0 BY-SA

Category: Streaming Media. Tags: live555, RTP, streaming media, sink, h.264

Article link: https://blog.youkuaiyun.com/liyuanba2dai/article/details/105786247

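This article walks through how an H.264 video stream is transmitted under RTSP, starting from the client's PLAY request. It introduces the key classes involved, namely MediaSink, FramedSource, StreamParser and FramedFilter, and follows the data from the sink's request to the point where it is packetized and sent.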

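The client sends a SETUP request over RTSP, which establishes the corresponding client session (ClientSession); see the previous article if you are not familiar with that process. When the client then sends a PLAY request, the server parses the command and starts assembling and pushing the media data. Picking up where the previous article left off, this one takes RTSPClientSession::handleCmd_PLAY as the entry point and continues with how the media stream is transmitted.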

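1. Basic concepts

To make the rest easier to follow, here is a quick look at the classes that come up later and what each one does.

MediaSink: here, the RTP packetizer. It adds the RTP header and other RTP information to the media data it receives, and fragments data that is too large for one packet. Note that it has a member variable fSource (a FramedSource*), which points to the source object that produces the video stream data.

FramedSource: as the name suggests, the data source class, used to read or otherwise obtain media stream data.

StreamParser: the data source parser. It also has a member variable that points to its input source, and it parses the raw data that source delivers.

FramedFilter: a data filter. It also has a member variable that points to its input source, and it post-processes that source's data.

In live555, data is requested layer by layer; once a request completes, the data is wrapped layer by layer on the way back and finally sent out. The original post illustrated this flow with a figure: it starts with a request from the sink and ends with the sink packetizing and sending the data.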
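
The request-then-wrap flow can be sketched with a toy chain. This is only an illustration under assumptions: ToyFileSource, ToyFilter and ToySink are hypothetical names rather than live555 classes, and everything runs synchronously, whereas live555 routes the callbacks through its task scheduler.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical toy types for illustration only; these are not live555 classes.
using AfterGetting = std::function<void(const std::string&)>;

struct ToyFramedSource {
  virtual ~ToyFramedSource() = default;
  // "Get me the next frame, then call me back with it."
  virtual void getNextFrame(AfterGetting cb) = 0;
};

// Innermost source: produces raw data.
struct ToyFileSource : ToyFramedSource {
  void getNextFrame(AfterGetting cb) override { cb("nal"); }
};

// Filter: forwards the request upstream, then wraps what comes back.
struct ToyFilter : ToyFramedSource {
  explicit ToyFilter(ToyFramedSource& in) : input(in) {}
  void getNextFrame(AfterGetting cb) override {
    input.getNextFrame([cb](const std::string& data) { cb("frag(" + data + ")"); });
  }
  ToyFramedSource& input;
};

// Sink: starts the request and does the final packaging.
struct ToySink {
  explicit ToySink(ToyFramedSource& src) : source(src) {}
  std::string buildPacket() {
    std::string packet;
    source.getNextFrame([&packet](const std::string& data) { packet = "rtp[" + data + "]"; });
    return packet;
  }
  ToyFramedSource& source;
};

// Wire up file source -> filter -> sink and build one "packet".
inline std::string runToyChain() {
  ToyFileSource file;
  ToyFilter filter(file);
  ToySink sink(filter);
  return sink.buildPacket();
}
```

Calling runToyChain() yields the string rtp[frag(nal)]: the request travels from the sink inward, and each layer adds its wrapping on the way back out.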

OK, with these concepts in place, let's walk through how H.264 data is transmitted.

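2. The sink starts playing

This walkthrough focuses on playing an H.264 file, so start with the inheritance chain of H264VideoRTPSink (shown as a figure in the original post): H264VideoRTPSink → H264or5VideoRTPSink → VideoRTPSink → MultiFramedRTPSink → RTPSink → MediaSink.

H264VideoRTPSink itself does not define startPlaying(); the call goes to the implementation in its base class MediaSink, shown below: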

Boolean MediaSink::startPlaying(MediaSource& source,
                afterPlayingFunc* afterFunc,
                void* afterClientData) 
{
  // Make sure we're not already being played:
  if (fSource != NULL) {
    envir().setResultMsg("This sink is already being played");
    return False;
  }

  // Make sure our source is compatible:
  if (!sourceIsCompatibleWithUs(source)) {
    envir().setResultMsg("MediaSink::startPlaying(): source is not compatible!");
    return False;
  }
  // Bind the source:
  fSource = (FramedSource*)&source;
  // Register the callback (and its client data) to invoke when playing finishes:
  fAfterFunc = afterFunc;
  fAfterClientData = afterClientData;
  // Carry on with playback:
  return continuePlaying();
}

live555 has many functions of this kind. Read them as: when playing is finished, call the callback I specified.
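
The shape of this callback pattern, a function pointer paired with an opaque client data pointer, can be sketched as follows. All names here (ToyAfterFunc, ToyPlayer, ToySession) are hypothetical and only mirror the pattern; they are not live555 types.

```cpp
#include <cassert>

// C-style callback signature: a function plus an opaque "client data" pointer.
// All names here are hypothetical and only mirror the shape of the pattern.
typedef void ToyAfterFunc(void* clientData);

class ToyPlayer {
 public:
  // Remember who to notify; the work itself would happen later.
  void start(ToyAfterFunc* afterFunc, void* afterClientData) {
    afterFunc_ = afterFunc;
    afterClientData_ = afterClientData;
  }

  // Called when the work is finished: hand control back to the registered callback.
  void finish() {
    if (afterFunc_ != nullptr) (*afterFunc_)(afterClientData_);
  }

 private:
  ToyAfterFunc* afterFunc_ = nullptr;
  void* afterClientData_ = nullptr;
};

class ToySession {
 public:
  // Static trampoline: recover the object from clientData and call the member function.
  static void afterPlaying(void* clientData) {
    static_cast<ToySession*>(clientData)->afterPlaying1();
  }
  int timesNotified = 0;

 private:
  void afterPlaying1() { ++timesNotified; }
};
```

A caller would register with player.start(ToySession::afterPlaying, &session); nothing happens until player.finish() runs, at which point the session's counter goes from 0 to 1. The static function plus a "1"-suffixed member function is the same split that appears later in this article (for example afterGettingBytes and afterGettingBytes1).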

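startPlaying() first binds the source passed in by the caller and records the callbacks, then calls continuePlaying(). At this point the fSource of H264VideoRTPSink points to the H264VideoStreamFramer object.

continuePlaying() is a pure virtual function in MediaSink, so the call lands in the base class implementation H264or5VideoRTPSink::continuePlaying(), shown below: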

Boolean H264or5VideoRTPSink::continuePlaying() {
  // First, check whether we have a 'fragmenter' class set up yet.
  // If not, create it now. Its arguments include the maximum input buffer size
  // and the maximum output packet size (excluding the RTP header):
  if (fOurFragmenter == NULL) {
    fOurFragmenter = new H264or5Fragmenter(fHNumber, envir(), fSource, OutPacketBuffer::maxSize,
					   ourMaxPacketSize() - 12/*RTP hdr size*/);
  } else {
    fOurFragmenter->reassignInputSource(fSource);
  }
  // Make the fragmenter our input source:
  fSource = fOurFragmenter;

  // Then call the parent class's implementation:
  return MultiFramedRTPSink::continuePlaying();
}

This function creates the H264or5Fragmenter object and installs it as the sink's source, so after it runs the sink's fSource is the H264or5Fragmenter. Next comes MultiFramedRTPSink::continuePlaying(), which calls buildAndSendPacket() to build the RTP packet:


void MultiFramedRTPSink::buildAndSendPacket(Boolean isFirstPacket) {
  nextTask() = NULL;
  fIsFirstPacket = isFirstPacket;

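  /*
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |V=2|P|X|  CC   |M|     PT      |       sequence number         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                           timestamp                           |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |           synchronization source (SSRC) identifier            |
  +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  |            contributing source (CSRC) identifiers             |
  |                             ....                              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  Version (V): 2 bits. Identifies the RTP version; the version defined by this specification is 2.
  Padding (P): 1 bit. If set, the packet contains additional padding bytes at the end.
  Extension (X): 1 bit. If set, the fixed header is followed by one header extension.
  CSRC count (CC): 4 bits. The number of CSRC identifiers that follow the fixed header.
  Marker (M): 1 bit. Its interpretation is defined by the profile.
  Payload type (PT): 7 bits. Identifies the format of the RTP payload.
  Sequence number: 16 bits. Incremented by one for each RTP packet sent.
  Timestamp: 32 bits. The sampling instant of the first byte of data in the packet. It starts from an
             initial value at the beginning of a session and keeps increasing over time; it is
             essential for removing jitter and for synchronization.
  SSRC: 32 bits. Identifies the synchronization source, i.e. the origin of the RTP packet stream.
        No two synchronization sources in the same RTP session may share an SSRC value. The
        identifier is chosen randomly (RFC 1889 suggests an MD5-based random routine).
  CSRC list: 0 to 15 items, 32 bits each. Identifies the sources that contributed to a new packet
             produced by an RTP mixer. The mixer inserts the SSRC identifiers of the contributing
             sources, so that the receiver can correctly identify the participants.
  */
  // Set up the RTP header:
  unsigned rtpHdr = 0x80000000; // RTP version 2; marker ('M') bit not set (by default; it can be set later)
  // Payload type:
  rtpHdr |= (fRTPPayloadType<<16);
  rtpHdr |= fSeqNo; // sequence number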
  fOutBuf->enqueueWord(rtpHdr);

  // Note where the RTP timestamp will go.
  // (We can't fill this in until we start packing payload frames.)
  fTimestampPosition = fOutBuf->curPacketSize();
  // Reserve room for the timestamp:
  fOutBuf->skipBytes(4); // leave a hole for the timestamp
  // Add the synchronization source identifier:
  fOutBuf->enqueueWord(SSRC());

  // Allow for a special, payload-format-specific header following the
  // RTP header:
  fSpecialHeaderPosition = fOutBuf->curPacketSize();
  fSpecialHeaderSize = specialHeaderSize();
  fOutBuf->skipBytes(fSpecialHeaderSize);

  // Begin packing as many (complete) frames into the packet as we can:
  fTotalFrameSpecificHeaderSizes = 0;
  fNoFramesLeft = False;
  fNumFramesUsedSoFar = 0;
  packFrame();
}

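The code above assembles the RTP header. Once the header is in place, it calls packFrame() to start packing payload data.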
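
To make the bit layout concrete, here is a minimal standalone sketch that builds the 12-byte fixed header the same way the first word is built above. buildRtpHeader is a hypothetical helper written for this illustration, not a live555 function, and it assumes no padding, no extension and no CSRC entries.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 12-byte fixed RTP header (version 2, no padding, no extension, CC = 0).
// buildRtpHeader is a hypothetical helper for illustration, not a live555 function.
inline std::array<std::uint8_t, 12> buildRtpHeader(std::uint8_t payloadType, bool marker,
                                                   std::uint16_t seqNo,
                                                   std::uint32_t timestamp,
                                                   std::uint32_t ssrc) {
  assert(payloadType < 128);  // PT is a 7-bit field

  // First 32-bit word, same layout as rtpHdr in buildAndSendPacket().
  std::uint32_t word0 = 0x80000000u;                       // V=2
  if (marker) word0 |= 0x00800000u;                        // M bit
  word0 |= static_cast<std::uint32_t>(payloadType) << 16;  // PT
  word0 |= seqNo;                                          // sequence number

  std::array<std::uint8_t, 12> hdr{};
  const std::uint32_t words[3] = {word0, timestamp, ssrc};
  for (std::size_t w = 0; w < 3; ++w) {
    // Network byte order: most significant byte first.
    hdr[4 * w + 0] = static_cast<std::uint8_t>(words[w] >> 24);
    hdr[4 * w + 1] = static_cast<std::uint8_t>(words[w] >> 16);
    hdr[4 * w + 2] = static_cast<std::uint8_t>(words[w] >> 8);
    hdr[4 * w + 3] = static_cast<std::uint8_t>(words[w]);
  }
  return hdr;
}
```

For payload type 96 and sequence number 1 the first word is 0x80600001, so the header starts with the bytes 80 60 00 01; with the marker bit set the second byte becomes E0.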

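A note on the fOutBuf memory: since buffer operations appear here, it helps to describe how this memory is set up before following the copies. The sink's member fOutBuf is initialized in the constructor of its base class MultiFramedRTPSink, which calls setPacketSizes().

That function in turn executes fOutBuf = new OutPacketBuffer(preferredPacketSize, maxPacketSize);. The OutPacketBuffer constructor allocates the memory, sized from the maximum packet size and the maximum buffer size (maxSize). All later buffer operations by the sink work on this one block.

With the memory described, back to the main thread: packing media frame data continues in packFrame(), shown below: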


void MultiFramedRTPSink::packFrame() {
  // Get the next frame.

  // First, see if we have an overflow frame that was too big for the last pkt
  // (i.e. data left over from the previous packet):
  if (fOutBuf->haveOverflowData()) {
    // Use this frame before reading a new one from the source
    unsigned frameSize = fOutBuf->overflowDataSize();
    struct timeval presentationTime = fOutBuf->overflowPresentationTime();
    unsigned durationInMicroseconds = fOutBuf->overflowDurationInMicroseconds();
    fOutBuf->useOverflowData();

    // Process the overflow data as if it had just been read:
    afterGettingFrame1(frameSize, 0, presentationTime, durationInMicroseconds);
  } else {
    // Normal case: we need to read a new frame from the source
    if (fSource == NULL) return;
    fSource->getNextFrame(fOutBuf->curPtr(), fOutBuf->totalBytesAvailable(),
			  afterGettingFrame, this, ourHandleClosure, this);
  }
}

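packFrame() checks whether there is overflow data. If there is, it handles that first; otherwise it fetches a new frame. Since this walkthrough follows the flow from a cold start, we continue with fetching a frame.

3. The source fetches data

As noted in section 2, fSource now points to the H264or5Fragmenter object. The call goes to its base class method FramedSource::getNextFrame(), simplified below: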

void FramedSource::getNextFrame(unsigned char* to, unsigned maxSize,
				afterGettingFunc* afterGettingFunc,
				void* afterGettingClientData,
				onCloseFunc* onCloseFunc,
				void* onCloseClientData) 
{
  fTo = to;
  fMaxSize = maxSize;
  fNumTruncatedBytes = 0; // by default; could be changed by doGetNextFrame()
  fDurationInMicroseconds = 0; // by default; could be changed by doGetNextFrame()
  fAfterGettingFunc = afterGettingFunc;
  fAfterGettingClientData = afterGettingClientData;
  fOnCloseFunc = onCloseFunc;
  fOnCloseClientData = onCloseClientData;
  fIsCurrentlyAwaitingData = True;

  doGetNextFrame();
}

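live555 has many functions of this kind. Read them as: tell my source to fetch a frame, and when it is done, call the callback I specified.

A note on the fTo memory: fTo is set to the address inside the sink's fOutBuf just past the RTP header assembled so far (the start of the buffer array plus the current offset), and fMaxSize is the number of bytes still free in fOutBuf.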
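
The idea that the source writes straight into the sink's packet buffer, right after the header, can be sketched like this. ToyPacketBuffer and toyDeliverFrame are hypothetical stand-ins for illustration; the real OutPacketBuffer has more state (overflow data, preferred and maximum sizes) than is modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the idea behind OutPacketBuffer; not the live555 class.
class ToyPacketBuffer {
 public:
  explicit ToyPacketBuffer(std::size_t capacity) : buf_(capacity), curOffset_(0) {}

  unsigned char* curPtr() { return buf_.data() + curOffset_; }  // where the next bytes go
  std::size_t totalBytesAvailable() const { return buf_.size() - curOffset_; }
  std::size_t curPacketSize() const { return curOffset_; }

  // Advance past bytes that were reserved or written in place.
  void increment(std::size_t numBytes) {
    assert(numBytes <= totalBytesAvailable());
    curOffset_ += numBytes;
  }

 private:
  std::vector<unsigned char> buf_;
  std::size_t curOffset_;
};

// A "source" that copies a frame straight to `to`, truncating to maxSize
// (playing the role of fTo / fMaxSize). Returns the number of bytes written.
inline std::size_t toyDeliverFrame(unsigned char* to, std::size_t maxSize,
                                   const unsigned char* frame, std::size_t frameSize) {
  const std::size_t n = frameSize < maxSize ? frameSize : maxSize;
  std::memcpy(to, frame, n);
  return n;
}
```

With a 20-byte buffer and 12 bytes already taken by the header, a 10-byte frame passed to toyDeliverFrame(out.curPtr(), out.totalBytesAvailable(), ...) is cut to 8 bytes. No intermediate copy is involved, which is the point of handing fTo and fMaxSize down the chain.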

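This function mainly records the callbacks and related data and then hands over to the object's own processing. The original post showed the resulting function pointer assignments in a figure here.

Note: in that figure, blue lines are the callbacks invoked after data has been obtained, and orange lines are the function calls made to request data.

The function ends by calling doGetNextFrame(), a virtual function that subclasses override (not an overload). Here the call lands in H264or5Fragmenter, which processes the data as follows: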

void H264or5Fragmenter::doGetNextFrame() {
  // No data in the buffer:
  if (fNumValidDataBytes == 1) {
    // We have no NAL unit data currently in the buffer.  Read a new one:
    fInputSource->getNextFrame(&fInputBuffer[1], fInputBufferSize - 1,
			       afterGettingFrame, this,
			       FramedSource::handleClosure, this);
  } else {
    ....

  }
}

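When the buffer holds no data, the fragmenter asks its source for a new frame; otherwise it fragments and packs the data it has (the else branch is covered later).

A note on the fInputBuffer memory: it is allocated in the H264or5Fragmenter constructor with fInputBuffer = new unsigned char[fInputBufferSize];, and its size is derived from the OutPacketBuffer::maxSize value passed in at construction.

Note that the fInputSource of the H264or5Fragmenter points to the H264VideoStreamFramer object. Calling getNextFrame() on the H264VideoStreamFramer again goes to the base class FramedSource::getNextFrame() (see the code above), which records the callback information and then calls the object's own doGetNextFrame(). The original post showed the function pointer assignments after this call in a figure here.

With the function pointers clear, we can move on. The doGetNextFrame() that runs is the one in the base class of H264VideoStreamFramer: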


void H264or5VideoStreamFramer::doGetNextFrame() {
  if (fInsertAccessUnitDelimiters && pictureEndMarker()) {
  // (skipped here)
  } else {
    // Do the normal delivery of a NAL unit from the parser:
    MPEGVideoStreamFramer::doGetNextFrame();
  }
}

Stepping further in (trace the code yourself to confirm), execution reaches MPEGVideoStreamFramer::continueReadProcessing():

void MPEGVideoStreamFramer::continueReadProcessing() {
  unsigned acquiredFrameSize = fParser->parse();
  if (acquiredFrameSize > 0) {

    fFrameSize = acquiredFrameSize;
    fNumTruncatedBytes = fParser->numTruncatedBytes();
    fDurationInMicroseconds
      = (fFrameRate == 0.0 || ((int)fPictureCount) < 0) ? 0
      : (unsigned)((fPictureCount*1000000)/fFrameRate);

    fPictureCount = 0;
    afterGetting(this);
  } else {
    // We were unable to parse a complete frame from the input, because:
    // - we had to read more data from the source stream, or
    // - the source stream has ended.
  }
}

4. The parser parses the data

An fParser object appears here. It is created in the constructor of H264or5VideoStreamFramer. Next comes its parse() function, simplified below:

unsigned H264or5VideoStreamParser::parse()
{
    // 1. Read media data, skipping ahead to the first start code
    if (!fHaveSeenFirstStartCode) 
    {
      while ((first4Bytes = test4Bytes()) != 0x00000001) {
        get1Byte(); 
      }
    }
  
    // 2. Copy one NAL unit from the data read so far into fTo,
    //    stopping at the next start code
    while (next4Bytes != 0x00000001 && (next4Bytes&0xFFFFFF00) != 0x00000100)
    {
      ......
      save4Bytes(next4Bytes);
      skipBytes(4);
    } 

    nal_unit_type = fFirstByteOfNALUnit&0x1F;
    // 3. A complete NAL unit is now stored in memory.
    //    Depending on nal_unit_type, parse VPS, SPS, PPS, SEI, etc.
    if (isVPS(nal_unit_type)) { /* Video parameter set */ }
    else if (isSPS(nal_unit_type)) { /* Sequence parameter set */ }
    else if (isPPS(nal_unit_type)) { /* Picture parameter set */ }
    else if (isSEI(nal_unit_type)) { /* Supplemental enhancement information (SEI) */ }

    // 4. Decide whether this NAL unit ends the access unit (used to set the RTP M bit);
    //    the other cases are omitted in this simplified listing
    Boolean thisNALUnitEndsAccessUnit = False;
    if (haveSeenEOF() || isEOF(nal_unit_type)) {
      thisNALUnitEndsAccessUnit = True;
    } 

    // 5. If this NAL unit ends the access unit, bump the picture count and set the end marker
    if (thisNALUnitEndsAccessUnit)
    {
      usingSource()->fPictureEndMarker = True;
      ++usingSource()->fPictureCount;
    }

    // 6. Return the size of the data read
    return curFrameSize();
  
}
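
The start-code scan in steps 1 and 2 can be tried out on a plain byte vector. The sketch below is a standalone illustration, not the live555 parser: splitAnnexB and h264NalType are hypothetical helpers, and the input is assumed to be a complete in-memory Annex B stream rather than data arriving in pieces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split an Annex B byte stream into NAL units by scanning for 00 00 01 start codes
// (a 4-byte 00 00 00 01 start code ends in the same three bytes).
// splitAnnexB is a hypothetical helper, not part of live555.
inline std::vector<std::vector<std::uint8_t>> splitAnnexB(const std::vector<std::uint8_t>& in) {
  std::vector<std::vector<std::uint8_t>> nalus;
  const std::size_t n = in.size();
  std::size_t i = 0;
  bool haveStart = false;
  std::size_t start = 0;  // index of the first byte of the current NAL unit

  while (i + 3 <= n) {
    if (in[i] == 0 && in[i + 1] == 0 && in[i + 2] == 1) {
      if (haveStart) {
        std::size_t end = i;
        // Drop trailing zero bytes, which belong to the next (4-byte) start code.
        while (end > start && in[end - 1] == 0) --end;
        nalus.emplace_back(in.begin() + start, in.begin() + end);
      }
      start = i + 3;
      haveStart = true;
      i += 3;
    } else {
      ++i;
    }
  }
  // Whatever follows the last start code is the final NAL unit.
  if (haveStart && start < n) nalus.emplace_back(in.begin() + start, in.end());
  return nalus;
}

// The H.264 NAL unit type is the low 5 bits of the first NAL byte.
inline unsigned h264NalType(const std::vector<std::uint8_t>& nalu) {
  assert(!nalu.empty());
  return nalu[0] & 0x1Fu;
}
```

Fed the bytes 00 00 00 01 67 42, 00 00 01 68 CE, 00 00 00 01 65 88, the function returns three NAL units whose types are 7 (SPS), 8 (PPS) and 5 (IDR slice), matching the nal_unit_type = fFirstByteOfNALUnit&0x1F line above.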

Note the test4Bytes() call in step 1. After several hops it ends up in StreamParser::ensureValidBytes1(), shown below:


void StreamParser::ensureValidBytes1(unsigned numBytesNeeded) {
  // 1. If the current bank does not have enough room left, switch banks
  if (fCurParserIndex + numBytesNeeded > BANK_SIZE) {
    // (switch banks)
  }

  // Read as much data as possible from fInputSource
  unsigned maxNumBytesToRead = BANK_SIZE - fTotNumValidBytes;
  fInputSource->getNextFrame(&curBank()[fTotNumValidBytes],
			     maxNumBytesToRead,
			     afterGettingBytes, this,
			     onInputClosure, this);
}

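A note on the curBank memory: StreamParser allocates two memory banks and uses them alternately; when one is used up it switches to the other. The size of each bank is set by the macro BANK_SIZE (150000).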
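
The alternating-bank idea can be sketched with a deliberately tiny buffer. ToyTwoBankBuffer is a hypothetical class that is only loosely modelled on the description above; it assumes that unparsed bytes are carried over to the start of the other bank on a switch, and the real StreamParser bookkeeping differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical two-bank buffer, loosely modelled on the idea described above.
// The bank size is tiny here so the switch is easy to observe.
class ToyTwoBankBuffer {
 public:
  static constexpr std::size_t kBankSize = 8;

  // Make room for numBytesNeeded more bytes. If the current bank cannot hold them,
  // copy the unconsumed bytes to the start of the other bank and switch to it.
  void ensureRoom(std::size_t numBytesNeeded) {
    assert(numBytesNeeded <= kBankSize);
    if (valid_ + numBytesNeeded <= kBankSize) return;
    const std::size_t unconsumed = valid_ - consumed_;
    assert(unconsumed + numBytesNeeded <= kBankSize);
    const unsigned other = 1u - cur_;
    std::memcpy(banks_[other], banks_[cur_] + consumed_, unconsumed);
    cur_ = other;
    consumed_ = 0;
    valid_ = unconsumed;
  }

  void append(const unsigned char* data, std::size_t n) {
    ensureRoom(n);
    std::memcpy(banks_[cur_] + valid_, data, n);
    valid_ += n;
  }

  void consume(std::size_t n) {
    assert(consumed_ + n <= valid_);
    consumed_ += n;
  }

  unsigned currentBank() const { return cur_; }
  std::size_t unconsumedBytes() const { return valid_ - consumed_; }
  unsigned char firstUnconsumed() const {
    assert(consumed_ < valid_);
    return banks_[cur_][consumed_];
  }

 private:
  unsigned char banks_[2][kBankSize] = {};
  unsigned cur_ = 0;
  std::size_t consumed_ = 0;  // bytes already parsed in the current bank
  std::size_t valid_ = 0;     // bytes written into the current bank
};
```

Appending 6 bytes, consuming 4, then appending 4 more no longer fits in bank 0, so the 2 unconsumed bytes move to bank 1 and the new data follows them. Switching banks this way avoids shifting data within a single buffer on every read.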

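The code above runs inside H264or5VideoStreamParser. Its fInputSource is the input source of the H264or5VideoStreamFramer, assigned when the object is constructed. The original post showed, in a figure here, the function pointers held by H264or5VideoStreamParser after the getNextFrame() call.

A second figure showed the function pointers held by ByteStreamFileSource.

With the function pointers clear, on to the next step: the doGetNextFrame() of the ByteStreamFileSource object is called, and it in turn calls doReadFromFile():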

void ByteStreamFileSource::doReadFromFile() {
  // 1. Read data from the file
 ......

  // Once the read is done, hand the follow-up work to the scheduler with a delay of 0
  nextTask() = envir().taskScheduler().scheduleDelayedTask(0,
				(TaskFunc*)FramedSource::afterGetting, this);
}

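This object is the source that actually reads the file. It first reads file content into fTo, which here is the parser's curBank. After a successful read, to avoid unbounded recursion, it does not call the callback directly but schedules it, passing FramedSource::afterGetting as the task function and this as its argument, with a delay of 0.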
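
Why deferring the callback helps can be seen in a miniature event loop. This is a sketch under assumptions: ToyScheduler and ToyReader are hypothetical, real delays are ignored, and the only thing modelled is "run this later, from the loop" instead of calling the callback from inside the read.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical miniature event loop; it only models "run this later, from the loop".
// It is not live555's TaskScheduler and ignores real delays.
class ToyScheduler {
 public:
  void scheduleTask(std::function<void()> task) { tasks_.push(std::move(task)); }

  // Run queued tasks until none are left (tasks may queue further tasks).
  void runUntilIdle() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

// A source that "reads" a number of frames. After each read it defers the follow-up
// to the scheduler instead of calling it directly, so the next read starts from the
// event loop rather than from inside the previous one.
struct ToyReader {
  ToyReader(ToyScheduler& s, int frames) : scheduler(s), framesLeft(frames) {}

  void readFrame() {
    ++depth;
    if (depth > maxDepth) maxDepth = depth;
    ++framesRead;
    --framesLeft;
    if (framesLeft > 0) scheduler.scheduleTask([this] { readFrame(); });
    --depth;
  }

  ToyScheduler& scheduler;
  int framesLeft;
  int depth = 0;     // how many readFrame() calls are currently on the stack
  int maxDepth = 0;  // deepest nesting observed
  int framesRead = 0;
};
```

Reading 1000 frames this way never nests readFrame() more than one level deep. Had each read invoked the next one directly, the call stack would have grown with every frame.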

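At this point the startPlaying() call on streamState has finished; the functions return one by one, the stack unwinds, and the server waits for the scheduler to fire.

5. Passing the data back

Once the steps above are done, the scheduler invokes the callback right away. What follows is essentially the request chain traced in reverse.

1) First comes FramedSource::afterGetting, called on the ByteStreamFileSource object: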

void FramedSource::afterGetting(FramedSource* source) {
  source->nextTask() = NULL;
  source->fIsCurrentlyAwaitingData = False;

  if (source->fAfterGettingFunc != NULL) {
    (*(source->fAfterGettingFunc))(source->fAfterGettingClientData,
				   source->fFrameSize, source->fNumTruncatedBytes,
				   source->fPresentationTime,
				   source->fDurationInMicroseconds);
  }
}

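This function is just a dispatcher: using the function pointers recorded earlier, you can work out which callback it invokes.

2) The next function called is StreamParser::afterGettingBytes, on the H264or5VideoStreamParser object. It calls StreamParser::afterGettingBytes1, shown below: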

// numBytesRead: number of bytes just read
void StreamParser::afterGettingBytes1(unsigned numBytesRead, struct timeval presentationTime) {
  fLastSeenPresentationTime = presentationTime;
  // Account for the newly read data:
  unsigned char* ptr = &curBank()[fTotNumValidBytes];
  fTotNumValidBytes += numBytesRead;

  // Continue our original calling source where it left off:
  restoreSavedParserState();
  // Invoke the callback:
  fClientContinueFunc(fClientContinueClientData, ptr, numBytesRead, presentationTime);
}

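The function pointer figure from section 4 shows which function and object are called at this point. Note that the pointer passed along refers into the parser's curBank memory.

3) Next is MPEGVideoStreamFramer::continueReadProcessing(), on the H264or5VideoStreamFramer object. This static callback forwards to the member function continueReadProcessing(), shown below: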

void MPEGVideoStreamFramer::continueReadProcessing() {
  // Size of the frame the parser extracted:
  unsigned acquiredFrameSize = fParser->parse();
  if (acquiredFrameSize > 0) {
    fFrameSize = acquiredFrameSize;
    fNumTruncatedBytes = fParser->numTruncatedBytes();

    // Hand the frame on to the next level up:
    afterGetting(this);
  } else {}
}

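This calls the parse() function of H264or5VideoStreamParser, covered above. It parses the H.264 format and copies the data read into curBank over to fTo, which here is the fInputBuffer of the H264or5Fragmenter. Finally afterGetting() is called; its code is listed above and it simply dispatches through the recorded pointers, so the callback of H264or5VideoStreamFramer follows from the function pointer figure.

4) Next is H264or5Fragmenter::afterGettingFrame(), on the H264or5Fragmenter object. It goes on to call doGetNextFrame(), shown below: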

void H264or5Fragmenter::doGetNextFrame() {
  // No frame data in the buffer
  if (fNumValidDataBytes == 1) {
    //...
  } 
  // Frame data is present
  else
  {
    // Three cases to handle when assembling the output:
    // We have NAL unit data in the buffer.  There are three cases to consider:
    // 1. There is a new NAL unit in the buffer, and it's small enough to deliver
    //    to the RTP sink (as is).
    // 2. There is a new NAL unit in the buffer, but it's too large to deliver to
    //    the RTP sink in its entirety.  Deliver the first fragment of this data,
    //    as a FU packet, with one extra preceding header byte (for the "FU header").
    // 3. There is a NAL unit in the buffer, and we've already delivered some
    //    fragment(s) of this.  Deliver the next fragment of this data,
    //    as a FU packet, with two (H.264) or three (H.265) extra preceding header bytes
    //    (for the "NAL header" and the "FU header").

    memmove(fTo, &fInputBuffer[fCurDataOffset-numExtraHeaderBytes], numBytesToSend);
    // Assembly done; pass control back up the chain
    FramedSource::afterGetting(this);
  }
}

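Section 3 covered the case where the buffer is empty. When data is present, it is fragmented and packed: the data is copied from fInputBuffer to fTo, which is the fOutBuf of H264VideoRTPSink. afterGetting() is then called to continue back up the chain; the recorded function pointers tell us where it goes next.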
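
The three cases in the comment above correspond to single NAL unit packets and FU-A fragments as defined in RFC 6184. The sketch below produces all payloads for one NAL unit in a single call; fragmentNalUnit is a hypothetical helper, not the H264or5Fragmenter, which delivers one fragment per doGetNextFrame() call and also handles H.265.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Turn one H.264 NAL unit into RTP payloads (RFC 6184): a single NAL unit packet
// if it fits in maxPayloadSize, otherwise a series of FU-A fragments.
// fragmentNalUnit is a hypothetical helper, not the live555 H264or5Fragmenter.
inline std::vector<std::vector<std::uint8_t>> fragmentNalUnit(
    const std::vector<std::uint8_t>& nalu, std::size_t maxPayloadSize) {
  assert(!nalu.empty());
  assert(maxPayloadSize > 2);  // room for the 2 FU-A header bytes plus data

  std::vector<std::vector<std::uint8_t>> payloads;
  if (nalu.size() <= maxPayloadSize) {  // small enough: deliver as is
    payloads.push_back(nalu);
    return payloads;
  }

  const std::uint8_t nalHeader = nalu[0];
  // FU indicator: F and NRI bits from the NAL header, type 28 (FU-A).
  const std::uint8_t fuIndicator = static_cast<std::uint8_t>((nalHeader & 0xE0) | 28);
  const std::size_t chunk = maxPayloadSize - 2;

  // The original NAL header byte is not sent; its fields live in the two FU bytes.
  for (std::size_t pos = 1; pos < nalu.size(); pos += chunk) {
    const std::size_t n = std::min(chunk, nalu.size() - pos);
    std::uint8_t fuHeader = nalHeader & 0x1F;      // original NAL unit type
    if (pos == 1) fuHeader |= 0x80;                // S bit: first fragment
    if (pos + n == nalu.size()) fuHeader |= 0x40;  // E bit: last fragment

    std::vector<std::uint8_t> p;
    p.reserve(n + 2);
    p.push_back(fuIndicator);
    p.push_back(fuHeader);
    p.insert(p.end(), nalu.begin() + pos, nalu.begin() + pos + n);
    payloads.push_back(p);
  }
  return payloads;
}
```

An 8-byte NAL unit starting with 0x65 and a 5-byte payload limit gives three fragments with FU indicator 0x7C and FU headers 0x85 (start), 0x05 (middle) and 0x45 (end). The two extra bytes per fragment are the reason the fragmenter's output limit above is computed with header room set aside.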

5) Next is MultiFramedRTPSink::afterGettingFrame, on the MultiFramedRTPSink object. It goes on to call afterGettingFrame1, shown below:

void MultiFramedRTPSink
::afterGettingFrame1(unsigned frameSize, unsigned numTruncatedBytes,
		     struct timeval presentationTime,
		     unsigned durationInMicroseconds)
{
  // (simplified: the bookkeeping that computes numFrameBytesToUse is omitted)
  if (numFrameBytesToUse == 0 && frameSize > 0) 
  {
    // Send our packet now, because we have filled it up:
    sendPacketIfNecessary();
  } 
  else {
    // There's room for more frames; try getting another:
    packFrame();
  }
}

sendPacketIfNecessary() sends the data:


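void MultiFramedRTPSink::sendPacketIfNecessary() {
  if (fNumFramesUsedSoFar > 0) {
    // Send the assembled packet through the RTP interface:
    if (!fRTPInterface.sendPacket(fOutBuf->packet(), fOutBuf->curPacketSize())) {
      // (send-failure handling omitted)
    }
  }

  // Compute the delay until the next packet is due (uSecondsToGo; calculation omitted),
  // then schedule the next send:
  nextTask() = envir().taskScheduler().scheduleDelayedTask(uSecondsToGo, (TaskFunc*)sendNext, this);
}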

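As you can see, this function sends the RTP packet and schedules the next send.

That completes the first RTP packet. What follows is the ongoing send loop.

6. The ongoing send loop

Once sendNext has been scheduled, it waits to be triggered. The function is: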

void MultiFramedRTPSink::sendNext(void* firstArg) {
  MultiFramedRTPSink* sink = (MultiFramedRTPSink*)firstArg;
  sink->buildAndSendPacket(False);
}

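It calls buildAndSendPacket(); the False argument means this is not the first packet. From here the data is back in the sink's playing path described in section 2, and the cycle repeats for as long as playback continues.

OK. This walkthrough glosses over many details in order to describe how the data flows. If anything here is wrong, corrections are welcome.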