Tutorial 06: Synching Audio

This article takes a detailed look at how synchronization is implemented in an audio/video player: how an internal clock tracks video playback progress, how the audio stream is adjusted to stay in sync with the video, and how the clock source can be abstracted to support several synchronization modes.


Synching Audio

So now we have a decent enough player to watch a movie, so let's see what kind of loose ends we have lying around. 

Last time, we glossed over synchronization a little bit, namely synchronizing audio to a video clock rather than the other way around. 

We're going to do this the same way as with the video: make an internal video clock to keep track of how far along the video thread is and sync the audio to that. 

Later we'll look at how to generalize things to sync both audio and video to an external clock, too.


Implementing the video clock

Now we want to implement a video clock similar to the audio clock we had last time: an internal value that gives the current time offset of the video currently being played. 
At first, you would think that this would be as simple as updating the timer with the current PTS of the last frame to be shown. 

However, don't forget that the time between video frames can be pretty long when we get down to the millisecond level. 

The solution is to keep track of another value, the time at which we set the video clock to the PTS of the last frame. 

That way the current value of the video clock will be PTS_of_last_frame + (current_time - time_at_which_PTS_was_set), i.e. the PTS of the last frame plus however much time has elapsed since we stored it.

This solution is very similar to what we did with get_audio_clock.

So, in our big struct, we're going to put a double video_current_pts and an int64_t video_current_pts_time.
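
For orientation, here is a minimal sketch of where those two fields might sit in the VideoState struct; the surrounding fields are the ones from the previous tutorials, and the comments are ours:

typedef struct VideoState {
  /* ... everything from the previous tutorials ... */
  double  video_current_pts;      /* PTS of the frame being shown */
  int64_t video_current_pts_time; /* av_gettime() value when we stored it */
  /* ... */
} VideoState;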

The clock updating is going to take place in the video_refresh_timer function:

void video_refresh_timer(void *userdata) {

  /* ... */

  if(is->video_st) {
    if(is->pictq_size == 0) {
      schedule_refresh(is, 1);
    } else {
      vp = &is->pictq[is->pictq_rindex];

      is->video_current_pts = vp->pts;
      is->video_current_pts_time = av_gettime();

      /* ... timing and display code from the last tutorial ... */
    }
  }
  /* ... */
}

Don't forget to initialize it in stream_component_open:

is->video_current_pts_time = av_gettime();

And now all we need is a way to get the information:

double get_video_clock(VideoState *is) {
  double delta;

  delta = (av_gettime() - is->video_current_pts_time) / 1000000.0;
  return is->video_current_pts + delta;
}




Abstracting the clock

But why force ourselves to use the video clock? We'd have to go and alter our video sync code so that the audio and video aren't trying to sync to each other. Imagine the mess if we tried to make it a command line option like it is in ffplay. So let's abstract things: we're going to make a new wrapper function, get_master_clock, that checks an av_sync_type variable and then calls get_audio_clock, get_video_clock, or whatever other clock we want to use. We could even use the computer clock, which we'll call get_external_clock:

enum {
  AV_SYNC_AUDIO_MASTER,
  AV_SYNC_VIDEO_MASTER,
  AV_SYNC_EXTERNAL_MASTER,
};

#define DEFAULT_AV_SYNC_TYPE AV_SYNC_VIDEO_MASTER

double get_master_clock(VideoState *is) {
  if(is->av_sync_type == AV_SYNC_VIDEO_MASTER) {
    return get_video_clock(is);
  } else if(is->av_sync_type == AV_SYNC_AUDIO_MASTER) {
    return get_audio_clock(is);
  } else {
    return get_external_clock(is);
  }
}

main() {
...
  is->av_sync_type = DEFAULT_AV_SYNC_TYPE;
...
}
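
The get_external_clock branch above isn't defined anywhere in this excerpt. A minimal sketch, assuming we just read the computer clock through the same microsecond-resolution av_gettime() used by the other clocks:

double get_external_clock(VideoState *is) {
  /* the computer clock, converted to seconds */
  return av_gettime() / 1000000.0;
}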

Synchronizing the Audio

Now the hard part: synching the audio to the video clock.

Our strategy is going to be to measure where the audio is, compare it to the video clock, and then figure out how many samples we need to adjust by; that is, do we need to speed up by dropping samples or do we need to slow down by adding them?
We're going to run a synchronize_audio function each time we process each set of audio samples we get to shrink or expand them properly. 

However, we don't want to sync every single time it's off, because we process audio packets a lot more often than video packets. 

So we're going to set a minimum number of consecutive calls to the synchronize_audio function that have to be out of sync before we bother doing anything. 

Of course, just like last time, "out of sync" means that the audio clock and the video clock differ by more than our sync threshold.

So now let's say we've gotten N audio sample sets that have been out of sync. 

The amount we are out of sync can also vary a good deal, so we're going to take an average of how far each of those has been out of sync. 

So for example, the first call might have shown we were out of sync by 40ms, the next by 50ms, and so on. 

But we're not going to take a simple average because the most recent values are more important than the previous ones. 

So we're going to use a fractional coefficient, say c, and sum the differences like this: diff_sum = new_diff + diff_sum*c

When we are ready to find the average difference, we simply calculate avg_diff = diff_sum * (1-c). This works because the weights the successive diffs receive (1, c, c^2, ...) sum to 1/(1-c), so multiplying the weighted sum by (1-c) normalizes it back into an average.
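
To put numbers on it: with c = 0.9 and a steady per-call error of 45ms, diff_sum converges toward 0.045 / (1 - 0.9) = 0.45, and avg_diff = 0.45 * (1 - 0.9) = 0.045, exactly the steady-state error. When only a few diffs have been accumulated, the weighted sum is still far below that limit, which is why the code below waits until AUDIO_DIFF_AVG_NB measurements have been collected before it acts on avg_diff.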

Here's what our function looks like so far:

/* Add or subtract samples to get a better sync, return new
   audio buffer size */
int synchronize_audio(VideoState *is, short *samples,
                      int samples_size, double pts) {
  int n;
  double ref_clock;

  n = 2 * is->audio_st->codec->channels;

  if(is->av_sync_type != AV_SYNC_AUDIO_MASTER) {
    double diff, avg_diff;
    int wanted_size, min_size, max_size, nb_samples;

    ref_clock = get_master_clock(is);
    diff = get_audio_clock(is) - ref_clock;

    if(fabs(diff) < AV_NOSYNC_THRESHOLD) {
      // accumulate the diffs
      is->audio_diff_cum = diff + is->audio_diff_avg_coef
        * is->audio_diff_cum;
      if(is->audio_diff_avg_count < AUDIO_DIFF_AVG_NB) {
        is->audio_diff_avg_count++;
      } else {
        avg_diff = is->audio_diff_cum * (1.0 - is->audio_diff_avg_coef);

        /* Shrinking/expanding buffer code.... */

      }
    } else {
      /* difference is TOO big; reset diff stuff */
      is->audio_diff_avg_count = 0;
      is->audio_diff_cum = 0;
    }
  }
  return samples_size;
}
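
synchronize_audio relies on a few bookkeeping fields (audio_diff_cum, audio_diff_avg_coef, audio_diff_avg_count, audio_diff_threshold) that have to be initialized when the audio stream is opened, in stream_component_open. A minimal sketch of that initialization, borrowing ffplay's choice of coefficient (treat the exact constants as assumptions rather than requirements):

/* weight the running sum so that the oldest of the last
   AUDIO_DIFF_AVG_NB diffs contributes only about 1% */
is->audio_diff_avg_coef = exp(log(0.01) / AUDIO_DIFF_AVG_NB);
is->audio_diff_avg_count = 0;
is->audio_diff_cum = 0;
/* only correct when the average error exceeds roughly one
   audio callback's worth of data, measured in seconds */
is->audio_diff_threshold = 2.0 * SDL_AUDIO_BUFFER_SIZE /
                           codecCtx->sample_rate;

Here AUDIO_DIFF_AVG_NB would be on the order of 20 samples.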
So we're doing pretty well; we know approximately how far off the audio is from the video, or whatever we're using for a clock. 

So let's now calculate how many samples we need to add or lop off by putting this code where the "Shrinking/expanding buffer code" section is:

if(fabs(avg_diff) >= is->audio_diff_threshold) {
  wanted_size = samples_size +
    ((int)(diff * is->audio_st->codec->sample_rate) * n);
  /* divide last so that integer arithmetic doesn't truncate
     (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100 to zero */
  min_size = samples_size * (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100;
  max_size = samples_size * (100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100;
  if(wanted_size < min_size) {
    wanted_size = min_size;
  } else if (wanted_size > max_size) {
    wanted_size = max_size;
  }

Remember that audio_length * (sample_rate * # of channels * 2) is the number of bytes in audio_length seconds of audio.
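
As a quick sanity check on the wanted_size line above: at 44100 Hz with 2 channels of 16-bit samples (so n = 4), a drift of diff = 0.04 seconds corresponds to (int)(0.04 * 44100) * 4 = 7056 bytes of correction.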

Therefore, the number of samples we want is going to be the number of samples we already have, plus or minus the number of samples that correspond to the amount of time the audio has drifted. 

We'll also set a limit on how big or small our correction can be because if we change our buffer too much, it'll be too jarring to the user.

Correcting the number of samples








