Fundamentals of Embedded Video Processing (Part 2 of a 5-part series)
By David Katz and Rick Gentile, Analog Devices, Inc.
In this article, the second installment in a five-part series, we’ll discuss the basic concepts of digital video. Before we do this, however, we need to explain a few things about color spaces.
Color Spaces
There are many different ways of representing color, and each color system is suited for different purposes. The most fundamental representation is RGB color space.
RGB stands for “Red-Green-Blue,” and it is a color system commonly employed in camera sensors and computer graphics displays. As the three primary colors that sum to form white light, they can combine in proportion to create most any color in the visible spectrum. RGB is the basis for all other color spaces, and it is the overwhelming choice of color space for computer graphics.
Gamma Correction
“Gamma” is a crucial phenomenon to understand when dealing with color spaces. This term describes the nonlinear nature of luminance perception and display. Note that this is a twofold manifestation: the human eye perceives brightness in a nonlinear manner, and physical output devices (such as CRTs and LCDs) display brightness nonlinearly. It turns out, by way of coincidence, that human perception of luminance sensitivity is almost exactly the inverse of a CRT’s output characteristics.
Stated another way, luminance on a display is roughly proportional to the input analog signal voltage raised to the power of gamma. On a CRT or LCD display, this value ordinarily lies between 2.2 and 2.5. A camera’s precompensation, then, raises the RGB values to the power of (1/gamma).
The upshot of this effect is that video cameras and computer graphics routines, through a process called “gamma correction,” prewarp their RGB output stream both to compensate for the target display’s nonlinearity and to create a realistic model of how the eye actually views the scene. Figure 1 illustrates this process.
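The precompensation and display stages can be sketched numerically. This is a minimal illustration, assuming a gamma of 2.2 and channel values normalized to the range 0.0 to 1.0; the function names are chosen for this example only.

```python
# Sketch of a camera's gamma precompensation and a display's nonlinear
# response, assuming gamma = 2.2 and intensities normalized to [0.0, 1.0].
GAMMA = 2.2

def gamma_correct(linear: float) -> float:
    """Prewarp a linear scene intensity: output = input ** (1/gamma)."""
    return linear ** (1.0 / GAMMA)

def display_response(signal: float) -> float:
    """Model the display's nonlinearity: intensity = signal ** gamma."""
    return signal ** GAMMA

# The two nonlinearities cancel, so the displayed intensity tracks the
# original scene intensity approximately linearly.
scene = 0.5
displayed = display_response(gamma_correct(scene))
print(round(displayed, 6))  # ~0.5
```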
Gamma-corrected RGB coordinates are referred to as R’G’B’ space, and the luma value Y’ is derived from these coordinates. Strictly speaking, the term “luma” should only refer to this gamma-corrected luminance value, whereas the true “luminance” Y is a color science term formed from a weighted sum of R, G, and B (with no gamma correction applied).
Often when we talk about YCbCr and RGB color spaces in this series, we are referring to gamma-corrected components – in other words, Y’CbCr or R’G’B’. However, because this notation can be distracting and doesn’t affect the substance of our discussion, and since it’s clear that gamma correction needs to take place at sensor and/or display interfaces to a processor, we will confine ourselves to the YCbCr/RGB nomenclature even in cases where gamma adjustment has been applied. The exception to this convention is when we discuss actual color space conversion equations.
Figure 1: Gamma correction linearizes the intensity produced for a given input amplitude
(Figure labels: Linear input → Display causes nonlinear output intensity; Linear input → Gamma correction → Linear output intensity on display)
While RGB channel format is a natural scheme for representing real-world color, each of the three channels is highly correlated with the other two. You can see this by independently viewing the R, G, and B channels of a given image – you’ll be able to perceive the entire image in each channel. Also, RGB is not a preferred choice for image processing because changes to one channel must be performed in the other two channels as well, and each channel has equivalent bandwidth.
To reduce required transmission bandwidths and increase video compression ratios, other color spaces were devised that are highly uncorrelated, thus providing better compression characteristics than RGB does. The most popular ones – YPbPr, YCbCr, and YUV -- all separate a luminance component from two chrominance components. This separation is performed via scaled color difference factors (B’-Y’) and (R’-Y’). The Pb/Cb/U term corresponds to the (B’-Y’) factor, and the Pr/Cr/V term corresponds to the (R’-Y’) parameter. YPbPr is used in component analog video, YUV applies to composite NTSC and PAL systems, and YCbCr relates to component digital video.
Separating luminance and chrominance information saves image processing bandwidth. Also, as we’ll see shortly, we can reduce chrominance bandwidth considerably via subsampling, without much loss in visual perception. This is a welcome feature for video-intensive systems.
As an example of how to convert between color spaces, the following equations illustrate translation between 8-bit representations of Y’CbCr and R’G’B’ color spaces, where Y’, R’, G’ and B’ normally range from 16-235, and Cr and Cb range from 16-240.
Y' = 0.299R' + 0.587G' + 0.114B'
Cb = -0.168R' - 0.330G' + 0.498B' + 128
Cr = 0.498R' - 0.417G' - 0.081B' + 128
R' = Y' + 1.397(Cr - 128)
G' = Y' - 0.711(Cr - 128) - 0.343(Cb - 128)
B' = Y' + 1.765(Cb - 128)
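These equations translate directly into code. The sketch below uses exactly the coefficients given in the text; the clipping step is an addition of ours, since rounding can push reconstructed values slightly outside the 8-bit range.

```python
def rgb_to_ycbcr(r, g, b):
    """8-bit R'G'B' -> Y'CbCr, using the forward coefficients from the text."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168 * r - 0.330 * g + 0.498 * b + 128
    cr =  0.498 * r - 0.417 * g - 0.081 * b + 128
    return round(y), round(cb), round(cr)

def ycbcr_to_rgb(y, cb, cr):
    """Y'CbCr -> 8-bit R'G'B', using the inverse coefficients from the text."""
    r = y + 1.397 * (cr - 128)
    g = y - 0.711 * (cr - 128) - 0.343 * (cb - 128)
    b = y + 1.765 * (cb - 128)
    # Clip to the 8-bit range; rounding error can stray out of bounds.
    clip = lambda v: max(0, min(255, round(v)))
    return clip(r), clip(g), clip(b)

# A neutral gray round-trips exactly: equal R'G'B' values carry no color,
# so Cb and Cr both sit at 128 ("no color").
print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)
print(ycbcr_to_rgb(128, 128, 128))  # (128, 128, 128)
```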
Chroma subsampling
With many more rods than cones, the human eye is more attuned to brightness and less to color differences. As luck (or really, design) would have it, the YCbCr color system allows us to pay more attention to Y, and less to Cb and Cr. As a result, by subsampling these chroma values, video standards and compression algorithms can achieve large savings in video bandwidth.
Before discussing this further, let’s get some nomenclature straight. Before subsampling, let’s assume we have a full-bandwidth YCbCr stream. That is, a video source generates a stream of pixel components in the form of Figure 2a. This is called “4:4:4 YCbCr.” This notation looks rather odd, but the simple explanation is this: the first number is always ‘4’, corresponding historically to the ratio between the luma sampling frequency and the NTSC color subcarrier frequency. The second number corresponds to the ratio between luma and chroma within a given line (horizontally): if there’s no downsampling of chroma with respect to luma, this number is ‘4.’ The third number, if it’s the same as the second digit, implies no vertical subsampling of chroma. On the other hand, if it’s a 0, there is a 2:1 chroma subsampling between lines. Therefore, 4:4:4 implies that each pixel on every line has its own unique Y, Cr and Cb components.
Now, if we filter a 4:4:4 YCbCr signal by subsampling the chroma by a factor of 2 horizontally, we end up with 4:2:2 YCbCr. ‘4:2:2’ implies that there are 4 luma values for every 2 chroma values on a given video line. Each (Y,Cb) or (Y,Cr) pair represents one pixel value. Another way to say this is that a chroma pair coincides spatially with every other luma value, as shown in Figure 2b. Believe it or not, 4:2:2 YCbCr qualitatively shows little loss in image quality compared with its 4:4:4 YCbCr source, even though it represents a savings of 33% in bandwidth over 4:4:4 YCbCr. As we’ll discuss soon, 4:2:2 YCbCr is a foundation for the ITU-R BT.601 video recommendation, and it is the most common format for transferring digital video between subsystem components.
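The 4:4:4-to-4:2:2 decimation can be sketched as follows. This is a deliberately simplified illustration that just drops every other chroma pair; a real system would low-pass filter the chroma before decimating. The function name and the tuple-per-pixel input format are conventions of this example only.

```python
# Sketch: reduce one 4:4:4 video line to 4:2:2 by keeping the (Cb, Cr)
# pair of every other pixel. (Plain dropping shown for clarity; real
# hardware low-pass filters the chroma first.)
def to_422(line_444):
    """line_444: list of (Y, Cb, Cr) tuples for one line.
    Returns the interleaved 4:2:2 byte order: Cb, Y, Cr, Y, Cb, Y, Cr, Y, ..."""
    out = []
    for i, (y, cb, cr) in enumerate(line_444):
        if i % 2 == 0:
            out += [cb, y]        # chroma pair co-sited with even luma samples
            saved_cr = cr
        else:
            out += [saved_cr, y]  # Cr comes from the co-sited (even) pixel
    return out

line = [(10 * i, 100 + i, 200 + i) for i in range(4)]
print(to_422(line))  # [100, 0, 200, 10, 102, 20, 202, 30]
```

Note the byte count: the 4:2:2 line carries 2 bytes per pixel instead of 3, which is where the 33% bandwidth savings comes from.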
Figure 2: a) 4:4:4 vs. b) 4:2:2 YCbCr pixel sampling
(Figure labels: 4:4:4 sampling uses 3 bytes per pixel; 4:2:2 uses 2 bytes per pixel.)
Note that 4:2:2 is not the only chroma subsampling scheme. Figure 3 shows others in popular use. For instance, we could subsample the chroma of a 4:4:4 YCbCr stream by a factor of 4 horizontally, as shown in Figure 3c, to end up with a 4:1:1 YCbCr stream. Here, the chroma pairs are spatially coincident with every fourth luma value. This chroma filtering scheme results in a 50% bandwidth savings. 4:1:1 YCbCr is a popular format for inputs to video compression algorithms and outputs from video decompression algorithms.
Another format popular in video compression/uncompression is 4:2:0 YCbCr, and it’s more complex than the others we’ve described for a couple of reasons. For one, the Cb and Cr components are each subsampled by 2 horizontally and vertically. This means we have to store multiple video lines in order to generate this subsampled stream. What’s more, there are 2 popular formats for 4:2:0 YCbCr. MPEG-2 compression uses a horizontally co-located scheme (Figure 3d, top), whereas MPEG-1 and JPEG algorithms use a form where the chroma are centered between Y samples (Figure 3d, bottom).
Figure 3: (a) YCbCr 4:4:4 stream and its chroma-subsampled derivatives (b) 4:2:2 (c) 4:1:1 (d) 4:2:0
(Figure labels: Luma, Chroma, Total bytes; Luma + Chroma Component; Cosited; Interstitial. Figure note: progressive scan is depicted for clarity, with luma and chroma components each 1 byte wide; component labels correspond to serialization order, not spatial position.)
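The bandwidth cost of each scheme reduces to simple arithmetic: every pixel carries one byte of luma, plus two chroma bytes (one Cb, one Cr) shared among a group of pixels. A small sketch, with the sharing factors written as (horizontal, vertical):

```python
# Average bytes per pixel for each chroma subsampling scheme, assuming
# 8 bits (1 byte) per sample. Each entry gives how many pixels share one
# (Cb, Cr) pair: (horizontal sharing, vertical sharing).
schemes = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}

def bytes_per_pixel(h_share, v_share):
    """1 luma byte per pixel + 2 chroma bytes shared by h_share*v_share pixels."""
    return 1 + 2 / (h_share * v_share)

for name, (h, v) in schemes.items():
    print(name, bytes_per_pixel(h, v))
# 4:4:4 -> 3.0, 4:2:2 -> 2.0, 4:1:1 -> 1.5, 4:2:0 -> 1.5
```

This confirms the figures quoted in the text: 4:2:2 saves a third relative to 4:4:4, while 4:1:1 and 4:2:0 each save half, reaching the same 1.5 bytes/pixel by different horizontal/vertical routes.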
Digital Video
Before the mid-1990’s, nearly all video was in analog form. Only then did forces like the advent of MPEG-2 compression, proliferation of streaming media on the Internet, and the FCC’s adoption of a Digital Television (DTV) Standard create a “perfect storm” that brought the benefits of digital representation into the video world. These advantages over analog include better signal-to-noise performance, improved bandwidth utilization (fitting several digital video channels into each existing analog channel), and reduction in storage space through digital compression techniques.
At its root, digitizing video involves both sampling and quantizing the analog video signal. In the 2D context of a video frame, sampling entails dividing the image space, gridlike, into small regions and assigning relative amplitude values based on the intensities of color space components in each region. Note that analog video is already sampled vertically (discrete number of rows) and temporally (discrete number of frames per second).
Quantization is the process that determines the discrete amplitude values assigned during the sampling process. 8-bit video is common in consumer applications, where a value of 0 is darkest (total black) and 255 is brightest (white), for each color channel (R, G, B or YCbCr). However, it should be noted that 10-bit and 12-bit quantization per color channel is rapidly entering mainstream video products, allowing extra precision that can be useful in reducing received image noise by avoiding roundoff error.
The advent of digital video provided an excellent opportunity to standardize, to a large degree, the interfaces to NTSC and PAL systems. When the ITU (International Telecommunication Union) met to define recommendations for digital video standards, it focused on achieving a large degree of commonality between NTSC and PAL formats, such that the two could share the same coding formats.
They defined 2 separate recommendations – ITU-R BT.601 and ITU-R BT.656. Together, these two define a structure that enables different digital video system components to interoperate. Whereas BT.601 defines the parameters for digital video transfer, BT.656 defines the interface itself.
ITU-R BT.601 (formerly CCIR-601)
BT.601 specifies methods for digitally coding video signals, using the YCbCr color space for better use of channel bandwidth. It proposes 4:2:2 YCbCr as a preferred format for broadcast video. Synchronization signals (HSYNC, VSYNC, FIELD) and a clock are also provided to delineate the boundaries of active video regions. Figure 4 shows typical timing relationships between sync signals, clock and data.
Figure 4: Common Digital Video Format Timing
Each BT.601 pixel component (Y, Cr, or Cb) is quantized to either 8 or 10 bits, and both NTSC and PAL have 720 pixels of active video per line. However, they differ in their vertical resolution. While 30 frames/sec NTSC has 525 lines (including vertical blanking, or retrace, regions), the 25 frame/sec rate of PAL is accommodated by adding 100 extra lines, or 625 total, to the PAL frame.
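These line counts are exactly what lets one sampling clock serve both systems. BT.601 samples luma at 13.5 MHz (the 27 MHz clock mentioned later carries the 4:2:2 stream's interleaved chroma byte alongside each luma byte), and each line contains 858 (NTSC) or 864 (PAL) total luma samples, counting horizontal blanking. A quick arithmetic check:

```python
# Verify that NTSC and PAL share the same 13.5 MHz luma sampling rate:
# lines/frame * total samples/line (incl. blanking) * frames/second.
ntsc = 525 * 858 * (30 / 1.001)  # NTSC frame rate is 30/1.001 ~ 29.97 Hz
pal  = 625 * 864 * 25

print(ntsc, pal)  # both come to 13,500,000 samples/s = 13.5 MHz
```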
BT.601 specifies Y with a nominal range from 16 (total black) to 235 (total white). The color components Cb and Cr span from 16 to 240, but a value of 128 corresponds to no color. Sometimes, due to noise or rounding errors, a value might dip outside the nominal boundaries, but never all the way to 0 or 255.
ITU-R BT.656 (formerly CCIR-656)
Whereas BT.601 outlines how to digitally encode video, BT.656 actually defines the physical interfaces and data streams necessary to implement BT.601. It defines both bit-parallel and bit-serial modes. The bit-parallel mode requires only a 27 MHz clock (for NTSC 30 frames/sec) and 8 or 10 data lines (depending on the pixel resolution). All synchronization signals are embedded in the data stream, so no extra hardware lines are required.
The bit-serial mode requires only a multiplexed 10 bit/pixel serial data stream over a single channel, but it involves complex synchronization, spectral shaping and clock recovery conditioning. Furthermore, the bit clock rate runs close to 300 MHz, so it can be challenging to implement bit-serial BT.656 in many systems. For our purposes, we’ll focus our attention on the bit-parallel mode only.
The frame partitioning and data stream characteristics of ITU-R BT.656 are shown in Figures 5 and 6, respectively, for 525/60 (NTSC) and 625/50 (PAL) systems.
Figure 5: ITU-R BT.656 Frame Partitioning
Figure 6: ITU-R BT.656 Data Stream
In BT.656, the Horizontal (H), Vertical (V), and Field (F) signals are sent as an embedded part of the video data stream in a series of bytes that form a control word. The Start of Active Video (SAV) and End of Active Video (EAV) signals indicate the beginning and end of data elements to read in on each line. SAV occurs on a 1-to-0 transition of H, and EAV begins on a 0-to-1 transition of H. An entire field of video is comprised of Active Video + Horizontal Blanking (the space between an EAV and SAV code) and Vertical Blanking (the space where V = 1).
A field of video commences on a transition of the F bit. The “odd field” is denoted by a value of F = 0, whereas F = 1 denotes an even field. Progressive video makes no distinction between Field 1 and Field 2, whereas interlaced video requires each field to be handled uniquely, because alternate rows of each field combine to create the actual video image.
The SAV and EAV codes are shown in more detail in Figure 7. Note there is a defined preamble of three bytes (0xFF, 0x00, 0x00 for 8-bit video, or 0x3FF, 0x000, 0x000 for 10-bit video), followed by the XY Status word, which, aside from the F (Field), V (Vertical Blanking) and H (Horizontal Blanking) bits, contains four protection bits for single-bit error detection and correction. Note that F and V are only allowed to change as part of EAV sequences (that is, transitions from H = 0 to H = 1). Also, notice that for 10-bit video, the two additional bits are actually the least-significant bits, not the most-significant bits.
Figure 7: SAV/EAV Preamble codes
The bit definitions are as follows:
• F = 0 for Field 1
• F = 1 for Field 2
• V = 1 during Vertical Blanking
• V = 0 when not in Vertical Blanking
• H = 0 at SAV
• H = 1 at EAV
• P3 = V XOR H
• P2 = F XOR H
• P1 = F XOR V
• P0 = F XOR V XOR H
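Packing these bits into the XY status word can be sketched directly from the definitions above. In the 8-bit XY word, the most significant bit is fixed at 1, followed by F, V, H, then the four protection bits:

```python
# Sketch: build the BT.656 XY status word (8-bit video) from F, V, H.
# Bit layout: 1, F, V, H in the top nibble; P3..P0 in the bottom nibble.
def xy_status(f, v, h):
    p3 = v ^ h
    p2 = f ^ h
    p1 = f ^ v
    p0 = f ^ v ^ h
    return ((1 << 7) | (f << 6) | (v << 5) | (h << 4) |
            (p3 << 3) | (p2 << 2) | (p1 << 1) | p0)

# SAV for Field 1 active video (F=0, V=0, H=0) -> 0x80
# EAV for Field 1 active video (F=0, V=0, H=1) -> 0x9D
print(hex(xy_status(0, 0, 0)), hex(xy_status(0, 0, 1)))
```

A receiver can use the four protection bits to detect (and in some cases correct) a single-bit error in the status word before trusting the F/V/H flags.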
The vertical blanking interval (the time during which V=1) can be used to send non-video information, like audio, teletext, closed-captioning, or even data for interactive television applications. BT.656 accommodates this functionality through the use of ancillary data packets. Instead of the “0xFF, 0x00, 0x00” preamble that normally precedes control codes, the ancillary data packets all begin with a “0x00, 0xFF, 0xFF” preamble.
Assuming that ancillary data is not being sent, during horizontal and vertical blanking intervals the (Cb, Y, Cr, Y, Cb, Y, …) stream is (0x80, 0x10, 0x80, 0x10, 0x80, 0x10, …). Also, note that because the values 0x00 and 0xFF hold special significance as control preamble demarcators, they are not allowed as part of the active video stream. In 10-bit systems, the values (0x000 through 0x003) and (0x3FC through 0x3FF) are also reserved, so as not to cause problems in 8-bit implementations.
So that’s a wrap on our discussion of digital video concepts. In the next series installment, we’ll turn our focus to a systems view of video, covering how video streams enter and exit embedded systems.