Notes on Compressed Textures

License: CC 4.0 BY-SA


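This article looks at the principles and use of texture compression in Direct3D (D3D): the gain in storage efficiency, the characteristics of the different compression formats, and how formats can be mixed within a texture. It focuses on the DXT1 and DXT2-5 formats and on how more complex transparency is encoded in textures with an alpha channel. It also gives functions for obtaining the pitch and size of a compressed texture.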


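Textures are mapped onto models during rasterization, a process that consumes a large amount of system bus bandwidth and memory. To reduce the memory that textures occupy, Direct3D supports texture compression. Some devices support compressed textures natively. On such devices you can create a compressed surface, load data into it, and then use it like any uncompressed surface. Direct3D handles decompression when the texture is mapped onto the model.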

Storage Efficiency and Texture Compression

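The dimensions of a texture in any compressed format must be powers of two. The surface does not have to be square; it can be rectangular. The smallest unit of a compressed texture is a block of 4x4 texels, so at the smaller mipmap levels, where a dimension drops below 4 texels, part of each block is padding and some bytes are wasted. Overall, compression still saves a significant amount of memory.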
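The padding at small mip levels can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original article: the helper names `blocksAcross`, `levelBytes` and `paddedTexels` are hypothetical, and the bytes-per-block value is an assumption supplied by the caller (8 for DXT1, 16 for DXT2-5).

```cpp
#include <cstdint>

// Number of 4x4 blocks needed to cover `texels` texels along one axis.
// Partial blocks are padded, so the division rounds up.
constexpr std::uint32_t blocksAcross(std::uint32_t texels)
{
    return (texels + 3) / 4;
}

// Bytes occupied by one mip level of a block-compressed texture.
// bytesPerBlock is assumed to be 8 (DXT1) or 16 (DXT2-5).
constexpr std::uint32_t levelBytes(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bytesPerBlock)
{
    return blocksAcross(width) * blocksAcross(height) * bytesPerBlock;
}

// Texel slots that are stored in blocks but lie outside the level.
constexpr std::uint32_t paddedTexels(std::uint32_t width, std::uint32_t height)
{
    return blocksAcross(width) * 4 * blocksAcross(height) * 4 - width * height;
}
```

For a 512x128 DXT1 texture, `levelBytes(512, 128, 8)` gives 32768 bytes with no padding, while the 8x1 level still needs two whole blocks, so `paddedTexels(8, 1)` reports 24 unused texel slots out of 32.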

Mixing Formats Within a Single Texture

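In a compressed texture, the smallest unit of storage is a block of 4x4 texels (16 texels), and each block is either 64 bits or 128 bits in size. A 64-bit block means the texture uses DXT1; a 128-bit block means it uses one of DXT2-5. In DXT1, the opaque and 1-bit alpha encodings can be mixed within the same texture on a per-block basis. In a 128-bit block the alpha channel must be specified, either explicitly (DXT2 and DXT3) or by interpolation (DXT4 and DXT5). With interpolation there are two modes, an eight-alpha mode and a six-alpha mode; which one applies to a given block depends on how alpha_0 compares with alpha_1.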
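The relationship between format, block size and alpha encoding can be summarized in a small lookup. This is only an illustrative sketch: `fourCC`, `AlphaEncoding`, `BlockInfo` and `describeFormat` are hypothetical names introduced here, and the packing of the four characters follows the usual FOURCC convention (first character in the lowest byte).

```cpp
#include <cstdint>

// Pack four characters into a 32-bit code, first character in the low byte.
constexpr std::uint32_t fourCC(char a, char b, char c, char d)
{
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8
         | static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16
         | static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// Hypothetical classification of how a block stores transparency.
enum class AlphaEncoding { OpaqueOrOneBit, Explicit4Bit, Interpolated3Bit, Unsupported };

struct BlockInfo
{
    std::uint32_t bitsPerBlock;  // size of one 4x4 block
    AlphaEncoding alpha;         // how alpha is stored in the block
};

// Map a FOURCC code to the block properties described in the text.
inline BlockInfo describeFormat(std::uint32_t code)
{
    switch (code)
    {
    case fourCC('D', 'X', 'T', '1'): return {64, AlphaEncoding::OpaqueOrOneBit};
    case fourCC('D', 'X', 'T', '2'):
    case fourCC('D', 'X', 'T', '3'): return {128, AlphaEncoding::Explicit4Bit};
    case fourCC('D', 'X', 'T', '4'):
    case fourCC('D', 'X', 'T', '5'): return {128, AlphaEncoding::Interpolated3Bit};
    default:                         return {0, AlphaEncoding::Unsupported};
    }
}
```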

The individual compression formats are described in detail below.

Opaque and 1-Bit Alpha Textures (Direct3D 9)

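A DXT1 texture holds either opaque colors or colors with a single bit of transparency. Each block, whether opaque or 1-bit alpha, is stored as two 16-bit values (RGB 5:6:5) and a 4x4 bitmap with 2 bits per texel. The block covers 16 texels in 64 bits, which works out to 4 bits per texel. In the bitmap, the 2-bit code of each texel selects one of four colors. color_0 and color_1 are stored in the two 16-bit values mentioned above, while color_2 and color_3 are derived by interpolation.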

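The 1-bit alpha encoding is distinguished from the opaque encoding by comparing color_0 and color_1 as unsigned 16-bit integers. If color_0 is greater than color_1, the block is opaque; otherwise (color_0 less than or equal to color_1) the block uses 1-bit alpha.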

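In the opaque encoding, four colors are available to represent the texels; this is analogous to RGB 5:6:5. In the 1-bit alpha encoding, only three colors are available and the fourth code marks a transparent texel; this is analogous to RGBA 5:5:5:1, where the final bit serves as the alpha mask. The following code illustrates the algorithm: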

// color_0 and color_1 are compared as unsigned 16-bit values;
// the interpolation below is applied to each color channel.
if (color_0 > color_1)
{
    // Four-color block: derive the other two colors.
    // 00 = color_0, 01 = color_1, 10 = color_2, 11 = color_3
    // These 2-bit codes correspond to the 2-bit fields
    // stored in the 64-bit block.
    color_2 = (2 * color_0 + color_1 + 1) / 3;
    color_3 = (color_0 + 2 * color_1 + 1) / 3;
}
else
{
    // Three-color block: derive the other color.
    // 00 = color_0,  01 = color_1,  10 = color_2,
    // 11 = transparent.
    // These 2-bit codes correspond to the 2-bit fields
    // stored in the 64-bit block.
    color_2 = (color_0 + color_1) / 2;
    color_3 = transparent;
}

It is recommended to set the RGBA components of transparent texels to zero before blending.
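The pseudocode above treats color_0 and color_1 as abstract colors. One way to turn it into compilable C++ is sketched here, under stated assumptions: each 5:6:5 value is first expanded to 8 bits per channel by bit replication, and the interpolation is then done per channel on the expanded values. The names `Rgba8`, `expand565`, `blend`, `blendColor` and `dxt1Palette` are hypothetical, and real hardware may round differently.

```cpp
#include <array>
#include <cstdint>

struct Rgba8 { std::uint8_t r, g, b, a; };

// Expand packed RGB 5:6:5 (red in bits 15:11, green 10:5, blue 4:0)
// to 8 bits per channel by replicating the high bits (an assumption).
inline Rgba8 expand565(std::uint16_t c)
{
    const unsigned r5 = (c >> 11) & 0x1F;
    const unsigned g6 = (c >> 5) & 0x3F;
    const unsigned b5 = c & 0x1F;
    return { static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
             static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
             static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)),
             255 };
}

// Weighted blend of two channel values: (wa*a + wb*b + bias) / (wa + wb).
inline std::uint8_t blend(unsigned a, unsigned wa, unsigned b, unsigned wb, unsigned bias)
{
    return static_cast<std::uint8_t>((wa * a + wb * b + bias) / (wa + wb));
}

inline Rgba8 blendColor(const Rgba8& x, unsigned wx, const Rgba8& y, unsigned wy, unsigned bias)
{
    return { blend(x.r, wx, y.r, wy, bias),
             blend(x.g, wx, y.g, wy, bias),
             blend(x.b, wx, y.b, wy, bias),
             255 };
}

// Build the four palette entries selected by the 2-bit codes 00..11.
inline std::array<Rgba8, 4> dxt1Palette(std::uint16_t color0, std::uint16_t color1)
{
    const Rgba8 c0 = expand565(color0);
    const Rgba8 c1 = expand565(color1);
    std::array<Rgba8, 4> palette{};
    palette[0] = c0;
    palette[1] = c1;
    if (color0 > color1) {
        // Four-color (opaque) block.
        palette[2] = blendColor(c0, 2, c1, 1, 1);
        palette[3] = blendColor(c0, 1, c1, 2, 1);
    } else {
        // Three-color block: code 11 is transparent, with RGBA all zero.
        palette[2] = blendColor(c0, 1, c1, 1, 0);
        palette[3] = Rgba8{0, 0, 0, 0};
    }
    return palette;
}
```

Swapping the two stored colors is what switches a block between the two modes: `dxt1Palette(0xFFFF, 0x0000)` yields four opaque entries, while `dxt1Palette(0x0000, 0xFFFF)` yields three colors plus a transparent entry.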

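The tables below show the memory layout of the 8-byte block. The first index is taken to be the y coordinate and the second the x coordinate. For example, Texel[1][2] refers to the texel at (x,y) = (2,1).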

This table contains the memory layout for the 8-byte (64-bit) block.

Word address	16-bit word
0	Color_0
1	Color_1
2	Bitmap Word_0
3	Bitmap Word_1

Color_0 and Color_1, the colors at the two extremes, are laid out as follows:

Bits	Color
4:0 (LSB*)	Blue color component
10:5	Green color component
15:11	Red color component

*LSB: least-significant bit

Bitmap Word_0 is laid out as follows:

Bits	Texel
1:0 (LSB)	Texel[0][0]
3:2	Texel[0][1]
5:4	Texel[0][2]
7:6	Texel[0][3]
9:8	Texel[1][0]
11:10	Texel[1][1]
13:12	Texel[1][2]
15:14 (MSB*)	Texel[1][3]

*MSB: most-significant bit

Bitmap Word_1 is laid out as follows:

Bits	Texel
1:0 (LSB)	Texel[2][0]
3:2	Texel[2][1]
5:4	Texel[2][2]
7:6	Texel[2][3]
9:8	Texel[3][0]
11:10	Texel[3][1]
13:12	Texel[3][2]
15:14 (MSB)	Texel[3][3]
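The layout in these tables can be read back with a few shifts and masks. The sketch below is illustrative only: `readWord`, `Dxt1Block` and `unpackDxt1` are hypothetical names, and it assumes the four 16-bit words are stored little-endian, as they would be when read from a file on x86.

```cpp
#include <cstdint>

// Read 16-bit word number `index` from a block, assuming little-endian storage.
inline std::uint16_t readWord(const std::uint8_t* block, unsigned index)
{
    return static_cast<std::uint16_t>(block[2 * index] | (block[2 * index + 1] << 8));
}

struct Dxt1Block
{
    std::uint16_t color0;     // word 0
    std::uint16_t color1;     // word 1
    std::uint8_t  code[4][4]; // 2-bit code per texel, indexed [y][x]
};

// Split an 8-byte DXT1 block into its two colors and sixteen 2-bit codes.
inline Dxt1Block unpackDxt1(const std::uint8_t* block)
{
    Dxt1Block out{};
    out.color0 = readWord(block, 0);
    out.color1 = readWord(block, 1);
    for (unsigned y = 0; y < 4; ++y) {
        // Rows 0-1 live in Bitmap Word_0 (word 2), rows 2-3 in Bitmap Word_1 (word 3).
        const std::uint16_t word = readWord(block, 2 + y / 2);
        for (unsigned x = 0; x < 4; ++x) {
            const unsigned shift = ((y % 2) * 4 + x) * 2;
            out.code[y][x] = static_cast<std::uint8_t>((word >> shift) & 0x3);
        }
    }
    return out;
}
```

Each code then selects one of the four colors derived from color_0 and color_1 as described earlier.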

Textures with Alpha Channels (Direct3D 9)

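There are two ways to encode texture maps that carry more complex transparency. In both, an additional 64-bit block describing the transparency is placed before the 64-bit color block described above. The transparency is stored either explicitly, as a 4x4 bitmap with 4 bits per texel, or with fewer bits per texel combined with linear interpolation.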

The transparency block and the color block described above are arranged in memory as follows:

Word address	64-bit block
3:0	Transparency block
7:4	Previously described 64-bit block

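Explicit alpha encoding is described first.

With explicit encoding (the DXT2 and DXT3 formats), the alpha components of the texels, which describe their transparency, are stored in a 4x4 bitmap with 4 bits per texel.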

The following tables illustrate how the alpha information is laid out in memory, for each 16-bit word (LSB: least-significant bit, MSB: most-significant bit).

This table contains the layout for word 0.

Bits	Alpha
3:0 (LSB)	[0][0]
7:4	[0][1]
11:8	[0][2]
15:12 (MSB)	[0][3]

This table contains the layout for word 1.

Bits	Alpha
3:0 (LSB)	[1][0]
7:4	[1][1]
11:8	[1][2]
15:12 (MSB)	[1][3]

This table contains the layout for word 2.

Bits	Alpha
3:0 (LSB)	[2][0]
7:4	[2][1]
11:8	[2][2]
15:12 (MSB)	[2][3]

This table contains the layout for word 3.

Bits	Alpha
3:0 (LSB)	[3][0]
7:4	[3][1]
11:8	[3][2]
15:12 (MSB)	[3][3]
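Reading an explicit alpha value back out of these four words is a matter of selecting the word for the row and the nibble for the column. The following sketch assumes little-endian 16-bit words; `explicitAlpha4` and `expandAlpha4` are hypothetical helper names, and widening the 4-bit value to 8 bits by multiplying by 17 is one common choice rather than something the format description above prescribes.

```cpp
#include <cstdint>

// Return the 4-bit alpha of texel [y][x] from the 8-byte transparency block.
// Word y holds row y; bits 4x+3 .. 4x of that word hold column x.
inline std::uint8_t explicitAlpha4(const std::uint8_t* alphaBlock, unsigned x, unsigned y)
{
    const unsigned word = alphaBlock[2 * y] | (alphaBlock[2 * y + 1] << 8);
    return static_cast<std::uint8_t>((word >> (4 * x)) & 0xF);
}

// Widen a 4-bit alpha to 8 bits so that 0 maps to 0 and 15 maps to 255.
inline std::uint8_t expandAlpha4(std::uint8_t alpha4)
{
    return static_cast<std::uint8_t>(alpha4 * 17);
}
```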

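DXT2 and DXT3 differ in that DXT2 assumes the color data has already been premultiplied by alpha, whereas DXT3 assumes it has not. Both formats are needed because the two cases cannot be told apart from the data alone when the texture is used, so each has its own FOURCC code. The data layout and the interpolation method are identical for the two. The color comparison used in DXT1 no longer applies in these formats: the color block is always treated as a four-color block, as if Color_0 > Color_1.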
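To make "premultiplied" concrete: each color channel is scaled by the texel's alpha before the texture is compressed. A minimal sketch follows, assuming 8-bit channels and round-to-nearest; `Color8`, `scaleByAlpha` and `premultiply` are hypothetical names and not part of any Direct3D API.

```cpp
#include <cstdint>

struct Color8 { std::uint8_t r, g, b, a; };

// Scale one 8-bit channel by an 8-bit alpha, rounding to nearest.
inline std::uint8_t scaleByAlpha(std::uint8_t channel, std::uint8_t alpha)
{
    return static_cast<std::uint8_t>((channel * alpha + 127) / 255);
}

// Convert a straight-alpha color to its premultiplied form.
// Alpha itself is left unchanged.
inline Color8 premultiply(Color8 c)
{
    return { scaleByAlpha(c.r, c.a), scaleByAlpha(c.g, c.a), scaleByAlpha(c.b, c.a), c.a };
}
```

Because the stored bytes look the same either way, an application has to rely on the format code to know whether this step was applied.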

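Next is the 3-bit linear alpha interpolation.

This is the encoding used by the DXT4 and DXT5 formats. In both, two 8-bit alpha values and a 4x4 bitmap with 3 bits per texel are stored in the first 8-byte block. If alpha_0 is greater than alpha_1, six interpolated alpha values are derived; otherwise four interpolated values are derived, and the remaining two codes are fixed at 0 and 255 to represent fully transparent and fully opaque. The following code illustrates this: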

// 8-alpha or 6-alpha block?    
if (alpha_0 > alpha_1) {    
    // 8-alpha block:  derive the other six alphas.    
    // Bit code 000 = alpha_0, 001 = alpha_1, others are interpolated.
    alpha_2 = (6 * alpha_0 + 1 * alpha_1 + 3) / 7;    // bit code 010
    alpha_3 = (5 * alpha_0 + 2 * alpha_1 + 3) / 7;    // bit code 011
    alpha_4 = (4 * alpha_0 + 3 * alpha_1 + 3) / 7;    // bit code 100
    alpha_5 = (3 * alpha_0 + 4 * alpha_1 + 3) / 7;    // bit code 101
    alpha_6 = (2 * alpha_0 + 5 * alpha_1 + 3) / 7;    // bit code 110
    alpha_7 = (1 * alpha_0 + 6 * alpha_1 + 3) / 7;    // bit code 111  
}    
else {  
    // 6-alpha block.    
    // Bit code 000 = alpha_0, 001 = alpha_1, others are interpolated.
    alpha_2 = (4 * alpha_0 + 1 * alpha_1 + 2) / 5;    // Bit code 010
    alpha_3 = (3 * alpha_0 + 2 * alpha_1 + 2) / 5;    // Bit code 011
    alpha_4 = (2 * alpha_0 + 3 * alpha_1 + 2) / 5;    // Bit code 100
    alpha_5 = (1 * alpha_0 + 4 * alpha_1 + 2) / 5;    // Bit code 101
    alpha_6 = 0;                                      // Bit code 110
    alpha_7 = 255;                                    // Bit code 111
}

The memory layout of the alpha block is as follows:

Byte	Alpha
0	Alpha_0
1	Alpha_1
2	[0][2] (2 MSBs), [0][1], [0][0]
3	[1][1] (1 MSB), [1][0], [0][3], [0][2] (1 LSB)
4	[1][3], [1][2], [1][1] (2 LSBs)
5	[2][2] (2 MSBs), [2][1], [2][0]
6	[3][1] (1 MSB), [3][0], [2][3], [2][2] (1 LSB)
7	[3][3], [3][2], [3][1] (2 LSBs)
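The byte layout above amounts to a 48-bit field that packs sixteen 3-bit codes in texel order, which is why some codes straddle two bytes. The sketch below rests on that reading: bytes 2 to 7 are treated as a little-endian 48-bit integer, with the code for texel [y][x] starting at bit 3 * (4 * y + x), so the "(2 MSBs)" entries in the table are taken to mean the top two bits of the byte. `dxt5AlphaPalette` and `dxt5AlphaCode` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Build the eight alpha values selected by the 3-bit codes 000..111.
inline std::array<std::uint8_t, 8> dxt5AlphaPalette(std::uint8_t a0, std::uint8_t a1)
{
    std::array<std::uint8_t, 8> p{};
    p[0] = a0;
    p[1] = a1;
    if (a0 > a1) {
        // 8-alpha block: six interpolated values.
        for (unsigned i = 1; i <= 6; ++i)
            p[i + 1] = static_cast<std::uint8_t>(((7 - i) * a0 + i * a1 + 3) / 7);
    } else {
        // 6-alpha block: four interpolated values, then 0 and 255.
        for (unsigned i = 1; i <= 4; ++i)
            p[i + 1] = static_cast<std::uint8_t>(((5 - i) * a0 + i * a1 + 2) / 5);
        p[6] = 0;
        p[7] = 255;
    }
    return p;
}

// Return the 3-bit code of texel [y][x] from the 8-byte alpha block.
inline unsigned dxt5AlphaCode(const std::uint8_t* alphaBlock, unsigned x, unsigned y)
{
    std::uint64_t bits = 0;
    for (unsigned i = 0; i < 6; ++i)
        bits |= static_cast<std::uint64_t>(alphaBlock[2 + i]) << (8 * i);
    return static_cast<unsigned>((bits >> (3 * (4 * y + x))) & 0x7);
}
```

The alpha of a texel is then `dxt5AlphaPalette(block[0], block[1])[dxt5AlphaCode(block, x, y)]`.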

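DXT4 and DXT5 differ in the same way, by whether the color data is premultiplied by alpha (DXT4 is premultiplied, DXT5 is not), so the details are not repeated here.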

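The following functions return the pitch and size of compressed texture data:

Pay particular attention to the pitch. For a compressed texture, the first function below returns a pitch that matches the one obtained by creating the texture with D3D and calling LockRect. Note that this pitch is no longer the number of bytes in one row of texels; it is the number of bytes in one row of 4x4 blocks. For that reason, when uploading the texture to video memory it is best not to step through the data with this pitch. Instead, call memcpy directly with the size of the current level, as returned by the second function below.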

U32 DDSFile::getPitch( U32 mipLevel ) const
{
   if(mFlags.test(CompressedData))
   {
      U32 sizeMultiple = 0;

      switch(mFormat)
      {
      case GFXFormatDXT1:
         sizeMultiple = 8;
         break;
      case GFXFormatDXT2:
      case GFXFormatDXT3:
      case GFXFormatDXT4:
      case GFXFormatDXT5:
         sizeMultiple = 16;
         break;
      default:
         AssertISV(false, "DDSFile::getPitch - invalid compressed texture format, we only support DXT1-5 right now.");
         break;
      }

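      // Bytes in one row of 4x4 blocks.
      // Maybe need to be DWORD aligned? For DXT1-5 sizeMultiple is 8 or 16,
      // so the round-up to a multiple of 4 below does not change the value.
      U32 align = getMax(U32(1), getWidth(mipLevel)/4) * sizeMultiple;
      align += 3; align >>=2; align <<=2;
      return align;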

   }
   else
      return getWidth(mipLevel) * mBytesPerPixel;
}

U32 DDSFile::getSurfaceSize( U32 height, U32 width, U32 mipLevel ) const
{
   // Bump by the mip level.
   height = getMax(U32(1), height >> mipLevel);
   width = getMax(U32(1), width >> mipLevel);

   if(mFlags.test(CompressedData))
   {
      // From the DirectX docs:
      // max(1, width / 4) x max(1, height / 4) x 8 (DXT1) or 16 (DXT2-5)

      U32 sizeMultiple = 0;

      switch(mFormat)
      {
      case GFXFormatDXT1:
         sizeMultiple = 8;
         break;
      case GFXFormatDXT2:
      case GFXFormatDXT3:
      case GFXFormatDXT4:
      case GFXFormatDXT5:
         sizeMultiple = 16;
         break;
      default:
         AssertISV(false, "DDSFile::getSurfaceSize - invalid compressed texture format, we only support DXT1-5 right now.");
         break;
      }

      return getMax(U32(1), width/4) * getMax(U32(1), height/4) * sizeMultiple;
   }
   else
   {
      return height * width* mBytesPerPixel;
   }
}

U32 DDSFile::getSizeInBytes() const
{
   // TODO: This doesn't take mDepth into account, so
   // it doesn't work right for volume textures!

   U32 bytes = 0;
   for ( U32 i=0; i < mMipMapCount; i++ )
      bytes += getSurfaceSize( mHeight, mWidth, i );

   return bytes;
}
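The same arithmetic can be tried outside the DDSFile class. The sketch below is a standalone approximation, not the engine's code: `mipDim`, `blocksAcross`, `blockRowPitch`, `levelSize` and `copyLevel` are hypothetical names, bytesPerBlock is assumed to be 8 for DXT1 and 16 for DXT2-5, and block counts are rounded up, which gives the same result as max(1, n / 4) for power-of-two sizes. `copyLevel` shows why the pitch has to be applied per row of blocks rather than per row of texels.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Dimension of a mip level, never smaller than one texel.
constexpr std::uint32_t mipDim(std::uint32_t base, std::uint32_t level)
{
    return (base >> level) > 0 ? (base >> level) : 1;
}

// Number of 4x4 blocks covering `texels` texels along one axis.
constexpr std::uint32_t blocksAcross(std::uint32_t texels)
{
    return (texels + 3) / 4;
}

// Bytes in one row of blocks (the "pitch" of a compressed level).
constexpr std::uint32_t blockRowPitch(std::uint32_t width, std::uint32_t bytesPerBlock)
{
    return blocksAcross(width) * bytesPerBlock;
}

// Total bytes in one tightly packed compressed level.
constexpr std::uint32_t levelSize(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t bytesPerBlock)
{
    return blockRowPitch(width, bytesPerBlock) * blocksAcross(height);
}

// Copy a tightly packed level into a destination whose rows of blocks
// start dstPitch bytes apart. One memcpy suffices when the pitches match.
inline void copyLevel(std::uint8_t* dst, std::uint32_t dstPitch, const std::uint8_t* src,
                      std::uint32_t width, std::uint32_t height, std::uint32_t bytesPerBlock)
{
    const std::uint32_t srcPitch = blockRowPitch(width, bytesPerBlock);
    const std::uint32_t blockRows = blocksAcross(height);
    if (dstPitch == srcPitch) {
        std::memcpy(dst, src, static_cast<std::size_t>(srcPitch) * blockRows);
        return;
    }
    for (std::uint32_t row = 0; row < blockRows; ++row)
        std::memcpy(dst + static_cast<std::size_t>(row) * dstPitch,
                    src + static_cast<std::size_t>(row) * srcPitch, srcPitch);
}
```

For a 256x256 DXT1 texture, `blockRowPitch(256, 8)` is 512 bytes even though a row of texels would nominally take only 128 bytes at 4 bits per texel, and `levelSize(256, 256, 8)` is 32768 bytes.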