Stage3D Upload Speed Tester

本文通过实验对比了在Stage3D中不同数据类型从系统内存上传到显存的速度,包括纹理、顶点缓冲区及索引缓冲区等资源,并分析了软件渲染与硬件加速模式下的性能差异。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

from: http://jacksondunstan.com/articles/1617

 

Stage3D Upload Speed Tester

Since Flash Player 11′s new Stage3D allows us to utilize hardware-acceleration for 3D graphics, that entails a whole new set of performance we need to consider. Today’s article discusses the performance of uploading data from system memory (RAM) to video memory (VRAM), such as when you upload textures, vertex buffers, and index buffers. Is it faster to upload to one type rather than another? Is it faster to upload from a Vector, a ByteArray, or a BitmapData? Is there a significant speedup when using software rendering so that VRAM is the same as RAM? Find out the answers to all of these questions below.

 

 

 

The below performance test checks the upload speeds in both hardware and software mode of all of these types:

 

  • Texture from…
    • BitmapData
    • Vector
    • ByteArray
  • VertexBuffer3D from…
    • Vector
    • ByteArray
  • IndexBuffer3D from…
    • Vector
    • ByteArray

 

Check it out:

 

package
{
	import flash.display3D.*;
	import flash.display3D.textures.*;
	import flash.external.*;
	import flash.display.*;
	import flash.sampler.*;
	import flash.system.*;
	import flash.events.*;
	import flash.utils.*;
	import flash.text.*;
	import flash.geom.*;
 
	import com.adobe.utils.*;
 
	public class Stage3DUploadTester extends Sprite
	{
		private var __stage3D:Stage3D;
		private var __logger:TextField = new TextField();
		private var __context:Context3D;
		private var __driverInfo:String;
		private var __texture:Texture;
		private var __bmdNoAlpha:BitmapData;
		private var __bmdAlpha:BitmapData;
		private var __texBytes:ByteArray;
		private var __vertexBuffer:VertexBuffer3D;
		private var __vbVector:Vector.<Number>;
		private var __vbBytes:ByteArray;
		private var __indexBuffer:IndexBuffer3D;
		private var __ibVector:Vector.<uint>;
		private var __ibBytes:ByteArray;
 
		public function Stage3DUploadTester()
		{
			__stage3D = stage.stage3Ds[0];
 
			__logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(__logger);
 
			// Allocate texture data
			__bmdNoAlpha = new BitmapData(2048, 2048, false, 0xffffffff);
			__bmdAlpha = new BitmapData(2048, 2048, true, 0xffffffff);
			__texBytes = new ByteArray();
			var size:int = __texBytes.length = 2048*2048*4;
			for (var i:int; i < size; ++i)
			{
				__texBytes[i] = 0xffffffff;
			}
 
			// Allocate vertex buffer data
			size = 65535*64;
			__vbVector = new Vector.<Number>(size);
			for (i = 0; i < size; ++i)
			{
				__vbVector[i] = 1.0;
			}
			__vbBytes = new ByteArray();
			__vbBytes.length = size*4;
			for (i = 0; i < size; ++i)
			{
				__vbBytes.writeFloat(1.0);
			}
			__vbBytes.position = 0;
 
			// Allocate index buffer data
			size = 524287;
			__ibVector = new Vector.<uint>(size);
			for (i = 0; i < size; ++i)
			{
				__ibVector[i] = 1.0;
			}
			__ibBytes = new ByteArray();
			__ibBytes.length = size*4;
			for (i = 0; i < size; ++i)
			{
				__ibBytes.writeFloat(1.0);
			}
			__ibBytes.position = 0;
 
			setupContext(Context3DRenderMode.AUTO);
		}
 
		private function setupContext(renderMode:String): void
		{
			__stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			__stage3D.requestContext3D(renderMode);
		}
 
		private function onContextCreated(ev:Event): void
		{
			__stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
 
			var first:Boolean = __logger.text.length == 0;
			if (first)
			{
				__logger.appendText("Driver,Test,Time,Bytes/Sec\n");
			}
 
			const width:int = stage.stageWidth;
			const height:int = stage.stageHeight;
 
			__context = __stage3D.context3D;
			__context.configureBackBuffer(width, height, 0, true);
			__driverInfo = __context.driverInfo;
			__texture = __context.createTexture(
				2048,
				2048,
				Context3DTextureFormat.BGRA,
				false
			);
			__vertexBuffer = __context.createVertexBuffer(65535, 64);
			__indexBuffer = __context.createIndexBuffer(524287);
 
			runTests();
 
			if (first)
			{
				__context.dispose();
				setupContext(Context3DRenderMode.SOFTWARE);
			}
		}
 
		private function runTests(): void
		{
			var beforeTime:int;
			var afterTime:int;
			var time:int;
 
			beforeTime = getTimer();
			__texture.uploadFromBitmapData(__bmdNoAlpha);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("Texture from BitmapData w/o alpha", time, 2048*2048*4);
 
			beforeTime = getTimer();
			__texture.uploadFromBitmapData(__bmdAlpha);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("Texture from BitmapData w/ alpha", time, 2048*2048*4);
 
			beforeTime = getTimer();
			__texture.uploadFromByteArray(__texBytes, 0);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("Texture from ByteArray", time, 2048*2048*4);
 
			beforeTime = getTimer();
			__vertexBuffer.uploadFromVector(__vbVector, 0, 65535);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("VertexBuffer from Vector", time, 65535*64*4);
 
			beforeTime = getTimer();
			__vertexBuffer.uploadFromByteArray(__vbBytes, 0, 0, 65535);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("VertexBuffer from ByteArray", time, 65535*64*4);
 
			beforeTime = getTimer();
			__indexBuffer.uploadFromVector(__ibVector, 0, 524287);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("IndexBuffer from Vector", time, 524287*4);
 
			beforeTime = getTimer();
			__indexBuffer.uploadFromByteArray(__ibBytes, 0, 0, 524287);
			afterTime = getTimer();
			time = afterTime - beforeTime;
			row("IndexBuffer from ByteArray", time, 524287*4);
		}
 
		private function row(name:String, time:int, bytes:int): void
		{
			__logger.appendText(
				__driverInfo + ","
				+ name + ","
				+ time + ","
				+ (bytes/time).toFixed(2) + "\n"
			);
		}
	}
}

 

Try out the test

 

I ran this performance test with the following environment:

 

  • Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 11.0.1.152
  • 2.4 Ghz Intel Core i5
  • Mac OS X 10.7.2

And got these results

 

DriverTestTimeBytes/Sec
OpenGL (Direct blitting)Texture from BitmapData w/o alpha22762600.73
OpenGL (Direct blitting)Texture from BitmapData w/ alpha18932067.56
OpenGL (Direct blitting)Texture from ByteArray18932067.56
OpenGL (Direct blitting)VertexBuffer from Vector42399451.43
OpenGL (Direct blitting)VertexBuffer from ByteArray53355392.00
OpenGL (Direct blitting)IndexBuffer from Vector3699049.33
OpenGL (Direct blitting)IndexBuffer from ByteArray12097148.00
Software (Direct blitting)Texture from BitmapData w/o alpha121398101.33
Software (Direct blitting)Texture from BitmapData w/ alpha53355443.20
Software (Direct blitting)Texture from ByteArray53355443.20
Software (Direct blitting)VertexBuffer from Vector151118464.00
Software (Direct blitting)VertexBuffer from ByteArray53355392.00
Software (Direct blitting)IndexBuffer from Vector3699049.33
Software (Direct blitting)IndexBuffer from ByteArray21048574.00

 


 

There is a clear order of speed in all tests, regardless of hardware or software or type of GPU resource being uploaded to:

  1. ByteArray (fastest)
  2. Vector
  3. BitmapData (slowest)

Only the magnitude of the advantage changes with this. In particular, if you can manage to upload a vertex or index buffer from a ByteArray, you’re assured a huge performance win.

Uploading texture data seems much faster in software compared to hardware: a 3x improvement. As for vertex and index buffers, it’s more of a mixed bag. Software is faster when uploading vertex buffers from a Vector, hardware is faster when uploading index buffers from a ByteArray, and the rest are a tie. Vertex buffers are curiously quicker to upload than index buffers. The difference is more dramatic with software rendering (3x faster) than hardware rendering (50% faster).

More so than ever before in my performance articles is it important to keep in mind that the performance results posted above are valid only for the test environment that produced them. These numbers may change on Windows, which uses DirectX instead of OpenGL, or any of a number of mobile handsets using OpenGL ES.

 

 


 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值