A GDC Talk a Day: Optimized Stenciled Shadow Volumes

Permalink: https://blog.youkuaiyun.com/kevin_dust/article/details/51903043

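Figures will be added later.
Optimized Stenciled Shadow Volumes
First, what is this about?
Shadow Volume is a shadow-generation technique, an alternative to shadow mapping.
It has two steps:
1. Generate the shadow volumes
2. Render the shadows

To explain the principle of shadow volumes briefly, we start with how the method renders shadows.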

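I. Shadow rendering (ZPass and ZFail)
1. ZPass:
The method is simple and fast.

Algorithm:
(1) Render the whole scene to fill the depth buffer (no lighting is needed in this pass).
(2) The following passes only run tests against these depth values, so disable depth writes.
(3) Start rendering, but instead of drawing the shadows directly, draw the shadow volumes first.
(4) Render the shadow volumes in order from the eye toward the scene being viewed.
(5) For each pixel, add 1 to the stencil value where a front face of a shadow volume passes the depth test, and subtract 1 where a back face passes.
(6) Afterwards, pixels whose stencil value is non-zero lie inside a shadow volume.
(7) Re-enable depth writes and render the whole scene again with lighting, using the stencil test as a mask.
(8) Skip the shadowed pixels, i.e. those whose stencil value is non-zero (using alpha blending).
Drawbacks:

(1) The algorithm assumes the eye is outside every shadow volume; if the eye is inside one, the count is wrong.

(2) When a shadow volume comes too close to the near clip plane and crosses it, frustum clipping may remove part of the volume, which corrupts the stencil count.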
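The ZPass counting rule and its eye-inside-the-volume failure can be illustrated for a single pixel. This is a minimal sketch, not a renderer: the function name `zpass_stencil`, the `(depth, kind)` face representation, and the "smaller depth is closer" convention are assumptions made for illustration.

```python
def zpass_stencil(faces, scene_depth, near=0.0):
    """Stencil count for one pixel under the ZPass rule.

    faces: (depth, kind) pairs for the shadow-volume polygons that cover
    this pixel, with kind either "front" or "back" (hypothetical format).
    scene_depth: depth already stored in the depth buffer for this pixel.
    """
    stencil = 0
    for depth, kind in faces:
        if depth < near:
            continue  # clipped by the near plane, never rasterized
        if depth < scene_depth:  # depth test passes
            stencil += 1 if kind == "front" else -1
    return stencil


volume = [(2.0, "front"), (5.0, "back")]
print(zpass_stencil(volume, 3.0))  # surface inside the volume -> 1 (shadowed)
print(zpass_stencil(volume, 7.0))  # surface behind the volume -> 0 (lit)

# Eye inside the volume: the front face lies behind the eye and is lost,
# so a surface at depth 3.0 that is really in shadow is counted as lit.
print(zpass_stencil([(-1.0, "front"), (5.0, "back")], 3.0))  # -> 0 (wrong)
```

The last call reproduces both drawbacks at once: the front face that should have contributed +1 is never rasterized.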
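2. ZFail:
Algorithm:
(1) Render the whole scene to fill the depth buffer (no lighting is needed in this pass).
(2) The following passes only run tests against these depth values, so disable depth writes.
(3) Start rendering, but instead of drawing the shadows directly, draw the shadow volumes first.
(4) Render the shadow volumes in order from the scene being viewed toward the eye (the opposite of ZPass).
(5) For each pixel, add 1 to the stencil value where a back face of a shadow volume fails the depth test, and subtract 1 where a front face fails.
(6) Afterwards, pixels whose stencil value is non-zero lie inside a shadow volume.
(7) Re-enable depth writes and render the whole scene again with lighting, using the stencil test as a mask.
(8) Skip the shadowed pixels, i.e. those whose stencil value is non-zero (using alpha blending).
Drawbacks:
(1) The shadow volumes must be capped (closed).

(2) When the far clip plane is too close, frustum clipping may remove part of the shadow volume, which corrupts the stencil count.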
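The same single-pixel model shows why ZFail survives the eye-inside case and where the far plane hurts it. Again a sketch under assumptions: `zfail_stencil` and the `(depth, kind)` representation are hypothetical, and a fragment fails the depth test when its depth is not smaller than the stored scene depth.

```python
def zfail_stencil(faces, scene_depth, far=float("inf")):
    """Stencil count for one pixel under the ZFail rule.

    faces: (depth, kind) pairs with kind either "front" or "back".
    """
    stencil = 0
    for depth, kind in faces:
        if depth > far:
            continue  # clipped by the far plane, never rasterized
        if depth >= scene_depth:  # depth test fails: face is behind the surface
            stencil += 1 if kind == "back" else -1
    return stencil


# Eye inside the volume (front face behind the eye): still counted correctly.
print(zfail_stencil([(-1.0, "front"), (5.0, "back")], 3.0))  # -> 1 (shadowed)

# Far plane at 4.0 removes the back face at 5.0 and the count is lost.
print(zfail_stencil([(2.0, "front"), (5.0, "back")], 3.0, far=4.0))  # -> 0 (wrong)
```

Only the polygons behind the visible surface matter here, which is why the volume has to be closed on the far side.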

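II. Shadow volume generation: see item 5 of the shadow volume optimizations
III. Shadow Volume optimizations
1. Fill Rate Optimizations
(1) ZPass vs. ZFail (choose whichever fits the situation):
a. ZPass needs no caps and, combined with occlusion culling, is fast. It breaks when the shadow volume intersects the near plane, and in particular when the eye is inside the shadow volume (which then certainly intersects it).
b. ZFail is somewhat slower and requires caps.
The choice between ZPass and ZFail can be made separately for each occluder.
How to choose:

Build the pyramid spanned by the light position and the near-plane rectangle. If the occluder lies outside it, use ZPass; otherwise use ZFail.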
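One way to sketch this per-occluder choice is to test a bounding sphere of the occluder against the volume spanned by the light position and the four near-plane corners (a pyramid). The bounding sphere, the function names, and the plane-based test are assumptions for illustration; the test is conservative, so it may answer "zfail" for an occluder that is actually outside, which costs speed but not correctness. The degenerate case of a light lying in the near plane is not handled.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_through(p0, p1, p2, inside_point):
    """Unit plane (n, d) with dot(n, x) + d <= 0 on the inside_point side."""
    n = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(dot(n, n))
    n = (n[0] / length, n[1] / length, n[2] / length)
    d = -dot(n, p0)
    if dot(n, inside_point) + d > 0:  # flip so the interior is negative
        n, d = (-n[0], -n[1], -n[2]), -d
    return n, d


def choose_stencil_method(light, near_corners, center, radius):
    """Return "zpass" if the sphere is provably outside the pyramid."""
    points = [light] + list(near_corners)
    centroid = tuple(sum(p[i] for p in points) / len(points) for i in range(3))
    planes = [plane_through(near_corners[0], near_corners[1],
                            near_corners[2], centroid)]
    for i in range(4):
        a, b = near_corners[i], near_corners[(i + 1) % 4]
        planes.append(plane_through(light, a, b, centroid))
    for n, d in planes:
        if dot(n, center) + d > radius:  # sphere entirely outside this plane
            return "zpass"
    return "zfail"


# Eye at the origin looking down -z, near plane at z = -1, light in front.
corners = [(-1.0, -1.0, -1.0), (1.0, -1.0, -1.0), (1.0, 1.0, -1.0), (-1.0, 1.0, -1.0)]
light = (0.0, 0.0, -10.0)
print(choose_stencil_method(light, corners, (0.0, 0.0, -5.0), 0.2))  # zfail
print(choose_stencil_method(light, corners, (5.0, 0.0, -5.0), 0.5))  # zpass
```

An occluder between the light and the near plane casts its volume onto the near plane, which is exactly the case ZPass cannot handle.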
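(2) Exploit Bounds (add a test and skip work that cannot contribute):
(light clipping algorithms ...)

Solution: Depth Bounds Test, which accepts or rejects fragments based on ZMax and ZMin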
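A small model of the depth bounds test may help: the test compares the depth value already stored in the depth buffer at the fragment's pixel, not the depth of the shadow-volume fragment itself, against the [ZMin, ZMax] range. The function names and the toy depth buffer below are assumptions for illustration.

```python
def depth_bounds_pass(stored_depth, zmin, zmax):
    """True if the depth already in the buffer lies inside the bounds."""
    return zmin <= stored_depth <= zmax


def stencil_updates_needed(depth_buffer, zmin, zmax):
    """Count pixels where shadow-volume fragments are still processed."""
    return sum(1 for d in depth_buffer if depth_bounds_pass(d, zmin, zmax))


# Four pixels covered by a shadow volume; the light can only reach
# surfaces whose window-space depth lies between 0.3 and 0.6.
depth_buffer = [0.1, 0.4, 0.5, 0.9]
print(stencil_updates_needed(depth_buffer, 0.3, 0.6))  # -> 2
```

Pixels whose visible surface is outside the range cannot be lit by this light, so skipping their stencil updates saves fill rate without changing the result.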

2. Culling Optimizations
(1) Shadow volume culling
Conventional culling:
Use conservative bounding geometry to efficiently determine when an object is outside the view frustum
Shadow volume culling (an occluder that is not visible can still cast a visible shadow):
New culling criterion:

(2)、Portal-based culling
For bounded lights, we can treat the light bounds as the “visible object” we’re testing for
If the light bounds are visible, we need to process the light
If the light bounds are invisible, we can safely skip the light
3. Silhouette Determination Optimizations
(1) Static occluders and static lights: the shadow volumes can be precomputed
Precomputation of shadow volumes can include bounding shadow volumes to the light bounds
(2) Static occluders and dynamic lights: use efficient data structures
But static occluders could exploit precomputation
See SIGGRAPH 2000 paper “Silhouette Clipping” by Sander, Gu, Gortler, Hoppe, and Snyder
Check out the “Fast Silhouette Extraction” section
Data structure is useful for fast shadow volume possible silhouette edge determination
(3) Simplify the occluder model
A. More complex models produce more silhouette edges
B. More triangles means more time spent searching for silhouette edges
C. More silhouette edges means the shadow volume consumes more fill rate (more pixels drawn)

4. Shadow Volume Rendering Optimizations
(1) Compute Silhouette Loops on CPU
Only render silhouette for zpass
Vertex transform sharing
(2) Extrude Triangles instead of quads
Avoids redundant transforms of shadow volume vertices
GL_QUAD_STRIP rather than GL_QUADS
Requires the silhouette edges to be organized into definite loops
Once one silhouette edge is found, the loop can be walked directly in a single pass
Shadow Volume Extrusion using Triangle Fans
(3) Wrapping stencil to avoid overflows
(4) Two-sided stencil testing
First, rasterizing front-facing geometry
Second, rasterizing back-facing geometry

Two sets of stencil state: front- and back-facing
Rasterizes just as many fragments, but more efficient for CPU & GPU
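The wrapping point in item (3) matters most with two-sided stencil, where increments and decrements for one pixel can arrive in any order. The sketch below models an 8-bit stencil value and compares clamping operations with wrapping ones (the behaviour OpenGL exposes as GL_INCR_WRAP and GL_DECR_WRAP); the Python function names and the `"+"`/`"-"` operation strings are assumptions for illustration.

```python
STENCIL_MAX = 255  # largest value of an 8-bit stencil buffer


def incr_clamp(s):
    return min(s + 1, STENCIL_MAX)


def decr_clamp(s):
    return max(s - 1, 0)


def incr_wrap(s):
    return (s + 1) & STENCIL_MAX


def decr_wrap(s):
    return (s - 1) & STENCIL_MAX


def apply_ops(ops, incr, decr, start=0):
    """Apply a sequence of '+' and '-' stencil updates to one pixel."""
    s = start
    for op in ops:
        s = incr(s) if op == "+" else decr(s)
    return s


# A decrement arriving before its matching increment:
print(apply_ops("-+", incr_clamp, decr_clamp))  # -> 1 (false shadow)
print(apply_ops("-+", incr_wrap, decr_wrap))    # -> 0 (correct)
```

With wrapping, the final value is the true count modulo 256, so the result is correct as long as the true count itself fits in the stencil range.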
(5) Vertex programs for shadow volume rendering
(1) Fully automatic shadow volume extrusion
Everything off-loaded to GPU. This needs a LOT of extra vertices
Quite inefficient, not recommended
No way to do zpass cap optimizations
No way to do triangle extrusion optimizations

(2) Vertex normal-based extrusion
If N·L is greater than or equal to zero, leave vertex alone
If N·L is less than zero, project vertex away from light
Worst on faceted, jaggy, or low-polygon count geometry
No way to do zpass cap optimizations
No way to do triangle extrusion optimizations

Vertex normals are typically meant for lighting calculations, not for shadows:
Vertex normals convey curvature well, but not facet orientation
Shadows should depend on the orientation of the individual facets
The normal needed here is the normal of a triangle (a face normal), not a vertex normal
A hack:
Place several vertices with different normals at the same position
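The N·L rule of the vertex normal-based approach can be sketched on the CPU as follows. The function name, the use of a point light, and the finite extrusion `distance` are assumptions for illustration; a real vertex program would typically push the vertex much farther, or to infinity.

```python
import math


def extrude_by_normal(vertex, normal, light_pos, distance):
    """Leave light-facing vertices alone, push the others away from the light."""
    to_light = tuple(l - v for l, v in zip(light_pos, vertex))
    n_dot_l = sum(n * t for n, t in zip(normal, to_light))
    if n_dot_l >= 0:
        return vertex  # vertex normal faces the light: keep position
    away = tuple(-t for t in to_light)  # direction from light to vertex
    length = math.sqrt(sum(a * a for a in away))
    return tuple(v + a / length * distance for v, a in zip(vertex, away))


light = (0.0, 0.0, 10.0)
print(extrude_by_normal((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), light, 5.0))   # unchanged
print(extrude_by_normal((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), light, 5.0))  # (0.0, 0.0, -5.0)
```

Because the decision is made per vertex normal rather than per face normal, two vertices of the same flat triangle can end up on opposite sides of the test, which is the source of the artifacts on faceted geometry.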
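(3) Semi-automatic shadow volume extrusion (silhouette edges computed on the CPU)
A unique set of possible silhouette edge vertices per light per model

A. Iterate over all triangles of the model.
B. Compute dot3(light_direction, triangle_normal). The sign tells whether the triangle faces the light (dot3 > 0) or faces away from it (dot3 < 0).
C. For each light-facing triangle, push its three edges onto a stack and compare them with the edges already there; when a duplicate pair is found (edge1 and edge2), remove both edges.
D. After all edges of all triangles have been checked, the edges left on the stack are the silhouette edges for the current light and object position.
E. Depending on the light direction, use the CPU or a vertex shader to extrude these silhouette edges into the shadow volume.
Characteristics:
Make two copies of every vertex, each with 4 components (x,y,z,w)
Straightforward to handle caps too
Vertex array memory required = 8 floats / vertex
Independent of number of lights
Vertex program required is very short
Much less vertex array memory than fully automatic approach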
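The silhouette search described in the lettered steps above can be sketched as below. Several things are assumptions here: the mesh is closed with consistent counter-clockwise winding, the light is a point light, and a set of directed edges replaces the stack (an edge is interior when its reverse was already recorded by a neighbouring light-facing triangle). The extrusion helper mirrors the "two copies of every vertex" idea by emitting a w = 1 copy and a w = 0 copy pushed to infinity away from the light.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def silhouette_edges(vertices, triangles, light_pos):
    """Directed edges (i, j) bounding the light-facing triangles."""
    edges = set()
    for i0, i1, i2 in triangles:
        p0, p1, p2 = vertices[i0], vertices[i1], vertices[i2]
        normal = cross(sub(p1, p0), sub(p2, p0))
        if dot(normal, sub(light_pos, p0)) <= 0:
            continue  # triangle faces away from the light
        for a, b in ((i0, i1), (i1, i2), (i2, i0)):
            if (b, a) in edges:
                edges.remove((b, a))  # shared by two lit triangles: interior
            else:
                edges.add((a, b))
    return edges


def extrude_edge(vertices, edge, light_pos):
    """Quad (b, a, a_inf, b_inf) in homogeneous coordinates for one edge."""
    a, b = vertices[edge[0]], vertices[edge[1]]
    a_inf = sub(a, light_pos) + (0.0,)  # w = 0: point at infinity
    b_inf = sub(b, light_pos) + (0.0,)
    return [b + (1.0,), a + (1.0,), a_inf, b_inf]


# Tetrahedron with outward-facing triangles, lit from below.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(sorted(silhouette_edges(verts, tris, (0.2, 0.2, -5.0))))
# -> [(0, 2), (1, 0), (2, 1)]
```

Keeping edges directed preserves the winding of the lit triangle, which is what lets the extruded quads face outward consistently.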
5、Shadow Volume Polygon Rendering Order
Naïve approach
Simply render cap & projection shadow volume polygons in “memory order” of the edges and polygons in CPU memory (that is, in the order they are stored)
Disadvantages
Potentially poor rasterization locality for stencil buffer updates
Typically sub-par vertex re-use
Advantages
Friendly to memory pre-fetching
Obvious, easy to implement approach
GPU Optimized Approach
Possible silhouette edges form loops
Render the projected edge shadow volume quads in “loop order”
(as mentioned earlier: build a loop and render in that order)
When you must render finite and infinite caps
Greedily search for adjacent cap polygons
Continue until the cap polygons bump into possible silhouette edge loops – then look for more un-rendered capping polygons
Advantages
Tends to maximize vertex re-use
Avoids retransforming vertices multiple times due to poor locality of reference
Maximizes stencil update efficiency
Adjacent polygons make better use of memory b/w
Convenient for optimizations
zpass cap elimination
triangle instead of quad extrusion
Easy to implement
Once you locate a possible silhouette edge, it’s easy to follow the loop
Easy to greedily search for adjacent finite & infinite cap polygons
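Following a silhouette loop once one edge is known can be sketched like this. The function name is hypothetical, and the sketch assumes every silhouette vertex has exactly one outgoing directed edge, so the edges form simple closed loops; meshes where loops touch at a vertex would need extra bookkeeping.

```python
def silhouette_loops(edges):
    """Chain directed edges (a, b) into closed loops of vertex indices."""
    next_vertex = {a: b for a, b in edges}
    loops = []
    while next_vertex:
        start = next(iter(next_vertex))
        loop = [start]
        current = next_vertex.pop(start)
        while current != start:  # walk until the loop closes
            loop.append(current)
            current = next_vertex.pop(current)
        loops.append(loop)
    return loops


print(silhouette_loops([(0, 1), (1, 2), (2, 0), (5, 6), (6, 5)]))
# -> [[0, 1, 2], [5, 6]]
```

Rendering the extruded quads in this order means each quad shares two vertices with the previous one, which is what makes a strip such as GL_QUAD_STRIP applicable and keeps vertex re-use high.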