
Warning! The information on this page is more than 5 years old. I still make it available, but it probably does not reflect my current knowledge or beliefs.

# Cascaded Shadow Mapping

Sun, 12 Jul 2009

When I was writing The Final Quest engine for my master's thesis, I didn't manage to implement any technique that would ensure good-quality shadows in outdoor scenes. I had read about the various kinds of perspective reparametrization, such as PSM (Perspective Shadow Maps), LiSPSM (Light-Space Perspective Shadow Maps), TSM (Trapezoidal Shadow Maps) and XPSM (Extended Perspective Shadow Maps), and all the math behind them seemed very scary to me. Today I know that the complexity of these techniques also causes artifacts in particular cases, and that commercial games more often use a simpler technique called CSM (Cascaded Shadow Maps).

Yesterday I implemented CSM and I'm very happy with the results. Of course there are still some artifacts: aliasing, z-acne on some distant objects at particular surface angles, and a performance cost (rendering additional objects to 3 x 1024 x 1024 textures must take some time). Still, it's the first time I have reasonably good outdoor shadows in my code. In the screenshot below you can see the transitions between cascades marked with red arrows.

[Screenshot: Cascaded Shadow Mapping]

Let me briefly explain how it works. Here are some constants and variables:

const uint SHADOW_MAP_CASCADE_COUNT = 3;
const UINT SHADOW_MAP_SIZE = 1024;
const D3DFORMAT SHADOW_MAP_FORMAT = D3DFMT_R32F;

IDirect3DTexture9 *g_ShadowMapTextures[SHADOW_MAP_CASCADE_COUNT] = { NULL };
IDirect3DSurface9 *g_ShadowMapSurfaces[SHADOW_MAP_CASCADE_COUNT] = { NULL };

CSM is all about splitting the view frustum into several frusta that share all parameters except the near and far clipping planes. For example, if my camera's near distance is 0.5, its far distance is 100 and I have 3 cascades, subsequent cascades can have near...far ranges like 0.5 ... 7, 7 ... 25, 25 ... 100. First I calculate the distance of the far clipping plane for each cascade:

float shadowMappingSplitDepths[SHADOW_MAP_CASCADE_COUNT];
CalcShadowMappingSplitDepths(shadowMappingSplitDepths, camera);

void CalcShadowMappingSplitDepths(float *outDepths, Camera &camera)
{
  float camNear = camera.GetZNear();
  float camFar  = std::min(camera.GetZFar(), g_ShadowMaxDist);

  float i_f = 1.f, cascadeCount = (float)SHADOW_MAP_CASCADE_COUNT;
  for (uint i = 0; i < SHADOW_MAP_CASCADE_COUNT-1; i++, i_f += 1.f)
  {
    outDepths[i] = Lerp(
      camNear + (i_f/cascadeCount)*(camFar - camNear),
      camNear * powf(camFar / camNear, i_f/cascadeCount),
      g_ShadowSplitLogFactor);
  }
  outDepths[SHADOW_MAP_CASCADE_COUNT-1] = camFar;
}

The magic formula I used here comes from the NVIDIA document Cascaded Shadow Maps. It calculates something between a uniform and an exponential distribution of the splitting planes across the camNear to camFar range, with the interpolation factor g_ShadowSplitLogFactor set by the user (I've found that 0.9 works best for me).
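To make the split formula easier to try in isolation, here is a minimal standalone sketch of it. The function name CalcSplitDepths, the std::vector return type and the Lerp helper are my own assumptions for illustration (the engine's Lerp is assumed to return its first argument at factor 0 and its second at factor 1), so treat this as a sketch rather than the engine code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Linear interpolation: t = 0 returns a, t = 1 returns b
// (assumed to match the Lerp used in the article).
inline float Lerp(float a, float b, float t) { return a + (b - a) * t; }

// Hypothetical standalone version of the split calculation.
// Returns the far distance of every cascade.
std::vector<float> CalcSplitDepths(float camNear, float camFar,
                                   unsigned cascadeCount, float logFactor)
{
  assert(cascadeCount > 0 && camNear > 0.f && camFar > camNear);
  std::vector<float> depths(cascadeCount);
  for (unsigned i = 0; i + 1 < cascadeCount; ++i)
  {
    const float t = float(i + 1) / float(cascadeCount);
    // Evenly spaced split planes.
    const float uniformSplit = camNear + t * (camFar - camNear);
    // Exponentially spaced split planes.
    const float logSplit = camNear * std::pow(camFar / camNear, t);
    depths[i] = Lerp(uniformSplit, logSplit, logFactor);
  }
  // The last cascade always ends at the far distance.
  depths[cascadeCount - 1] = camFar;
  return depths;
}
```

With near = 0.5, far = 100, 3 cascades and a factor of 0.9, this gives split distances of roughly 6.0, 22.1 and 100, which is close to the example ranges mentioned earlier.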

For each cascade I build a temporary camera with the same parameters as the real one, but with new near and far clipping planes. Then I enumerate all objects in the scene that intersect this temporary camera's frustum. But hey! There may be objects that are not visible in that frustum but lie on the path from the sun to the visible part of the scene, so they have to be rendered too in order to cast shadows. The solution to this problem is to enumerate the objects that intersect the shape of that frustum swept to infinity in the direction pointing to the light source. Swept intersections are much more complicated than normal intersections and I don't have a procedure for a swept frustum test, so I calculate the bounding box of that frustum and use a swept box intersection test.
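The engine's swept box test is not shown in this post, so the following is only one possible way to write it, under my own assumptions: axis-aligned boxes, a hypothetical Vec3/Box layout and a hypothetical function name. It relies on the observation that a box swept along a direction overlaps a target box exactly when a ray from the origin along that direction hits the target box enlarged by the swept box's extents, which is a standard slab test.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

// Hypothetical minimal types, not the engine's VEC3/BOX.
struct Vec3 { float v[3]; };
struct Box  { Vec3 min, max; };

// Returns true if `swept`, moved from its position to infinity along `dir`,
// overlaps `target` at any point of the sweep.
bool SweptBoxIntersectsBox(const Box& swept, const Vec3& dir, const Box& target)
{
  float tEnter = 0.f; // the sweep starts at the box's original position
  float tExit = std::numeric_limits<float>::infinity();
  for (int axis = 0; axis < 3; ++axis)
  {
    // Range of translations along this axis for which the boxes overlap.
    const float lo = target.min.v[axis] - swept.max.v[axis];
    const float hi = target.max.v[axis] - swept.min.v[axis];
    const float d = dir.v[axis];
    if (d == 0.f)
    {
      // No movement on this axis: the boxes must already overlap here.
      if (lo > 0.f || hi < 0.f) return false;
    }
    else
    {
      float t1 = lo / d, t2 = hi / d;
      if (t1 > t2) std::swap(t1, t2);
      tEnter = std::max(tEnter, t1);
      tExit = std::min(tExit, t2);
      if (tEnter > tExit) return false;
    }
  }
  return true;
}
```

In the setting described above, `swept` would be the cascade's frustum bounding box, `dir` the direction to the light, and `target` the world bounding box of a scene object.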

SceneObjectPtrVector objs;
MATRIX shadowMapViewProjs[SHADOW_MAP_CASCADE_COUNT];
MATRIX shadowMapTexXforms[SHADOW_MAP_CASCADE_COUNT];
for (uint cascade_i = 0; cascade_i < SHADOW_MAP_CASCADE_COUNT; cascade_i++)
{
  ParamsCamera tmpCamera = camera.GetParams();
  tmpCamera.SetZFar(shadowMappingSplitDepths[cascade_i]);
  if (cascade_i > 0)
    tmpCamera.SetZNear(shadowMappingSplitDepths[cascade_i-1]);
  const BOX &frustumBB = tmpCamera.GetMatrices().GetFrustumBox();
  scene.ListObjectsIntersectingSweptBox(objs, frustumBB, g_DirToLight);

  CalcShadowMapMatrices(
    shadowMapViewProjs[cascade_i],
    shadowMapTexXforms[cascade_i],
    tmpCamera, objs);
  RenderShadowMap(cascade_i, scene, objs, shadowMapViewProjs[cascade_i]);
}

What are the two mysterious matrix arrays and how do I calculate them? shadowMapViewProjs[cascadeIndex] represents the transformation from world space to shadow map space, so it transforms geometry to the Sun's point of view. Like every world-to-viewport transform, I assemble it from two matrices: view and projection. I use an orthographic (not perspective) projection here. I build the view matrix from three orthogonal, normalized axes, where Z is the light direction and the other two can be arbitrary.

void CalcShadowMapMatrices(
  MATRIX &outViewProj,
  MATRIX &outShadowMapTexXform,
  const ParamsCamera &camera, SceneObjectPtrVector &objs)
{
  const VEC3 *upDir = &VEC3_POSITIVE_Y;
  if (fabsf(Dot(g_DirToLight, *upDir)) > 0.99f)
    upDir = &VEC3_POSITIVE_Z;
  
  VEC3 axisX, axisY, axisZ = -g_DirToLight;
  Cross(&axisX, axisZ, *upDir);
  Normalize(&axisX);
  Cross(&axisY, axisX, axisZ);
  
  MATRIX view;
  AxesToMatrix(&view, axisX, axisY, axisZ);
  Transpose(&view);
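The axis construction above can be checked in isolation. The sketch below repeats the same steps with a hypothetical minimal Vec3 and helper functions of my own (not the engine's math library), so the orthonormality of the resulting axes can be verified directly.

```cpp
#include <cmath>

// Hypothetical minimal vector type and helpers, for illustration only.
struct Vec3 { float x, y, z; };

inline float Dot(const Vec3& a, const Vec3& b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}
inline Vec3 Cross(const Vec3& a, const Vec3& b)
{
  return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
inline Vec3 Normalized(const Vec3& a)
{
  const float len = std::sqrt(Dot(a, a));
  return { a.x / len, a.y / len, a.z / len };
}

struct LightBasis { Vec3 axisX, axisY, axisZ; };

// Builds three mutually perpendicular unit axes with Z pointing away
// from the light, following the same steps as the listing above.
LightBasis MakeLightBasis(const Vec3& dirToLight)
{
  const Vec3 toLight = Normalized(dirToLight);
  Vec3 up = { 0.f, 1.f, 0.f };
  // If the light is almost vertical, the cross product with Y would
  // degenerate, so fall back to Z as the helper "up" vector.
  if (std::fabs(Dot(toLight, up)) > 0.99f)
    up = { 0.f, 0.f, 1.f };

  LightBasis basis;
  basis.axisZ = { -toLight.x, -toLight.y, -toLight.z };
  basis.axisX = Normalized(Cross(basis.axisZ, up));
  basis.axisY = Cross(basis.axisX, basis.axisZ); // already unit length
  return basis;
}
```

Because axisX and axisZ are perpendicular unit vectors, their cross product needs no further normalization, which is why the original code normalizes only axisX.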

Now it's time for the second matrix. It is an orthographic projection focused on the geometry of interest. For the left-right and top-bottom boundaries I use the bounds of the camera frustum, not the smallest box bounding all visible objects. Thanks to that I avoid some popping when new objects enter the viewing area. I still have to visit all objects to determine the minimum and maximum Z.

MATRIX proj;
if (objs.empty())
  Identity(&proj);
else
{
  BOX xformedFrustumBB;
  TransformBox(&xformedFrustumBB, camera.GetMatrices().GetFrustumBox(), view);
  float shadowMapSizeF = (float)SHADOW_MAP_SIZE;

  float objsNear = FLT_MAX, objsFar = -FLT_MAX;
  VEC3 boxCorners[8];
  for (size_t i = 0; i < objs.size(); i++)
  {
    const BOX &objBB = objs[i]->GetWorldBoundingBox();
    objBB.GetAllCorners(boxCorners);
    TransformArray(boxCorners, _countof(boxCorners), view);
    for (uint corner_i = 0; corner_i < 8; corner_i++)
    {
      if (boxCorners[corner_i].z < objsNear) objsNear = boxCorners[corner_i].z;
      if (boxCorners[corner_i].z > objsFar ) objsFar  = boxCorners[corner_i].z;
    }
  }
  OrthoOffCenterLH(&proj,
    xformedFrustumBB.p1.x, // left
    xformedFrustumBB.p2.x, // right
    xformedFrustumBB.p2.y, // bottom
    xformedFrustumBB.p1.y, // top
    objsNear, // near
    objsFar); // far
}

outViewProj = view * proj;
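For readers without the engine's math library, here is a sketch of what an off-center, left-handed orthographic projection does, using the same layout as Direct3D's D3DXMatrixOrthoOffCenterLH (row vectors, depth mapped to 0...1). The Mat4 and Vec3 types and the TransformPoint helper are hypothetical stand-ins, and I am only assuming that the engine's OrthoOffCenterLH follows the same convention.

```cpp
// Hypothetical minimal types, for illustration only.
struct Mat4 { float m[4][4]; };
struct Vec3 { float x, y, z; };

// Maps the box [l, r] x [b, t] x [zn, zf] to x, y in -1...1 and z in 0...1.
Mat4 OrthoOffCenterLH(float l, float r, float b, float t, float zn, float zf)
{
  Mat4 p = {};
  p.m[0][0] = 2.f / (r - l);
  p.m[1][1] = 2.f / (t - b);
  p.m[2][2] = 1.f / (zf - zn);
  p.m[3][0] = (l + r) / (l - r);
  p.m[3][1] = (t + b) / (b - t);
  p.m[3][2] = zn / (zn - zf);
  p.m[3][3] = 1.f;
  return p;
}

// Row vector times matrix, with w assumed to be 1 (valid for affine matrices).
Vec3 TransformPoint(const Vec3& v, const Mat4& m)
{
  return {
    v.x * m.m[0][0] + v.y * m.m[1][0] + v.z * m.m[2][0] + m.m[3][0],
    v.x * m.m[0][1] + v.y * m.m[1][1] + v.z * m.m[2][1] + m.m[3][1],
    v.x * m.m[0][2] + v.y * m.m[1][2] + v.z * m.m[2][2] + m.m[3][2] };
}
```

The corner (l, b, zn) lands at (-1, -1, 0) and the corner (r, t, zf) lands at (1, 1, 1), so everything inside the chosen bounds fits exactly into the shadow map.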

The outViewProj matrix will be used for rendering objects to the shadow map as a render target. Now only the calculation of the second matrix remains. It is similar to outViewProj, but it will be used to sample the shadow map as a texture. The only difference is that we have to transform vertices to texture coordinates where x and y are 0...1, not -1...1 as they were when rendering to the shadow map. I also apply a depth bias here to avoid self-shadowing and minimize the z-acne effect. Doing both is just a matter of applying an additional, simple correction transformation to the matrix I calculated before:

MATRIX viewportToTex = MATRIX(
  0.5f,  0.f,  0.f, 0.f,
  0.f,  -0.5f, 0.f, 0.f,
  0.f,   0.f,  1.f, 0.f,
  0.5f,  0.5f, g_ShadowMapDepthBias, 1.f); // Last element is 1 so that w stays 1
outShadowMapTexXform = outViewProj * viewportToTex;
} // End of CalcShadowMapMatrices
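Written out per component, the correction matrix does the following. The struct and function names below are hypothetical and the sketch only restates the matrix as plain arithmetic.

```cpp
// Hypothetical result type: texture coordinates plus biased depth.
struct ShadowCoord { float u, v, depth; };

// Converts a position in shadow map clip space (x, y in -1...1, z in 0...1)
// to shadow map texture space (u, v in 0...1), as the matrix above does.
ShadowCoord ClipToShadowCoord(float clipX, float clipY, float clipZ, float depthBias)
{
  return {
     0.5f * clipX + 0.5f, // -1...1 becomes 0...1
    -0.5f * clipY + 0.5f, // Y is flipped: texture V grows downwards
    clipZ + depthBias };  // bias is simply added to the depth
}
```

Note that the pixel shader shown later treats a pixel as shadowed when the stored depth is less than this biased depth, so under that convention a bias meant to reduce z-acne would have to be a small negative number. That is my reading of the code, not something the post states.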

That's it! Now I can render objects to the shadow map, so for each cascade I run this function:

void RenderShadowMap(
  uint cascadeIndex, Scene &scene,
  SceneObjectPtrVector &objs, const MATRIX &viewProj)
{
  // Set render target to g_ShadowMapSurfaces[cascadeIndex]

  // Clear Z-buffer and shadow map to max distance
  frame::Dev->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER, 0xFFFFFFFF, 1.f, 0);
  
  for (size_t i = 0; i < objs.size(); i++)
    // Render object objs[i] using transform = (objs[i]->WorldMatrix * viewProj)
}

The shaders used to render objects to the shadow map are very simple. The shadow map is a texture in a format like D3DFMT_R32F, D3DFMT_R16F or D3DFMT_G16R16, where I use only the R component.

float4x4 g_WorldViewProj;

struct APP_TO_VS
{
  float3 Pos : POSITION;
};

struct VS_TO_PS
{
  float4 Pos : POSITION;
  float1 Depth : TEXCOORD0;
};

void VS(in APP_TO_VS In, out VS_TO_PS Out)
{
  Out.Pos = mul(float4(In.Pos, 1), g_WorldViewProj);
  Out.Depth = Out.Pos.z;
}

void PS(in VS_TO_PS In, out float4 Out : COLOR0)
{
  Out = In.Depth;
}

Now it's time to render the final image. There is nothing special about the draw calls, so I will show only how I pass the shadow mapping parameters to my main shader:

SceneObjectPtrVector objs;
scene.ListObjectsIntersectingFrustum(objs, camera.Frustum);
RenderContext renderContext = {
  camera, useShadowMapping, shadowMapTexXforms, shadowMappingSplitDepths };
for (size_t i = 0; i < objs.size(); i++)
  RenderSceneObject(*objs[i], renderContext);

void RenderSceneObject(SceneObject &sceneObj, const RenderContext &renderContext)
{
  //...
  effect->SetVector("g_ShadowMapSizes", &VEC4((float)SHADOW_MAP_SIZE, 1.f/SHADOW_MAP_SIZE, 0.f, 0.f));
  effect->SetVector("g_CascadeDepths", &VEC4(
    renderContext.CascadeDepths[0],
    renderContext.CascadeDepths[1],
    renderContext.CascadeDepths[2],
    0.f));
  string paramName;
  for (uint cascade_i = 0; cascade_i < SHADOW_MAP_CASCADE_COUNT; cascade_i++)
  {
    paramName = Format("g_ShadowMapTexture#") % cascade_i;
    effect->SetTexture(paramName.c_str(), g_ShadowMapTextures[cascade_i]);
    paramName = Format("g_ShadowMapXform#") % cascade_i;
    effect->SetMatrix(paramName.c_str(), &(sceneObj.World*renderContext.ShadowMapTexXforms[cascade_i]));
  }
  //...
}

And here is the shader responsible for doing all the main stuff, including sampling the shadow maps:

float3 g_CascadeDepths;
texture g_ShadowMapTexture0;
texture g_ShadowMapTexture1;
texture g_ShadowMapTexture2;
float4x4 g_ShadowMapXform0;
float4x4 g_ShadowMapXform1;
float4x4 g_ShadowMapXform2;
float g_ShadowFactor;
float2 g_ShadowMapSizes; // Size, 1/Size
sampler ShadowMapSampler0 = sampler_state {
  Texture = (g_ShadowMapTexture0); BorderColor = 0xFFFFFF;
  AddressU = BORDER; AddressV = BORDER;
  MinFilter = POINT; MagFilter = POINT; MipFilter = NONE;
};
sampler ShadowMapSampler1 = sampler_state { ...
sampler ShadowMapSampler2 = sampler_state { ...

The structure passed from the vertex shader to the pixel shader includes special interpolants for shadow mapping:

struct VS_TO_PS
{
  float3 Pos_obj    : TEXCOORD1; // In object space
  float  Pos_view_z : TEXCOORD2; // In view space
  ...

The vertex shader fills them like this:

Out.Pos = mul(float4(In.Pos, 1), g_WorldViewProj);
...
Out.Pos_obj = In.Pos;
Out.Pos_view_z = Out.Pos.w;

The pixel shader calculates the value NoShadow = 0..1, which is later used in lighting. Unfortunately, I didn't manage to make the HLSL compiler use dynamic branching instead of flattening all these calculations.

float NoShadow;

if (In.Pos_view_z > g_CascadeDepths.z) // Above shadow max dist
  NoShadow = 1;
else if (In.Pos_view_z > g_CascadeDepths.y) // Cascade 2
  NoShadow = CalcShadow(ShadowMapSampler2,
    mul(float4(In.Pos_obj, 1), (float4x3)g_ShadowMapXform2));
else if (In.Pos_view_z > g_CascadeDepths.x) // Cascade 1
  NoShadow = CalcShadow(ShadowMapSampler1,
    mul(float4(In.Pos_obj, 1), (float4x3)g_ShadowMapXform1));
else // Cascade 0
  NoShadow = CalcShadow(ShadowMapSampler0,
    mul(float4(In.Pos_obj, 1), (float4x3)g_ShadowMapXform0));
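The cascade selection logic in this if-else chain can be expressed on the CPU as a small function, which is handy for debugging or for visualizing which cascade covers a given distance. The function name and the use of std::vector are my own assumptions, and -1 is just a convention I chose for "beyond the shadow distance".

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the cascade covering the given view-space depth,
// or -1 if the depth lies beyond the far distance of the last cascade
// (the pixel is then left unshadowed).
// cascadeFarDepths must be sorted in ascending order.
int SelectCascade(float viewZ, const std::vector<float>& cascadeFarDepths)
{
  for (std::size_t i = 0; i < cascadeFarDepths.size(); ++i)
  {
    if (viewZ <= cascadeFarDepths[i])
      return static_cast<int>(i);
  }
  return -1;
}
```

With far distances of 7, 25 and 100, a pixel at depth 3 falls into cascade 0, one at depth 60 into cascade 2, and one at depth 150 gets no shadow at all.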

And here is the CalcShadow function.

// texCoord.xy = tex coord
// texCoord.z = depth
// Returns 1 for a fully lit pixel, 0 for a fully shadowed one.
float CalcShadow(sampler smSampler, float3 texCoord)
{
  // Percentage Closer Filtering
  // Position in texels, split into the whole texel and the fractional part
  float2 tmpTexCoord = texCoord.xy * g_ShadowMapSizes.x;
  float4 smTexCoord;
  smTexCoord.xy = floor(tmpTexCoord);
  float2 lerpFactors = tmpTexCoord - smTexCoord.xy;
  smTexCoord.zw = smTexCoord.xy + float2(1, 1);
  float4 smSamples;
  // Back from texels to 0..1 texture coordinates
  smTexCoord *= g_ShadowMapSizes.y;
  // Depths stored in the 2x2 texel neighborhood
  smSamples.x = tex2D(smSampler, smTexCoord.xy).r;
  smSamples.y = tex2D(smSampler, smTexCoord.zy).r;
  smSamples.z = tex2D(smSampler, smTexCoord.xw).r;
  smSamples.w = tex2D(smSampler, smTexCoord.zw).r;
  // 1 where something closer to the light covers this pixel
  float4 cmpResults = smSamples < texCoord.zzzz;
  // Bilinear blend of the four comparison results
  float2 lerpTmp = lerp(cmpResults.xz, cmpResults.yw, lerpFactors.x);
  return 1 - lerp(lerpTmp.x, lerpTmp.y, lerpFactors.y);
}
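To see what the filtering computes without running a GPU, here is a CPU reference sketch of the same 2x2 percentage closer filter. It is a simplification under my own assumptions: the shadow map is a plain row-major array, texels are addressed directly by integer index instead of through a sampler, and reads outside the map return 1.0 to imitate the white border color. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference for the 2x2 filter: compares the pixel depth against four
// neighboring shadow map texels and blends the results bilinearly.
// Returns 1 for fully lit, 0 for fully shadowed.
float CalcShadowCpu(const std::vector<float>& shadowMap, int size,
                    float u, float v, float depth)
{
  assert(size > 0 && shadowMap.size() == static_cast<std::size_t>(size) * size);

  // Position in texels, split into whole texel and fractional part.
  const float tx = u * float(size), ty = v * float(size);
  const float fx = std::floor(tx), fy = std::floor(ty);
  const float lerpX = tx - fx, lerpY = ty - fy;
  const int x0 = int(fx), y0 = int(fy);

  // 1 if the stored depth is closer to the light than this pixel.
  auto occluded = [&](int x, int y) -> float {
    const bool outside = x < 0 || y < 0 || x >= size || y >= size;
    const float stored = outside ? 1.f : shadowMap[std::size_t(y) * size + x];
    return stored < depth ? 1.f : 0.f;
  };

  const float c00 = occluded(x0, y0),     c10 = occluded(x0 + 1, y0);
  const float c01 = occluded(x0, y0 + 1), c11 = occluded(x0 + 1, y0 + 1);

  // Blend horizontally within each row, then vertically between rows.
  const float top    = c00 + (c10 - c00) * lerpX;
  const float bottom = c01 + (c11 - c01) * lerpX;
  return 1.f - (top + (bottom - top) * lerpY);
}
```

For a 2x2 map where only the top-left texel holds an occluder, a pixel sampled exactly between all four texels comes out 75% lit, which shows how the filter turns a hard shadow edge into a smooth gradient one texel wide.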

That's it! Maybe this was not a comprehensive tutorial, but I still hope some of you can make use of the code I've shared here.

Copyright © 2004-2019