Cascaded Shadow Maps (CSM) on WebGPU

Usnul · June 19, 2025, 9:15pm

First some pictures

Last one is a Live Demo.

I’ve been a bit stuck on shadows while working on Shade. I didn’t want to go with traditional shadowmaps, as they require parameter tweaking. I wanted a turn-key solution, where the user/developer doesn’t need to consider anything beyond

“do I want shadows?”

Initially I thought about CSM (Cascaded Shadow Maps), but I dismissed them because as wonderful as they are - they have a bunch of drawbacks. Namely - you need to rasterize a lot of pixels, which can be memory-heavy, and depending on your scene complexity - you’re potentially vertex-shader-bound. Lastly - CSM does have a weakness that it’s still resolution-dependent, it’s still close-enough to traditional shadowmapping that you need to pick resolution for each cascade and how many cascades you want.

Next, I looked into Virtual Shadow Maps, pioneered by the Assassins Creed in 2010s, in recent years Unreal brought them back into the mainstream where they exploit their software rasterizer along with Nanite to make them a very compelling solution.

With VSM (Virtual Shadow Maps) I ran into the issue that WebGPU lacks multi-draw, and each tile of the shadow map is a separate rasterization process. For a single simple shot we might have close to a hundred such tiles. So VSM is a no-go as it currently stands. I thought about various hacks to circumvent this, but ultimately decided against it. VSM has a weird limitation still, in that it requires you specify overall shadow bounds, like like with a traditional shadow map. With the resolution in the neighbourhood of 160,000 x 160,000 pixels - it’s not a major limitation, but it’s still there.

Next, I went for ray traced shadows, and you can see a few demos in the Shade topic with RTX shadows. RTX shadows, I believe will replace shadow maps entirely at some point, as they are better in every respect except for compute requirements. But we’re not there yet. RTX is not supported natively in WebGPU, and my implementation didn’t run fast enough for my liking.

So, coming back full circle - I recently watched a presentation on “Tiny Glade” by Tom Stochastic. And Tom used CSM to a great effect. Then I remembered the "Alan Wake 2" presentation from REAC 2024, and how they mitigates some of the issues with CSM as well; I decided to give cascaded shadow maps a go.

My basic plan was to achieve similar tech solution to Tiny Glade, that is - CSM with TAA per-cascade and PCSS contact hardening.

I already had the rasterization pipeline with occlusion culling, and I did the same as the Northlight guys (Alan Wake 2), by reusing that pipeline to render CSM cascades. I dabbled a bit with PCSS, and I’m pretty happy with the results, but there are a number of issues to resolve before it can be used in production. TAA part is not there either.

That said, my implementation is a little unique:

I sample the nearest available cascade, instead of going purely by depth, as do most solutions. It might sound silly, but if you have higher resolution shadowmap available - why sample the lower res one? In my experience doing this you get ~20% shadowed pixels on the screen sampling from higher resolution cascade, which is a massive win in my book. Almost like getting 20% extra resolution for free.
Most solution out there create square frustum for each cascade, using the widest dimension of the frustum slice’s bounds. Many implementations I’ve seen even take the hypotenuse, which is even larger. This eliminates pretty much any possibility that there will be lack of coverage for any cascade. However, you sacrifice resolution. Again, in my experience, frustums can be much tighter, often 1/4 of the area of the conservative approach. Which means you can get, again, about double the resolution for your shadows. I use tightest bounds possible.
Scabilization. I used to think that stabilization is very important, but after reading Matt Pettineo’s musings on CSMs, I was struck by the following quote:

2 years ago Andrew Lauritzen gave a talk on what he called “Sample Distribution Shadow Maps”, and released the sample that I mentioned earlier. He proposed that instead of stabilizing the cascades, we could instead focus on reducing wasted resolution in the shadow map to a point where effective resolution is high enough to give us sub-pixel resolution when sampling the shadow map.

Thinking about it in a different way - if you have a wide-enough filter, you get the temporally stable results. So, I don’t stabilize cascades. Going beyond that, with a decent PCSS implementation, the problem becomes even less significant, as your filter size grows. In the current implementation I’m using a Catmull-Rom filter of size 5.

Instead of using a texture array for CSM, I use an atlas instead. This is not a huge deal really. I just wanted to build for the future, where more than just CSM is supported, and I can draw all shadows into the same atlas for speed of access during shading.

My GPU metadata for CSM looks very simple, something like this:

struct CascadeMetadata{
    atlas_patch: vec4<f32>,
    projection: mat4x4<f32>
}

var<uniform> csm_metadata: array<CascadeMetadata, CSM_CASCADE_COUNT>

where the CSM_CASCADE_COUNT is a compile-time constant. In the demo it’s 4.

If you check out the Live Demo, please let me know how it runs for you and what hardware/resolution you’ve got.

Useful references in no particular order

manthrax · June 20, 2025, 6:25am

awesome writeup. so much to digest!! I think I will be trying to understand all this for a long time. Love deep dives like this.

Usnul · June 23, 2025, 10:06pm

Wrote a little tool for debugging cascade bounds, I’m going to share it in case anyone is interested. It references meep, so will need some adapting if you want to use it. The license is MIT, so go nuts.

import { array_swap } from "@woosh/meep-engine/src/core/collection/array/array_swap.js";
import { Color } from "@woosh/meep-engine/src/core/color/Color";
import { okhsv_to_linear_srgb } from "@woosh/meep-engine/src/core/color/oklab/okhsv_to_linear_srgb.js";
import AABB2 from "@woosh/meep-engine/src/core/geom/2d/aabb/AABB2.js";
import { frustum_compute_corners } from "@woosh/meep-engine/src/core/geom/3d/frustum/frustum_compute_corners.js";
import { frustum_slice } from "@woosh/meep-engine/src/core/geom/3d/frustum/frustum_slice.js";
import { m4_invert } from "@woosh/meep-engine/src/core/geom/3d/mat4/m4_invert.js";
import Vector3 from "@woosh/meep-engine/src/core/geom/Vector3.js";

function frustum_to_lines({
                              frustum,
                              color,
                          }) {

    const lines = [];

    const frustum_corners = new Float32Array(24);
    frustum_compute_corners(frustum_corners, 0, frustum, 0);

    for (const [i0, i1] of [
        [0b000, 0b001],
        [0b000, 0b010],
        [0b000, 0b100],
        [0b001, 0b011],
        [0b001, 0b101],
        [0b010, 0b011],
        [0b010, 0b110],
        [0b100, 0b101],
        [0b100, 0b110],
        [0b011, 0b111],
        [0b110, 0b111],
        [0b101, 0b111],
    ]) {
        const v0 = Vector3.fromArray(frustum_corners, i0 * 3);
        const v1 = Vector3.fromArray(frustum_corners, i1 * 3);


        lines.push({ points: [v0, v1], color });
    }

    return lines;
}

function frustum_to_polygons({
                                 frustum,
                                 color,
                             }) {

    const polygons = [];

    const frustum_corners = new Float32Array(24);
    frustum_compute_corners(frustum_corners, 0, frustum, 0);

    for (const [i0, i1, i2, i3] of [
        // X0
        [0b000, 0b001, 0b010, 0b011],
        // X1
        [0b100, 0b101, 0b110, 0b111],
        // Y0
        [0b000, 0b100, 0b101, 0b001],
        // Y1
        [0b010, 0b110, 0b111, 0b011],
        // Z0
        [0b000, 0b100, 0b110, 0b010],
        // Z1
        [0b001, 0b011, 0b111, 0b101],
    ]) {
        const v0 = Vector3.fromArray(frustum_corners, i0 * 3);
        const v1 = Vector3.fromArray(frustum_corners, i1 * 3);
        const v2 = Vector3.fromArray(frustum_corners, i2 * 3);
        const v3 = Vector3.fromArray(frustum_corners, i3 * 3);


        polygons.push({ points: [v0, v1, v2, v3], color });
    }

    return polygons;
}

/**
 *
 * @param {CanvasRenderingContext2D} canvas_ctx
 * @param {Vector2} size
 * @param {Camera} main_camera
 * @param {Shadowmap} shadowmap
 */
export function csm_debug_draw_preview({
                                           canvas_ctx,
                                           size,
                                           main_camera,
                                           shadowmap
                                       }) {
    canvas_ctx.clearRect(0, 0, size.x, size.y);
    canvas_ctx.fillStyle = 'rgba(0,0,0,0.5)';
    canvas_ctx.fillRect(0, 0, size.x, size.y);

    // build list of polygons to draw
    const vertices = [];
    const lines = [];

    Array.prototype.push.apply(lines, frustum_to_lines({
        frustum: main_camera.frustum,
        color: Color.white
    }));

    const polygons = [];
    for (let i = 0; i < shadowmap.views.length; i++) {
        const view = shadowmap.views[i];
        const camera = view.camera;

        const color = new Color();
        okhsv_to_linear_srgb(color, i / shadowmap.views.length, 0.8, 1);
        color.a = 0.1;


        Array.prototype.push.apply(polygons, frustum_to_polygons({
            frustum: camera.frustum,
            color
        }));

    }

    // drap splits
    for (let i = 0; i < shadowmap.views.length; i++) {
        const main_frustum = new Float32Array(main_camera.frustum);

        {
            // swap z planes (reverse Z projection)
            array_swap(main_frustum, 16, main_frustum, 20, 4);
        }
        const cascade_frustum_slice = new Float32Array(24);

        const prev_split_distance = i > 0 ? shadowmap.splits[i - 1] : 0;
        const split_distance = shadowmap.splits[i];
        frustum_slice(
            cascade_frustum_slice, 0,
            main_frustum,
            0, 0, prev_split_distance,
            1, 1, split_distance
        );

        const color = new Color();
        okhsv_to_linear_srgb(color, i / shadowmap.views.length, 1, 1);
        color.a = 1;


        Array.prototype.push.apply(lines, frustum_to_lines({
            frustum: cascade_frustum_slice,
            color
        }));
    }

    Array.prototype.push.apply(vertices, lines.map(l => l.points).flat());
    Array.prototype.push.apply(vertices, polygons.map(l => l.points).flat());

    const projection_matrix = new Float32Array(16);
    m4_invert(projection_matrix, shadowmap.light.transform_global);

    // project
    for (let i = 0; i < vertices.length; i++) {
        const vertex = vertices[i];
        vertex.applyMatrix4(projection_matrix);
    }

    // compute bounds
    const bounds = new AABB2();
    bounds.setNegativelyInfiniteBounds();
    for (let i = 0; i < vertices.length; i++) {
        const v = vertices[i];
        bounds._expandToFitPoint(v.x, v.y);
    }


    // transform vertices into canvas space
    for (let i = 0; i < vertices.length; i++) {
        const v = vertices[i];

        v._sub(bounds.x0, bounds.y0, 0);
        v._divide(bounds.width, bounds.height, 1);
        v._multiply(size.x, size.y, 1);
    }

    canvas_ctx.fillStyle = "none";
    canvas_ctx.lineWidth = 1;
    for (let i = 0; i < lines.length; i++) {
        const line = lines[i];

        const v0 = line.points[0];
        const v1 = line.points[1];
        canvas_ctx.beginPath();

        canvas_ctx.strokeStyle = line.color.toCssRGBAString();
        canvas_ctx.moveTo(v0.x, v0.y);
        canvas_ctx.lineTo(v1.x, v1.y);

        canvas_ctx.stroke();
    }

    // draw polygons
    for (let i = 0; i < polygons.length; i++) {
        const polygon = polygons[i];
        canvas_ctx.fillStyle = `rgba(${polygon.color[0] * 255},${polygon.color[1] * 255},${polygon.color[2] * 255},${polygon.color[3]})`;
        canvas_ctx.beginPath();
        canvas_ctx.moveTo(polygon.points[0].x, polygon.points[0].y);
        for (let j = 1; j < polygon.points.length; j++) {
            const v = polygon.points[j];
            canvas_ctx.lineTo(v.x, v.y);
        }
        canvas_ctx.lineTo(polygon.points[0].x, polygon.points[0].y);
        canvas_ctx.fill();
    }
}

Helped me nail down a few issues, here’s what that looks like in an overlay:

The main (view) camera frustum is projected into light space, you can see individual slices (cascades), along with the shadowmap bounds.

Just for fun, here is a progressive set of screenshots to demonstrate level of fidelity CSM gives you:

this knife is about 1 meter long

same knife is in the center

Topic		Replies	Views
R168 WebGPU - Chasing Shadows - fixed in r169 Questions shadows , webgpu	37	646	September 17, 2024
Cascaded Shadow Maps Discussion shadows , shadow-map , cascaded-shadow-maps , csm	6	2894	August 6, 2019
Shade - WebGPU graphics Showcase webgpu , rendering	97	10770	August 31, 2025
Cascaded shadow maps! Showcase shadows	17	3023	January 28, 2020
Clustered Rendering on WebGPU Showcase shaders , lights , webgpu , clustered-rendering	5	433	June 20, 2025

Cascaded Shadow Maps (CSM) on WebGPU

Useful references in no particular order

Related topics