Testing WGSL source code

Basic idea. I have some WGSL code:

var<private> rnd_state : u32 = 2891336453u;
    

fn pcg(v: u32) -> u32 {

  let state = v * 747796405u + 2891336453u;
  let word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
  
  return (word >> 22u) ^ word;
  
}
    

fn random_uint() -> u32{
  rnd_state = pcg(rnd_state);
  
  return rnd_state;
}
    

fn hash_to_float01( hash:u32 ) -> f32{
    return bitcast<f32>(0x3f800000 | (hash >> 9)) - 1.0f;
}
    

fn random() -> f32{
    
    let h = random_uint();
    
    return hash_to_float01( h );

}

This is a snippet from Shade; the bit that we care about here is the random function.

We test this by writing a bog-standard JavaScript unit test, with no WebGPU support required:

test('random() returns a finite number in [0, 1) and successive calls differ', () => {

    const lib = ComputeShaderEmulator.fromCodeChunk(wgsl_source, { parser });

    const a = lib.random();
    const b = lib.random();
    const c = lib.random();

    for (const v of [a, b, c]) {
        expect(typeof v).toBe('number');
        expect(Number.isFinite(v)).toBe(true);
        expect(v).toBeGreaterThanOrEqual(0);
        expect(v).toBeLessThan(1);
    }

    expect(a).not.toBe(b);
    expect(b).not.toBe(c);
    expect(a).not.toBe(c);
});

And this works, it runs and it passes.

This is something I have been thinking about for a very long time; I even wrote about the idea here a while ago.


How does this work?

The basic idea is to translate WGSL into JavaScript. We have a compiler that takes a WGSL string and spits out a JS string like this:

(wgsl) => {
    let rnd_state = 2891336453;
    function pcg(v) {
        const state = wgsl.add(wgsl.mul(v, 747796405), 2891336453);
        const word = wgsl.mul(((state >> (wgsl.add((state >> 28), 4))) ^ state), 277803737);
        return (word >> 22) ^ word;
    }
    function random_uint() {
        rnd_state = pcg(rnd_state);
        return rnd_state;
    }
    function hash_to_float01(hash) {
        return wgsl.sub(wgsl.bitcast(0x3f800000 | (hash >> 9), "f32", "u32"), 1.0);
    }
    function random() {
        const h = random_uint();
        return hash_to_float01(h);
    }
    const __module__ = {};
    __module__.pcg = pcg;
    __module__.random_uint = random_uint;
    __module__.hash_to_float01 = hash_to_float01;
    __module__.random = random;
    Object.defineProperty(__module__, "rnd_state", { get: () => rnd_state, set: (v) => { rnd_state = v; } });
    return __module__;
}

The one important point here is that the JavaScript source looks very close to the WGSL, which means that if our tests don’t pass, we can step through the source code with the debugger.
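The shape of that generated module can be reproduced in isolation. The sketch below is hand-written for illustration, not compiler output: `make_module`, `counter_state` and `next` are made-up names. It shows how a module-scope variable can live as a closure variable, and how a getter/setter pair lets a test read or overwrite it from outside.

```javascript
// Hypothetical example: a module-scope variable modeled as a closure variable.
function make_module() {
    let counter_state = 0; // stands in for a var<private> declaration

    function next() {
        counter_state = counter_state + 1;
        return counter_state;
    }

    const module = {};
    module.next = next;

    // Expose the private variable so a test can inspect or seed it.
    Object.defineProperty(module, "counter_state", {
        get: () => counter_state,
        set: (v) => { counter_state = v; },
    });

    return module;
}

const first = make_module();
const second = make_module();

first.next();
first.next();
console.log(first.counter_state, second.counter_state); // 2 0

first.counter_state = 10; // seed the state from the outside
console.log(first.next()); // 11
```

Each call to `make_module()` gets its own state, so separate instances cannot leak values into each other. Whether the emulator creates a fresh instance per test is an assumption here, not something the post states.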


Shade is written with a module abstraction for WGSL code that I call CodeChunk. Here’s the one for random:

import { CodeChunk } from "../../compiler/CodeChunk.js";
import { chunk_hash_to_float01 } from "./chunk_hash_to_float01.js";
import { chunk_random_uint } from "./chunk_random_uint.js";

/**
 * A pseudo random number.
 * Generate a random float in [0,1) range
 */
export const chunk_random = CodeChunk.from(
//language=WGSL
    `
fn random() -> f32{
    
    let h = random_uint();
    
    return hash_to_float01( h );

}
    `,
    [
        chunk_random_uint,
        chunk_hash_to_float01,
    ]
)

All this to say that the way the code is laid out lends itself well to unit testing.
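For readers who have not seen this kind of layout, here is a rough sketch of how a chunk and its dependencies could be flattened into one WGSL string. This is not CodeChunk's implementation or API: `make_chunk`, `flatten` and the `demo_*` names are invented for the example, and the function bodies are placeholders.

```javascript
// Hypothetical stand-in for a chunk: source text plus the chunks it depends on.
function make_chunk(source, dependencies = []) {
    return { source, dependencies };
}

// Depth-first walk that emits every chunk exactly once, dependencies first.
function flatten(root) {
    const seen = new Set();
    const ordered = [];

    function visit(chunk) {
        if (seen.has(chunk)) {
            return; // already emitted through another dependency path
        }
        seen.add(chunk);
        for (const dep of chunk.dependencies) {
            visit(dep);
        }
        ordered.push(chunk.source);
    }

    visit(root);
    return ordered.join("\n");
}

const demo_pcg = make_chunk("fn pcg() {}");
const demo_random_uint = make_chunk("fn random_uint() {}", [demo_pcg]);
const demo_hash = make_chunk("fn hash_to_float01() {}", [demo_pcg]);
const demo_random = make_chunk("fn random() {}", [demo_random_uint, demo_hash]);

const demo_source = flatten(demo_random);
console.log(demo_source);
```

Because `demo_pcg` is reachable through two paths, the `seen` set is what keeps it from being emitted twice. A small chunk plus its transitive dependencies is a natural unit to hand to a test.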

Note on the example:

random() was picked because it’s small enough to be illustrative, yet it uses the var<private> address space, performs chained function calls, and even does bitcasting. These are all fairly complex mechanisms to replicate faithfully in JavaScript.
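To make the difficulty concrete, here is a minimal hand-written port of the same functions that keeps u32 semantics in plain JavaScript. It is a sketch, not the emulator's runtime: `u32_add`, `u32_mul` and `bitcast_u32_to_f32` are names invented here. It relies on u32 arithmetic wrapping modulo 2^32 and on u32 right shifts being logical, which is why it uses `Math.imul` and `>>>` instead of `*` and `>>`. A plain `*` would not work, since the product of two 32-bit constants can exceed 2^53 and lose its low bits in a double.

```javascript
// Wrapping u32 helpers (hypothetical names, not part of any library).
const u32_add = (a, b) => (a + b) >>> 0;
const u32_mul = (a, b) => Math.imul(a, b) >>> 0;

// Reinterpret the bits of a u32 as an f32 through a shared 4-byte buffer.
const scratch = new DataView(new ArrayBuffer(4));
function bitcast_u32_to_f32(bits) {
    scratch.setUint32(0, bits);
    return scratch.getFloat32(0);
}

let rnd_state = 2891336453;

function pcg(v) {
    const state = u32_add(u32_mul(v, 747796405), 2891336453);
    const shift = u32_add(state >>> 28, 4); // always in 4..19
    const word = u32_mul(((state >>> shift) ^ state) >>> 0, 277803737);
    return ((word >>> 22) ^ word) >>> 0;
}

function random_uint() {
    rnd_state = pcg(rnd_state);
    return rnd_state;
}

function hash_to_float01(hash) {
    // Top 23 bits of the hash become the mantissa of a float in [1, 2).
    return bitcast_u32_to_f32((0x3f800000 | (hash >>> 9)) >>> 0) - 1.0;
}

function random() {
    return hash_to_float01(random_uint());
}

console.log(random(), random(), random());
```

The range guarantee comes from the bit layout rather than from the hash: any 23-bit mantissa combined with exponent bits 0x3f800000 is a float in [1, 2), so subtracting 1 always lands in [0, 1).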


FAQ

What about complete shaders?

Complete shaders are supported, though only compute for now. The dispatch mechanism is per-thread. This is still primarily a debug / testing tool.

Limitations?

Surprisingly few, even with this first version:

  • no subgroup support
  • no 3d texture support

Is this a cross-compiler?

Yes, essentially, with a small runtime.


This looks awesome!

Did you manage to solve the issue of replicating math operations? I’m currently writing a path tracer and struggle with debugging. However, it’s the kind of debugging where I don’t know what math is actually happening, rather than which branches are taken.

Did you look at webgpu_inspector’s debugger? It seems to have one, but my compute shaders never work with it for some reason :sad_but_relieved_face: Capturing commands and buffers, plus debugging shaders on the actual data, seems like a good combination.

All math ops are supported, I believe, including more obscure ones like frexp.

I happen to have a very extensive math core in my Meep Engine, which trivializes this translation.

There are discrepancies, of course, as I’m not really doing full emulation, merely a best-effort translation. So things like a + b will still be done in 64-bit precision, as that’s what JS runs in, unlike standard WGSL precision, which is 32-bit.

Or, to give another example, sin in WGSL and Math.sin are not the same, but we pretend that they are.
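Both discrepancies are easy to see in plain JavaScript. The sketch below is illustrative only and does not describe what the emulator does: `f32_add` shows how one could round to 32-bit precision with `Math.fround` if closer results were wanted, and `approx_equal` is a hypothetical helper for tests that should tolerate small differences. The tolerance values are arbitrary assumptions.

```javascript
// Round inputs and result to f32 so the add behaves like 32-bit arithmetic.
const f32_add = (a, b) => Math.fround(Math.fround(a) + Math.fround(b));

// 2^24: every integer up to here is exactly representable in f32.
const big = 16777216;
console.log(big + 1);         // 16777217 (f64 keeps the +1)
console.log(f32_add(big, 1)); // 16777216 (f32 rounds it away)

// Tolerant comparison for values that are only expected to be close.
function approx_equal(actual, expected, rel_tol = 1e-5, abs_tol = 1e-6) {
    const diff = Math.abs(actual - expected);
    return diff <= Math.max(abs_tol, rel_tol * Math.abs(expected));
}

console.log(approx_equal(Math.fround(Math.sin(1)), Math.sin(1))); // true
console.log(approx_equal(1.0, 1.1));                              // false
```

In a Jest or Vitest suite the built-in `toBeCloseTo` matcher serves a similar purpose, so a helper like this is only needed when a relative tolerance is preferred.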

I assume you mean this:

I don’t believe I have tried it. Looks interesting.

What I’m doing has a different initial purpose: it is primarily for automated testing, that is, writing unit tests for WGSL code that can run as part of a standard vitest / jest configuration. Then, if a test fails, I can step through the code within the IDE and debug it.

I was wondering about those kinds of issues, yeah, because they can be hard to track or think about. I guess there’s only so much one can do to emulate.

I understand; automated testing with debugging for GPU code could be a massive help. It just seemed natural to try creating tests from real data, e.g. if something doesn’t work right: capture the resources, then create a test from that pixel’s data.

Is the source code available somewhere to take a look?

This is the kind of tooling I keep wishing existed for compute work. Iterating with normal node tests is so much faster than the GPU round trip. One question: does ComputeShaderEmulator try to model barrier/sync behavior at all, or is it strictly per-invocation? I’d guess per-invocation for unit-test purposes, but curious where you drew the line.

Yep, that’s where I drew the line. It only models the behavior of a single thread. Essentially, we just treat the entry point

@compute
fn main(...){}

as a function, where we are free to supply the input parameters.

Memory barriers and other cooperative constructs would only make sense if I were emulating this at the instruction level, with parallelism constructs. Since I don’t, those things wouldn’t even make sense, not in JS anyway.

I opted for being closer to JS, rather than emulating the SIMT execution model, but it’s a real limitation.
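The single-thread model described above can be pictured as a plain loop. In this sketch `main`, `dispatch` and the buffer layout are all hypothetical, and only the x dimension is modeled; it is not ComputeShaderEmulator's API.

```javascript
// Hypothetical translated entry point: doubles each element of a buffer.
function main(global_invocation_id, data) {
    const i = global_invocation_id.x;
    if (i >= data.length) {
        return; // out-of-range invocations do nothing
    }
    data[i] = data[i] * 2;
}

// Sequential "dispatch": one plain function call per invocation.
// There is no parallelism, so there is nothing for a barrier to wait on.
function dispatch(entry, workgroup_count, workgroup_size, ...bindings) {
    for (let group = 0; group < workgroup_count; group++) {
        for (let local = 0; local < workgroup_size; local++) {
            const id = { x: group * workgroup_size + local, y: 0, z: 0 };
            entry(id, ...bindings);
        }
    }
}

const data = new Float32Array([1, 2, 3, 4, 5]);
dispatch(main, 2, 4, data); // 8 invocations over 5 elements
console.log(Array.from(data)); // [2, 4, 6, 8, 10]
```

Each invocation runs to completion before the next one starts. That makes stepping through in a debugger straightforward, and it is also why code that depends on threads exchanging data mid-execution cannot be modeled this way.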

I thought about it. The “right” fix would be to just pull in a full WASM emulator, but then we’d lose the ease of debugging and the benefit of making WGSL look like JS.

My code base is highly modular. I have something like 700 individual WGSL modules in Shade, which in turn import from one another where that makes sense. The point is that it’s highly suitable for unit testing.

In a project where you only have monolithic shaders it would be a lot less useful I imagine :sweat_smile:

Not to say that I don’t miss having such functionality for myself. I have a lot of barriers in my code, and a lot of subgroup ops. Barriers I don’t actually find hard to understand, but debugging subgroup operations would be incredibly useful.

Sorry to disappoint, but no: Shade is all proprietary and generally treated as closed-source. But I’m completely open to licensing.


As for the point about speed: it’s fast. I added support for fragment shaders recently. Here are sample timings for the RCAS (sharpening post-process) shader:

1 ms is the cold start of the suite plus WGSL → JS compilation.
