I’ve been experimenting with global illumination in Three.js using WebGPU. I tried the node-based SSGI approach, but it becomes too expensive for large open-world scenes.
I’m currently building a small engine with Three.js + Electron, and I’m exploring alternatives like Global SDF + cone tracing for approximate GI.
I wanted to ask:
Do you see a future direction in Three.js/WebGPU for scalable GI (beyond screen-space techniques)?
For example, something like probe grids, SDF-based tracing, or voxel approaches integrated into the renderer?
I’d really appreciate any thoughts or guidance on what direction makes the most sense within the Three.js ecosystem.
At the moment, the closest thing is the newly added Light Probes Volume, which provides diffuse GI via a grid of L1 spherical harmonics probes.
However, it is still experimental and not yet available for WebGPU. WebGPURenderer support is on the list for a few releases from now, as Michael explains here.
The Light Probes Volume looks promising for diffuse GI, but I’m wondering how well it scales for dynamic/open-world scenes.
I’m currently experimenting with a Global SDF + cone tracing approach in WebGPU for more dynamic lighting. Do you think probe volumes could work alongside something like that, or is it better to focus fully on SDF/voxel-based methods for this use case?
I tried the node-based SSGI approach, but it becomes too expensive for large open-world scenes.
To be honest, in theory, SSGI "should" be independent of the scene’s complexity, but I can see some limitations if you try to apply it to open-world environments. The official node-based example is a good starting point for the community, although it doesn’t exactly seem fast, and overall I think that must be a limitation of the technique itself (the 0beqz implementation alone was already making my laptop’s fan spin like crazy). I suppose that’s why, in practice, it should be combined with another technique.
It could be interesting to try combining them, something like using probes for basic diffuse lighting and cone tracing for dynamic/local refinement. Optimizing the SDF is crucial here, especially if you’re going the open-world route. I’m not sure how quickly the current implementation of Probes Volume can “bake” the data. I understand that it’s all handled on the GPU side, but if the geometry is dynamic, it wouldn’t be unusual to end up with a pooling or queuing scheme that bakes adjacent areas in advance. Ideally, the spatial partitioning/indexing system should share the same data structure that you use for the probes.
In my case, I’ve been working for quite some time on building a framework for Archviz (not a big fan of the “engine” term), which I hope to release here in the medium term. Early in development, I realized that the spatial structure needed to be shared by the lighting techniques, and I later extended this to occlusion culling, collisions… and even spatial audio. This seemed like just another optimization at first, but it eventually became a foundational element for the entire framework, its components, and the organization of the code.
I’m not sure we have a completely clear approach for implementing a GI solution, or if it’s even on the roadmap. WebGPU is still taking shape, and for now, what we’re seeing is people extending it with custom solutions. That’s what Alexander has been doing with Shade (a cone-tracing solution implemented on WebGPU with an SDF resolution of 512x504x512), for example. I think that’s a more concrete approach than waiting for official library support; I could be wrong, but it would be great to hear what others here think.
So I’ve been experimenting further with this approach and made an example to see how it holds up in a scene like Sponza. Here’s a quick breakdown of what I’ve put together:
To handle the scene complexity without heavy ray-tracing, I’m using a voxelized proxy of the scene. It basically creates a 3D grid of color and occupancy that the GPU can march through very fast.
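For readers who want to see what "marching through the grid" can look like, here is a minimal CPU-side sketch of a voxel traversal (an Amanatides-Woo style DDA). It is not the code from this project: the `VoxelGrid` layout, the cubic grid, and the function names are assumptions made for illustration, and a real version would live in a compute shader and read a 3D texture.

```typescript
type Vec3 = [number, number, number];

// Hypothetical layout: a cubic grid with one occupancy byte per cell (non-zero = solid).
interface VoxelGrid {
  size: number;          // cells per axis
  occupancy: Uint8Array; // length size^3
}

function cellIndex(g: VoxelGrid, x: number, y: number, z: number): number {
  return x + g.size * (y + g.size * z);
}

// Walks the grid cell by cell along the ray. Origin and direction are in grid
// units (one cell = one unit). Returns the first solid cell, or null on a miss.
function marchVoxels(g: VoxelGrid, origin: Vec3, dir: Vec3, maxSteps = 256): Vec3 | null {
  const cell: Vec3 = [Math.floor(origin[0]), Math.floor(origin[1]), Math.floor(origin[2])];
  const step: Vec3 = [0, 0, 0];
  const tMax: Vec3 = [Infinity, Infinity, Infinity];   // ray distance to the next cell boundary
  const tDelta: Vec3 = [Infinity, Infinity, Infinity]; // ray distance to cross one full cell
  for (let a = 0; a < 3; a++) {
    if (dir[a] > 0) {
      step[a] = 1;
      tDelta[a] = 1 / dir[a];
      tMax[a] = (cell[a] + 1 - origin[a]) / dir[a];
    } else if (dir[a] < 0) {
      step[a] = -1;
      tDelta[a] = -1 / dir[a];
      tMax[a] = (cell[a] - origin[a]) / dir[a];
    }
  }
  for (let i = 0; i < maxSteps; i++) {
    const [x, y, z] = cell;
    if (x < 0 || y < 0 || z < 0 || x >= g.size || y >= g.size || z >= g.size) return null;
    if (g.occupancy[cellIndex(g, x, y, z)] !== 0) return [x, y, z];
    // Step along whichever axis reaches its next boundary first.
    let a = 0;
    if (tMax[1] < tMax[a]) a = 1;
    if (tMax[2] < tMax[a]) a = 2;
    cell[a] += step[a];
    tMax[a] += tDelta[a];
  }
  return null;
}
```

The appeal of this kind of traversal is that every step is a constant-time array lookup, which is what makes a voxel proxy so much cheaper than tracing against triangles.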
The lighting itself lives in a grid of light probes using Spherical Harmonics (level 2). I’m running a compute shader every frame that shoots rays out from these probes into the voxel grid to calculate both direct light and shadows.
One thing that really helped with performance was adding temporal accumulation. Instead of shooting hundreds of rays at once, I shoot a few jittered rays per frame and smooth the results over time. This also lets me calculate multi-bounce GI by having the probes sample each other during the bake.
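A rough sketch of the accumulation idea, assuming a plain exponential moving average over each probe's stored values. The `blend` factor and both function names are hypothetical and not taken from this project; picking the blend factor well is its own problem.

```typescript
// Blend one frame's noisy probe estimate into the running history in place.
// `blend` is a hypothetical tuning parameter: small = smoother but slower to react.
function accumulateProbe(history: Float32Array, sample: Float32Array, blend: number): void {
  for (let i = 0; i < history.length; i++) {
    history[i] += (sample[i] - history[i]) * blend;
  }
}

// Average a handful of jittered ray results into one per-frame sample.
function averageRays(rays: number[][]): Float32Array {
  const out = new Float32Array(rays[0].length);
  for (const r of rays) {
    for (let i = 0; i < out.length; i++) out[i] += r[i] / rays.length;
  }
  return out;
}
```

With `blend = 0.1`, a probe that starts black and sees constant light reaches about 99% of the final value after roughly 45 frames (since 0.9^45 is about 0.009), which gives a feel for the lag you trade for the lower per-frame ray count.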
On the material side, I’m interpolating the probes at each pixel and added some weight crushing based on the surface normal. This stops light from leaking through thin walls by ignoring probes that are technically “behind” the surface.
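To make the weighting concrete, here is a small sketch in the spirit of DDGI-style normal weighting with a smooth "crush" for small weights. The threshold of 0.2, the wrap-shading curve, and the function names are assumptions for illustration, not this project's actual shader code.

```typescript
type Vec3 = [number, number, number];

const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// Weight for one probe as seen from a surface point with unit normal `normal`.
// Probes in front of the surface get weights near 1; probes behind it fade
// smoothly toward zero instead of being cut off by a hard test.
function probeWeight(probePos: Vec3, surfacePos: Vec3, normal: Vec3, crushThreshold = 0.2): number {
  const d: Vec3 = [probePos[0] - surfacePos[0], probePos[1] - surfacePos[1], probePos[2] - surfacePos[2]];
  const len = Math.hypot(d[0], d[1], d[2]);
  if (len === 0) return 1;
  const wrapped = (dot(d, normal) / len + 1) * 0.5; // remap cosine from [-1, 1] to [0, 1]
  let w = Math.max(1e-4, wrapped * wrapped);
  // "Crush" small weights: continuous at the threshold, falls off cubically below it.
  if (w < crushThreshold) w *= (w * w) / (crushThreshold * crushThreshold);
  return w;
}

// Combine probe values using interpolation weights (for example trilinear)
// multiplied by the normal-based weight, then renormalize.
function blendProbes(values: number[], baseWeights: number[], positions: Vec3[], surfacePos: Vec3, normal: Vec3): number {
  let sum = 0;
  let weightSum = 0;
  for (let i = 0; i < values.length; i++) {
    const w = baseWeights[i] * probeWeight(positions[i], surfacePos, normal);
    sum += values[i] * w;
    weightSum += w;
  }
  return weightSum > 0 ? sum / weightSum : 0;
}
```

In a two-probe case with equal interpolation weights, a lit probe in front of the wall and a dark probe behind it blend to almost exactly the lit value, where plain interpolation would give half of it.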
It’s been a fun challenge getting this to run smoothly in WebGPU!
And the results are
Looks good, and you’re definitely doing the right things:
Doing ray tracing on full geometry is generally too heavy without hardware ray-tracing support, so that’s the right call for sure, and it’s also scalable. Working with a full acceleration structure over triangles also involves a huge amount of engineering effort.
A grid/3D texture is a good practical trade. I’ve gone with tetrahedral meshes, BVHs and a few other things in the past, but being able to get to your probes in O(1) time is very valuable on the GPU, and you can’t beat a grid for that.
L2 is maybe overkill for diffuse-only: you pay for 9 coefficients instead of the 4 of L1, and your ALU load more than doubles. I would probably stick with L1 for this. Visually you’ll see little difference, but you also won’t have to worry about ringing as much, and your perf will essentially double.
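For reference, the 4 versus 9 comes from the coefficient count $(l+1)^2$ per color channel for bands 0 through $l$. Below is a small sketch of evaluating the real SH basis at both orders. The constants are the usual real SH normalization factors, but sign and ordering conventions differ between libraries, so treat the layout here as an assumption rather than what any particular renderer uses.

```typescript
type Vec3 = [number, number, number];

// Coefficients per color channel for bands 0..order: 4 for L1, 9 for L2.
function shCoefficientCount(order: number): number {
  return (order + 1) * (order + 1);
}

// Real SH basis values for a unit direction, up to L1 (4 values) or L2 (9 values).
function shBasis(dir: Vec3, order: 1 | 2): number[] {
  const [x, y, z] = dir;
  const b = [0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x];
  if (order === 2) {
    b.push(
      1.092548 * x * y,
      1.092548 * y * z,
      0.315392 * (3 * z * z - 1),
      1.092548 * x * z,
      0.546274 * (x * x - y * y)
    );
  }
  return b;
}

// Reconstruct one channel in direction `dir`: a dot product of coefficients and basis.
function evalSH(coeffs: number[], dir: Vec3, order: 1 | 2): number {
  const basis = shBasis(dir, order);
  let sum = 0;
  for (let i = 0; i < basis.length; i++) sum += coeffs[i] * basis[i];
  return sum;
}
```

For RGB that is 12 floats per probe at L1 versus 27 at L2, and every lookup does proportionally more multiply-adds, which is where the "more than doubles" comes from.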
Pretty much essential: you can’t afford more than a handful of rays per probe each frame, so you have to accumulate temporally. One thing to note here is that choosing where to shoot rays and what blend factor to use are hard questions. I’ve done accumulation on L2 probes in the past, but it’s hard. If you accumulate, an octahedral texture is a better choice, even if it’s just 4x4 pixels. That would be 16 pixels versus your 9 coefficients of L2, so not too far apart in terms of storage. You then resample the octahedral form to L2 (or L1). But accumulation in spherical harmonic form is something I don’t recommend.
This blurs your signal, so you have to be a bit careful about how you do this, but it’s a pretty standard approach. You get dramatically faster convergence this way, and it’s almost always the correct trade to take.
Weighting probes is an art, and normal-based masking is quite a common technique too. DDGI is probably the most well-known example of this, as that technique has been the basis of many modern probe-based solutions.
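As a reference for the octahedral suggestion, here is a sketch of the standard octahedral mapping between unit directions and a square in [-1, 1]², plus a helper that returns the direction at the center of each texel of a small tile. The function names and the texel-center convention are assumptions for illustration.

```typescript
type Vec2 = [number, number];
type Vec3 = [number, number, number];

const signNotZero = (v: number): number => (v >= 0 ? 1 : -1);

// Unit direction -> point in [-1, 1]^2.
function octEncode(n: Vec3): Vec2 {
  const l1 = Math.abs(n[0]) + Math.abs(n[1]) + Math.abs(n[2]);
  let x = n[0] / l1;
  let y = n[1] / l1;
  if (n[2] < 0) {
    // Fold the lower hemisphere outward over the diagonals.
    const fx = (1 - Math.abs(y)) * signNotZero(x);
    const fy = (1 - Math.abs(x)) * signNotZero(y);
    x = fx;
    y = fy;
  }
  return [x, y];
}

// Point in [-1, 1]^2 -> unit direction.
function octDecode(e: Vec2): Vec3 {
  let x = e[0];
  let y = e[1];
  const z = 1 - Math.abs(x) - Math.abs(y);
  if (z < 0) {
    const fx = (1 - Math.abs(y)) * signNotZero(x);
    const fy = (1 - Math.abs(x)) * signNotZero(y);
    x = fx;
    y = fy;
  }
  const len = Math.hypot(x, y, z);
  return [x / len, y / len, z / len];
}

// Direction at the center of texel (i, j) in a res x res octahedral tile.
function texelDirection(i: number, j: number, res: number): Vec3 {
  return octDecode([((i + 0.5) / res) * 2 - 1, ((j + 0.5) / res) * 2 - 1]);
}
```

Each texel then stores plain radiance for its direction, so a temporal blend per texel is an ordinary average over values of the same kind. Projecting the 16 texels to L1 or L2 happens afterwards, on the already-smoothed data.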
Essentially it’s visibility encoding. If a probe is behind the surface - it couldn’t be visible, and therefore couldn’t contribute to the surface.
This is probably the single most useful trick, because it doesn’t require any new data. You know where the probe is, where the pixel is in relation to it, and what the normal is. “Crushing” versus standard masking is the right choice too, you don’t want to have a discrete visibility function, continuity helps hide cases where this test fails.
I would recommend one more thing that is easy to add:
Check if your probe is trapped just behind a surface, or is lying just on top of it. Those are bad locations to shoot rays from, so push it out, preferably forward a bit. This is the location you will use for shooting rays; you can safely keep using the fixed grid location for sampling. This alone fixes the majority of the nasty artifacts you might get.
Thanks a lot for the detailed feedback! It’s rare to get such high-quality architectural advice, and it’s clear you’ve spent a lot of time in the GI trenches.
You were spot on about the probes getting trapped behind surfaces. That was exactly what was causing my ‘black spot’ artifacts. I’ve now implemented a gradient-based probe offsetting system. It samples the voxel neighborhood during the bake to find the direction towards empty space and pushes the ray origin out before shooting. It has made a night-and-day difference in light-leak stability.
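A simplified sketch of that kind of offsetting, assuming a flat occupancy array and a 3x3x3 neighborhood. The function names, the half-cell `pushDistance`, and the "push away from solid neighbors" formulation are illustrative assumptions, not the exact system described above.

```typescript
type Vec3 = [number, number, number];

// Out-of-bounds cells are treated as empty.
function isSolid(occ: Uint8Array, size: number, x: number, y: number, z: number): boolean {
  if (x < 0 || y < 0 || z < 0 || x >= size || y >= size || z >= size) return false;
  return occ[x + size * (y + size * z)] !== 0;
}

// Returns the position to shoot rays from. Probe positions are in grid units.
// The fixed grid position should still be used when sampling the probe.
function offsetRayOrigin(occ: Uint8Array, size: number, probePos: Vec3, pushDistance = 0.5): Vec3 {
  const cx = Math.floor(probePos[0]);
  const cy = Math.floor(probePos[1]);
  const cz = Math.floor(probePos[2]);
  let gx = 0;
  let gy = 0;
  let gz = 0;
  let solidCount = 0;
  for (let dz = -1; dz <= 1; dz++) {
    for (let dy = -1; dy <= 1; dy++) {
      for (let dx = -1; dx <= 1; dx++) {
        if (isSolid(occ, size, cx + dx, cy + dy, cz + dz)) {
          // Each solid neighbor pushes the origin away from itself.
          solidCount++;
          gx -= dx;
          gy -= dy;
          gz -= dz;
        }
      }
    }
  }
  const len = Math.hypot(gx, gy, gz);
  // Open space, or a symmetric neighborhood with no clear escape direction: leave as is.
  if (solidCount === 0 || len === 0) return [probePos[0], probePos[1], probePos[2]];
  return [
    probePos[0] + (gx / len) * pushDistance,
    probePos[1] + (gy / len) * pushDistance,
    probePos[2] + (gz / len) * pushDistance,
  ];
}
```

A probe sitting in the cell right next to a flat wall gets moved half a cell along the wall's outward direction, while a probe in open space is returned unchanged.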
Regarding the SH L2 vs L1 point: I decided to stick with L2 for a specific reason, which is that I wanted to implement Glossy Indirect Specular. By using the extra directional detail in the L2 coefficients and sampling with the reflection vector (plus a Fresnel/Roughness approximation), I’m now getting some really nice soft reflections on metallic surfaces that were just too ‘flat’ with L1. For my use case, the extra ALU cost feels worth it for that ‘premium’ look!
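One way such a lookup could be put together, as a sketch only: reflect the view vector, bend the lookup direction toward the normal as roughness grows, and scale by a Schlick Fresnel term. The roughness blend and the `evalProbe` callback (standing in for an SH lookup) are assumptions, not the approximation actually used in this project.

```typescript
type Vec3 = [number, number, number];

const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// Mirror an incident direction about a unit normal.
function reflect(incident: Vec3, n: Vec3): Vec3 {
  const d = 2 * dot(incident, n);
  return [incident[0] - d * n[0], incident[1] - d * n[1], incident[2] - d * n[2]];
}

// Schlick's approximation of Fresnel reflectance.
function fresnelSchlick(cosTheta: number, f0: number): number {
  return f0 + (1 - f0) * Math.pow(1 - cosTheta, 5);
}

// viewDir points from the surface toward the camera (unit length).
// evalProbe is a hypothetical callback returning interpolated probe radiance for a direction.
function indirectSpecular(
  viewDir: Vec3,
  normal: Vec3,
  roughness: number,
  f0: number,
  evalProbe: (dir: Vec3) => number
): number {
  const r = reflect([-viewDir[0], -viewDir[1], -viewDir[2]], normal);
  // Assumed roughness handling: rougher surfaces look closer to the normal direction.
  const m: Vec3 = [
    r[0] * (1 - roughness) + normal[0] * roughness,
    r[1] * (1 - roughness) + normal[1] * roughness,
    r[2] * (1 - roughness) + normal[2] * roughness,
  ];
  const len = Math.hypot(m[0], m[1], m[2]) || 1;
  const lookup: Vec3 = [m[0] / len, m[1] / len, m[2] / len];
  const cosTheta = Math.max(0, dot(viewDir, normal));
  return evalProbe(lookup) * fresnelSchlick(cosTheta, f0);
}
```

Low-order SH can only represent very blurry reflections, so this works best on rough or semi-glossy materials rather than mirror-like ones.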
I’m still doing temporal accumulation directly in SH space for now for simplicity, but I’ve noted your warning about ringing and variance. If I hit a wall with stability during fast lighting changes, switching to an Octahedral intermediate texture is definitely next on my roadmap.
Really appreciate the DDGI-style tips. It’s helping me turn this experiment into a much more robust system!