There are 3 ways to cache probe lookups:
- Cache nearby probes on the mesh
- Cache probes in screen space
- Cache probes in world space
To make this make sense, let’s step back for a second and understand what an “irradiance volume” is.
We have lightmaps, which encode irradiance on the surface. That is - given a point on the surface of a mesh - we store light contribution.
Volumes are different: we store irradiance for a point in space instead of at a surface.
Volumes have the benefit of not caring about the surfaces. That is - we can resample them for novel things and irradiance will be correct… -ish; the larger the objects we introduce into a baked irradiance volume, the more “wrong” the volume becomes.
A volume also doesn’t care about the number of meshes or their complexity. Conversely, a lightmap will need some space for each mesh, and vertex-encoded lighting will need to store some data for each vertex of each mesh instance.
So, back to caching. Irradiance volumes are cool for their interpolation property. When we “cache”, we want to cache a volume “cell”, not just a single sample. If our volume is a tetrahedral mesh, a single cell will point to 4 probes (making up a tetrahedron). If your volume is some kind of regular grid (a 3d texture), the cell will just be the nearest voxel ID in the grid.
If your volume is a grid - I’d say just don’t bother to cache, as computing the cell is trivial.
For a tetrahedral mesh, we can store the nearest cell ID at some level of granularity. We can store 1 cell for the entire instance, or we can store the nearest cell on a per-vertex basis. Note that whatever you choose, you have to consider that you need to update that cache when the mesh moves or animates. If you don’t, the usefulness of the cache will deteriorate and it will offer less and less benefit.
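As a sketch of what a per-instance cache with a movement-based refresh heuristic might look like (the struct, field names and threshold here are all made up for illustration, not from any particular engine):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Hypothetical per-instance cache entry: the last resolved cell plus the
// position at which it was resolved.
struct InstanceProbeCache {
    int  cachedCell = -1;       // -1 means "never resolved"
    Vec3 cachedPos  = {0, 0, 0};
};

// Refresh the cached cell once the instance has moved far enough that the
// cached cell is likely a poor walk starting point. The threshold is an
// assumption; in practice you'd tie it to your probe spacing.
bool needsRefresh(const InstanceProbeCache& c, Vec3 pos, float threshold) {
    if (c.cachedCell < 0) return true;
    float dx = pos.x - c.cachedPos.x;
    float dy = pos.y - c.cachedPos.y;
    float dz = pos.z - c.cachedPos.z;
    return dx * dx + dy * dy + dz * dz > threshold * threshold;
}
```

The same idea extends to a per-vertex cache; you just pay the refresh cost per vertex instead of per instance.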
How do we go from our cached cell to the actual set of probes when shading a texel? We start from the cached cell and a point in world space, and we walk our tetrahedral mesh until we land in a cell that definitely contains our world point.
This is fairly trivial, Unity has a good explanation on this topic.
- You essentially compute barycentrics for a tetrahedron
- If the barycentrics (u, v, w, c) are all positive and they add up to 1 - you’re in the right cell
- If your barycentrics are off, move to the neighbour across the face opposite the most negative barycentric coordinate
- Repeat the process until you arrive at the correct cell, or until you hit some step limit, in which case you give up
This seems dumb, but it works really well, and you typically find the right cell in just a handful of steps given a cached starting point.
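The steps above can be sketched on the CPU side like this, assuming each cell stores its 4 probe indices and, for each vertex, the neighbouring cell across the opposite face (the struct layout and names are illustrative, not from any particular engine):

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3  cross(Vec3 a, Vec3 b) { return {a.y * b.z - a.z * b.y,
                                             a.z * b.x - a.x * b.z,
                                             a.x * b.y - a.y * b.x}; }
static float dot(Vec3 a, Vec3 b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Cell {
    std::array<int, 4> probe;    // indices into the probe position array
    std::array<int, 4> neighbor; // cell across the face opposite each vertex, -1 at the hull
};

// Signed volume of tetrahedron (a, b, c, d).
static float signedVolume(Vec3 a, Vec3 b, Vec3 c, Vec3 d) {
    return dot(sub(b, a), cross(sub(c, a), sub(d, a))) / 6.0f;
}

// Barycentrics of p in a cell: each coordinate is the ratio of the sub-tetra
// volume (one vertex replaced by p) to the full cell volume.
static std::array<float, 4> barycentrics(const std::vector<Vec3>& probes,
                                         const Cell& c, Vec3 p) {
    Vec3 v0 = probes[c.probe[0]], v1 = probes[c.probe[1]];
    Vec3 v2 = probes[c.probe[2]], v3 = probes[c.probe[3]];
    float vol = signedVolume(v0, v1, v2, v3);
    return {signedVolume(p, v1, v2, v3) / vol,
            signedVolume(v0, p, v2, v3) / vol,
            signedVolume(v0, v1, p, v3) / vol,
            signedVolume(v0, v1, v2, p) / vol};
}

// Walk from cachedCell toward the cell containing p. Returns the found cell,
// or the last visited cell if we leave the hull or run out of steps.
int findCell(const std::vector<Vec3>& probes, const std::vector<Cell>& cells,
             int cachedCell, Vec3 p, int maxSteps = 32) {
    int cell = cachedCell;
    for (int step = 0; step < maxSteps; ++step) {
        std::array<float, 4> b = barycentrics(probes, cells[cell], p);
        int worst = 0;
        for (int i = 1; i < 4; ++i)
            if (b[i] < b[worst]) worst = i;
        if (b[worst] >= 0.0f) return cell;      // all coords non-negative: inside
        int next = cells[cell].neighbor[worst]; // cross the face opposite the most negative coord
        if (next < 0) return cell;              // left the volume: clamp to the hull cell
        cell = next;
    }
    return cell;
}
```

Note the barycentrics double as the interpolation weights for the 4 probes once the right cell is found, so the walk and the blend share the same math.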
As for where to resolve the cell into lighting, there are 3 ways here as well:
1. We can resolve at the mesh origin; that’s just 1 point for the entire mesh
2. We can resolve at each vertex, and pass the resolved value on to the pixel shader. We can be a bit more clever here and resolve into some kind of directional format, such as spherical harmonics, which we can interpolate in the pixel shader to capture extra directional information.
3. We can resolve in the pixel shader.
You can probably find it easy to believe that 1. is the fastest, as we do only 1 resolve, but it is also the least accurate, and that 3. is the slowest but captures the stored volume information perfectly.
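The reason spherical harmonics work so well for option 2 is linearity: interpolating the SH coefficients and then evaluating gives the same result as evaluating at each vertex and interpolating, so the rasterizer can do the blending for free. A minimal sketch with L1 (4-coefficient) SH, one colour channel; the basis constants are the standard SH ones:

```cpp
#include <cassert>
#include <cmath>

// One colour channel of L1 spherical harmonics: 4 coefficients.
struct SH4 { float c[4]; };

// SH is linear, so per-vertex SH resolved from the volume can simply be
// lerped (by the rasterizer, or manually) before evaluation per pixel.
SH4 lerpSH(const SH4& a, const SH4& b, float t) {
    SH4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = a.c[i] + (b.c[i] - a.c[i]) * t;
    return r;
}

// Evaluate L1 SH in unit direction (x, y, z) using the standard basis:
// Y00 = 0.282095, Y1-1 = 0.488603*y, Y10 = 0.488603*z, Y11 = 0.488603*x.
float evalSH(const SH4& sh, float x, float y, float z) {
    return sh.c[0] * 0.282095f
         + sh.c[1] * 0.488603f * y
         + sh.c[2] * 0.488603f * z
         + sh.c[3] * 0.488603f * x;
}
```

A full implementation would carry three SH4s (one per colour channel) and likely use L2 (9 coefficients), but the structure is the same.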
Screen-space caching is basically just doing a very low-resolution render pass with your depth buffer and finding the nearest volume cell for each pixel. This generally gets you quite far.
If you don’t have a deferred rendering pipeline, you can also do this at the end of the frame and accept a 1-frame discrepancy, sampling your cache from the previous frame. It’s pretty good most of the time.
3d caching is basically the same as screen-space, but instead of building a 2d texture, you build a 3d texture in frustum space - a “froxel” texture, if you will. This one has the advantage of capturing information for things that don’t have depth, such as volumetric effects, particles and transparent/translucent surfaces.
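A sketch of the froxel addressing math, assuming the common logarithmic depth-slice distribution (the grid struct, its dimensions and the near/far planes are made up for illustration):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical froxel grid description.
struct FroxelGrid { int nx, ny, nz; float nearZ, farZ; };

static int clampi(int v, int lo, int hi) { return v < lo ? lo : v > hi ? hi : v; }

// Map NDC xy (in [-1, 1]) plus view-space depth to a flat froxel index.
// Depth slices use a logarithmic distribution so near slices stay thin.
int froxelIndex(const FroxelGrid& g, float ndcX, float ndcY, float viewZ) {
    int x = clampi((int)((ndcX * 0.5f + 0.5f) * g.nx), 0, g.nx - 1);
    int y = clampi((int)((ndcY * 0.5f + 0.5f) * g.ny), 0, g.ny - 1);
    float slice = std::log(viewZ / g.nearZ) / std::log(g.farZ / g.nearZ); // in [0, 1]
    int z = clampi((int)(slice * g.nz), 0, g.nz - 1);
    return (z * g.ny + y) * g.nx + x;
}
```

At cache-build time you run the cell search once per froxel centre; at shading time anything in the frustum (opaque, transparent, or volumetric) fetches its cell with this index.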
I didn’t bother implementing mesh-level caches, but I’ve tried all of the other options.
I also made a BVH-based lookup, where each cell is given an AABB and we use a BVH to do resolution; this makes resolution O(log n) instead of linear. This is what I’m currently using, as it appears to offer the best scalability. Your mileage may vary though.
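That lookup might be sketched as a median-split build over per-cell AABBs plus a point query that only descends into nodes whose boxes contain the point (illustrative, recursive, CPU-side code; a GPU version would flatten this into an iterative stack walk):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 lo, hi; };

static bool contains(const AABB& b, Vec3 p) {
    return p.x >= b.lo.x && p.x <= b.hi.x &&
           p.y >= b.lo.y && p.y <= b.hi.y &&
           p.z >= b.lo.z && p.z <= b.hi.z;
}
static AABB merge(const AABB& a, const AABB& b) {
    return {{std::min(a.lo.x, b.lo.x), std::min(a.lo.y, b.lo.y), std::min(a.lo.z, b.lo.z)},
            {std::max(a.hi.x, b.hi.x), std::max(a.hi.y, b.hi.y), std::max(a.hi.z, b.hi.z)}};
}
static float centroidAxis(const AABB& b, int axis) {
    return axis == 0 ? b.lo.x + b.hi.x : axis == 1 ? b.lo.y + b.hi.y : b.lo.z + b.hi.z;
}

struct BvhNode { AABB box; int left, right, cell; }; // leaf when cell >= 0
struct Bvh { std::vector<BvhNode> nodes; int root = -1; };

// Recursive median-split build over (AABB, cellId) pairs; reorders `items`.
// Returns the index of the created node.
int build(Bvh& bvh, std::vector<std::pair<AABB, int>>& items, int begin, int end) {
    AABB box = items[begin].first;
    for (int i = begin + 1; i < end; ++i) box = merge(box, items[i].first);
    int idx = (int)bvh.nodes.size();
    bvh.nodes.push_back({box, -1, -1, -1});
    if (end - begin == 1) { bvh.nodes[idx].cell = items[begin].second; return idx; }
    // Split at the median along the longest axis of the node's bounds.
    Vec3 ext = {box.hi.x - box.lo.x, box.hi.y - box.lo.y, box.hi.z - box.lo.z};
    int axis = ext.x > ext.y ? (ext.x > ext.z ? 0 : 2) : (ext.y > ext.z ? 1 : 2);
    int mid = (begin + end) / 2;
    std::nth_element(items.begin() + begin, items.begin() + mid, items.begin() + end,
        [axis](const std::pair<AABB, int>& a, const std::pair<AABB, int>& b) {
            return centroidAxis(a.first, axis) < centroidAxis(b.first, axis);
        });
    int l = build(bvh, items, begin, mid);
    int r = build(bvh, items, mid, end);
    bvh.nodes[idx].left = l;
    bvh.nodes[idx].right = r;
    return idx;
}

// Return the first leaf cell whose AABB contains p, or -1 if none does.
int query(const Bvh& bvh, int node, Vec3 p) {
    if (node < 0 || !contains(bvh.nodes[node].box, p)) return -1;
    if (bvh.nodes[node].cell >= 0) return bvh.nodes[node].cell;
    int hit = query(bvh, bvh.nodes[node].left, p);
    return hit >= 0 ? hit : query(bvh, bvh.nodes[node].right, p);
}
```

Since a point can sit inside several overlapping cell AABBs, a production version would verify the candidate with the barycentric test and keep descending on a miss; the sketch just takes the first containing leaf.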