I’m evaluating moving a CAD/BIM viewer (thousands of separate meshes) from WebGLRenderer to WebGPURenderer, and the unified renderer is consistently slower per frame than the classic one — and the gap is the same whether it runs on its WebGPU backend or its forceWebGL backend, which suggests it’s a renderer-class cost, not a backend one.
Live repro (3 renderers, same scene, side by side): https://codepen.io/S-bastien-GOUBIER/pen/bNBaEor
Same synthetic scene in all three (N separate Meshes over a pool of 200 shared geometries, one trivial material, frustumCulled=false, matrixAutoUpdate=false). Only the renderer/material class differs (MeshNormalNodeMaterial vs MeshNormalMaterial). “CPU render” = time spent inside renderer.render(); “first frame” = the cold first render().
| renderer (4,000 meshes, ~8M tris) | CPU render | first frame (cold) | ≈ max fps (1000/CPU) |
|---|---|---|---|
| WebGLRenderer (classic) | 4.5 ms | 32 ms | ~225 |
| WebGPURenderer (WebGPU backend) | 9.7 ms | 167 ms | ~103 |
| WebGPURenderer({forceWebGL:true}) | 10.3 ms | 344 ms | ~97 |
(The three run on one rAF so their displayed FPS is shared/identical — that’s why I report CPU-render time and a derived ≈ max fps, which are per-render()-call and valid concurrently. Scaling the scene up to ~9k meshes widens the gap further.)
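For anyone reproducing the numbers, here is a minimal sketch of how the two timing columns and the derived ≈ max fps can be computed. The helper names `measureRender` and `approxMaxFps` are hypothetical and not taken from the CodePen. The sketch assumes `renderer.render(scene, camera)` is synchronous (for WebGPURenderer that means the renderer was initialized beforehand). It captures only CPU time spent inside the call, not GPU time.

```javascript
// Hypothetical helpers for illustration; not copied from the CodePen.

// Derived "≈ max fps" column: 1000 ms divided by CPU ms per render() call.
function approxMaxFps(cpuRenderMs) {
  return 1000 / cpuRenderMs;
}

// Times the cold first render() separately, then averages the warm frames.
function measureRender(renderer, scene, camera, frames = 100) {
  let t0 = performance.now();
  renderer.render(scene, camera); // cold first frame
  const firstFrameMs = performance.now() - t0;

  let total = 0;
  for (let i = 0; i < frames; i++) {
    t0 = performance.now();
    renderer.render(scene, camera); // warm frame, CPU side only
    total += performance.now() - t0;
  }
  const cpuRenderMs = total / frames;

  return { firstFrameMs, cpuRenderMs };
}
```

Because each call is timed individually, the figures stay comparable even when several renderers share one requestAnimationFrame loop.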
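So on this workload the unified renderer costs roughly 2× the CPU per frame on either backend (9.7 ms and 10.3 ms vs 4.5 ms), and its cold first frame is ~5× slower on the WebGPU backend (167 ms vs 32 ms) and ~11× slower with forceWebGL (344 ms vs 32 ms).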
Questions:
- Is the ~2× higher per-draw CPU cost of WebGPURenderer vs the classic WebGLRenderer expected (on both backends)?
- The cold first frame is much slower, presumably because of lazy RenderObject / bind-group creation per object. Is there a recommended way to pre-warm or amortize it to improve time-to-first-paint with thousands of meshes?
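One candidate for the pre-warm question is `compileAsync(scene, camera)`, which both renderer classes expose. Whether it also covers the per-object RenderObject and bind-group setup in WebGPURenderer is an assumption that has not been verified here, so the sketch below only shows how the effect could be measured. The helper name `prewarmThenRender` is hypothetical.

```javascript
// Hypothetical helper for illustration. Assumption (unverified): compileAsync()
// moves a meaningful share of the cold-frame cost ahead of the first render().
async function prewarmThenRender(renderer, scene, camera) {
  // WebGPURenderer has an async init(); the classic WebGLRenderer does not.
  if (typeof renderer.init === "function") {
    await renderer.init();
  }

  // Compile ahead of the first visible frame and time it separately.
  const t0 = performance.now();
  if (typeof renderer.compileAsync === "function") {
    await renderer.compileAsync(scene, camera);
  }
  const compileMs = performance.now() - t0;

  // The first render() after pre-warming: compare with the cold figure.
  const t1 = performance.now();
  renderer.render(scene, camera);
  const firstFrameMs = performance.now() - t1;

  return { compileMs, firstFrameMs };
}
```

If `firstFrameMs` stays close to the cold numbers in the table, the cost is probably not in pipeline compilation.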
Environment: three.js r183 (0.183.2), Chrome 148.0.7778.216 (64-bit), Windows 11, RTX 4090 Laptop.