Reducing shader compile time on scene initialization

I’m rendering a room interior with around 20 mesh objects and around 30 unique materials.
On top of that I’m enabling shadows.
The result is an enormous initialization wait of about 3.5 seconds on my machine, and a less powerful machine can take considerably longer.

The scene is loaded from a glTF. At first I thought the long wait was the fault of the Draco decompression, but a quick inspection revealed it was more likely the number of MeshPhysicalMaterials with textures being compiled:

Each material usually contains 3 texture maps: color, normal, and roughness. Not sure if that’s important.

It’s vital that the app can load quickly. What can I do to reduce the initialization time?

The model should not be bigger than 2-3 MB and textures should be 1-2k; you can box/atlas map all materials so that you only have 3 textures for the whole model.

Look into gltf-transform first and see if that helps; atlasing is done in Blender.
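
For reference, a compression pass with the gltf-transform Node API might look roughly like this (a minimal sketch; the file names are placeholders, and it assumes the textureCompress and draco transforms from @gltf-transform/functions with sharp and draco3dgltf installed):

import { NodeIO } from '@gltf-transform/core';
import { KHRONOS_EXTENSIONS } from '@gltf-transform/extensions';
import { textureCompress, draco } from '@gltf-transform/functions';
import draco3d from 'draco3dgltf';
import sharp from 'sharp';

// Register extensions and the Draco codec so compressed assets can be read and written.
const io = new NodeIO()
  .registerExtensions(KHRONOS_EXTENSIONS)
  .registerDependencies({
    'draco3d.encoder': await draco3d.createEncoderModule(),
    'draco3d.decoder': await draco3d.createDecoderModule(),
  });

const document = await io.read('scene.gltf');
await document.transform(
  textureCompress({ encoder: sharp, targetFormat: 'webp' }), // re-encode textures as WebP
  draco()                                                    // Draco-compress the geometry
);
await io.write('scene_compressed.glb', document);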

My model meshes are quite highly optimized; an entire scene with many complex objects amounts to around 50,000 tris.
The majority of textures are 512×512 or 256×256, with some 1k textures for the floors.
I’m converting my textures to WebP format, which also cuts down texture sizes substantially.
The resulting glTF is then piped into gltf-transform and I get my compressed GLB.

The bottleneck appears to be just the number of individual materials with textures. Everything else I think is pretty well optimized.

Ideally I want to convert my scene into 3 objects / materials:

  • floor (material with 1 large seamless repeating texture)
  • walls (material with no texture)
  • objects (all other meshes combined, with baked materials, texture atlas maps)

Is there a blender addon you or anyone else could recommend that can combine a load of objects / materials into a few maps?
Or perhaps a program that can process GLTF models and produce single material objects in the same way?

3.5 seconds seems like a lot to compile 30 materials – this might be a browser issue.

The same issue presented itself to several other users testing the app on various browsers so I don’t think so.

For me, there’s also a period after it’s loaded where it can be a bit choppy, freezing occasionally, before finally running smoothly. Perhaps this is due to the camera culling certain objects out of frame; when they appear in frame for the first time, their shaders are compiled.
If I’m watching a video on YouTube in another window, I’ve noticed the video freeze during this process.
If I use environment maps, the delay while compiling shaders is even longer.
And if I turn off soft light shadows, it’s a bit quicker to compile those shaders.

I figured three.js has to compile a new shader every time a PBR material is rendered using a different set of textures. And it appears that the more textures there are, the longer it takes.

Yes, three.js may have to compile multiple shader programs from the same base type of material, depending on the parameters given to the material. 3.5s for 30 materials still sounds unusually slow to me, suggestive of a problem and not normal timing. And so I’m hesitant to guess what’s going on or give advice, without a way to reproduce the issue.

And it appears that the more textures there are, the longer it takes.

I don’t see it in your screenshot above, but uploading textures to the GPU is often a huge initialization cost. This is a function of image resolution, not file size – a single 4K JPEG texture is around 90MB of data sent to the GPU, regardless of size on disk.
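
As a back-of-the-envelope check (assuming an uncompressed 8-bit RGBA upload, which is what a decoded JPEG becomes):

const base = 4096 * 4096 * 4;   // 4K RGBA: 67,108,864 bytes ≈ 67 MB
const withMips = base * 4 / 3;  // the mipmap chain adds ~1/3 more ≈ 90 MB total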

If you get a heavy program compilation delay, you might check whether your materials are setting properties that are passed as definitions/constants to the shader.

The number of materials isn’t the issue; it’s about which features they use (which maps, for instance) and attributes like alphaTest. If you used a different alphaTest value on 3 materials, it would result in 3 programs being compiled. Programs are shared by materials depending on their key (the code/features used). A material using just albedo while another uses albedo + normal + roughness gives you 2 different programs; another using just albedo but also transparent will be yet another.

If you want to avoid different program setups, you could set the map properties a material doesn’t use to null or an empty texture, matching the other materials. And of course ensure there are no different constants set up; this way you reduce the programs being compiled to a minimum.
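
A quick way to see this in practice (a sketch, assuming an existing renderer, scene and camera with the meshes already added): render once and inspect renderer.info.programs, which three.js exposes as an array of compiled programs.

// These two differ only in uniform values, so they share one program.
const matA = new THREE.MeshStandardMaterial({ color: 0xff0000, roughness: 1.0 });
const matB = new THREE.MeshStandardMaterial({ color: 0x00ff00, roughness: 0.3 });
// alphaTest behaves like a constant, so this one gets a program of its own.
const matC = new THREE.MeshStandardMaterial({ color: 0xff0000, alphaTest: 0.5 });

renderer.render(scene, camera);
console.log(renderer.info.programs.length); // shared programs, not one per material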

Try renderer.debug.checkShaderErrors = false

I tried renderer.debug.checkShaderErrors = false, no noticeable improvement.

Prior to rendering a loaded scene, I’ve been playing around with materials, replacing and simplifying them and observing the shader compile time.
I’ve managed to reduce the shader compile time of my most complex scene down to around 0.85s.
I’ve come to some conclusions:

  • MeshStandardMaterial shaders compile a fair bit quicker than MeshPhysicalMaterial ones; given the relative simplicity of the scene, it was possible to replace all MeshPhysicalMaterials with MeshStandardMaterial.
  • MeshPhysicalMaterial.transmission comes at a huge performance cost. I had 2 very small glass objects in my room with an assigned transmission and it resulted in doubling the total scene’s shader compile time.
  • Calling THREE.WebGLRenderer.compile() followed by THREE.WebGLRenderer.render() outside of the animation loop appeared to help a little, rather than letting it happen automatically in the animation loop (see the sketch after this list).
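
A minimal sketch of that pre-compile step, done once after loading and before starting the loop:

scene.add(gltf.scene);
renderer.compile(scene, camera);  // compile all programs up front...
renderer.render(scene, camera);   // ...and force the first-frame texture/buffer uploads
requestAnimationFrame(animate);   // only then start the animation loop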

I tried what Fyrestar suggested but I couldn’t get it to compile any fewer shaders.
Even after replacing all the materials with MeshStandardMaterials, I was traversing the scene for all materials and doing something like this:

// Force every unused map slot to null (rather than undefined) and clear any
// custom defines, hoping all materials end up with the same program key.
gltf.scenes.forEach(s=>s.traverse((o)=>{
  if (!o.material) return;
  var m = o.material;
  m.map = m.map || null;
  m.aoMap = m.aoMap || null;
  m.normalMap = m.normalMap || null;
  m.emissiveMap = m.emissiveMap || null;
  m.roughnessMap = m.roughnessMap || null;
  m.metalnessMap = m.metalnessMap || null;
  m.alphaMap = m.alphaMap || null;
  m.defines = {};
}));

I was trying to make every material use the same shader (assuming setting maps to null instead of undefined made a difference) and removing the defines property, but I could not achieve this without removing all the texture maps and changing almost all the parameters to the same values. Even then it was still a bit slower than simply setting every object’s material to explicitly the same material, which is almost instantaneous.

I’m still curious about 2 things:

  • Is it possible to compile shaders in a separate thread so it doesn’t affect the animation loop? I’d like to explore the possibility of having the shaders compile ‘offscreen’, as it were, maybe replacing placeholder materials in the scene once the associated shader is compiled.
    Currently there is an unavoidable pause in the animation loop and it would be great if I could avoid that.
  • Still wondering if there is a nifty tool for texture atlas mapping, so I can bake all my materials (except the floor) into a few texture maps (probably 2k each?)
    Not only would that lower the number of shaders, it would also massively reduce the draw count.

Many thanks for all the help and suggestions so far. Very happy I chose Three.js over other libraries.

I’ve attached 2 versions of a scene below (compressed & uncompressed) for others to check out:

lounge_compressed.glb (1.2 MB)
lounge.glb (3.6 MB)

Unfortunately not currently. See Implement KHR_parallel_shader_compile support · Issue #16321 · mrdoob/three.js · GitHub.

Still wondering if there is a nifty tool for texture atlas mapping…

Maybe RapidCompact, or the SimpleBake addon for Blender? I haven’t done much with texture atlases myself though.

Thanks @donmccurdy. SimpleBake looks good, I will consider purchasing that, but I might have a go at coding something myself first.

I was confused about what @Fyrestar meant by “ensure there are no different constants set up”.
At first I thought it meant the defines parameter; later I considered it could mean all the numeric values like roughness, metalness, etc.

Following that logic then:
new MeshStandardMaterial({color:"#ffffff", metalness:1})
and
new MeshStandardMaterial({color:"#ffffff", metalness:0.99})
would produce 2 different shader programs if assigned to 2 meshes. Is that correct?

But what about simply:
new MeshStandardMaterial({color:"#ffffff"})
and
new MeshStandardMaterial({color:"#eeeeee"})

Would this produce 2 separate shader programs?
I assumed all of these parameters were given to a single shader program during the rendering phase.

Or is it just specific material parameters (like all the texture params) that prompt a new shader program to be generated? If so, could someone provide a list?

If these points could be clarified it would be very helpful.

I decided to try another browser.
What takes 0.8 seconds in Chrome takes 5 times as long in Firefox:

Chrome was “unusually slow”, taking 3.5 seconds to load 30 materials, before I optimized it.
Firefox took about 4 seconds to initialize the shaders after optimization.
Any idea why there’s such a large discrepancy between browsers?

It’s not about uniforms, only constants (defines, or values that are set up like defines, such as alphaTest).
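
Applied to the earlier example: color and metalness are plain uniforms, so those two materials should share one program. A sketch to verify (assuming an existing geometry, scene, camera and renderer):

const a = new THREE.MeshStandardMaterial({ color: '#ffffff', metalness: 1 });
const b = new THREE.MeshStandardMaterial({ color: '#eeeeee', metalness: 0.99 });
scene.add(new THREE.Mesh(geometry, a), new THREE.Mesh(geometry, b));
renderer.render(scene, camera);
// Expect one shared program for both meshes (plus any shadow/background programs).
console.log(renderer.info.programs.length);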

Still looking for answers regarding the renderer.compile duration difference between chrome and firefox.

When my 3D app is loading, a spinner is visible to indicate loading.
In Chrome the spinner stutters immediately when the shaders start compiling, indicating it’s doing some heavy processing the moment renderer.compile is called, interrupting the animation.
In Firefox the spinner keeps spinning consistently for about 3 seconds, and then it stutters.
This would appear to indicate Firefox is doing things a bit more lazily. I don’t suppose there is some sure-fire way to ensure Firefox doesn’t dawdle like it appears to be doing?

three.js r158 will support KHR_parallel_shader_compile, so shaders can compile without blocking the main thread.
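
Usage looks roughly like this (a sketch based on the r158 WebGLRenderer.compileAsync signature, which returns a promise):

// Compile the loaded model's shaders against the target scene without blocking,
// then add it and render.
await renderer.compileAsync(gltf.scene, camera, scene);
scene.add(gltf.scene);
renderer.render(scene, camera);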

I’m not aware of any way to influence Firefox shader compile times otherwise, that’s a black box to me. If a browser’s WebGL implementation is particularly slow compared to others, it’s usually necessary to file a bug with the browser.

Started a new thread with a link to the app that is now live:

@dllb is reporting some very troubling statistics though. For them it takes a whopping 15-22 seconds in Firefox to load the scene after downloading.

Since last posting on this thread I’ve managed to get it down to around 2 seconds in Firefox and 0.5 in Chrome by following @Fyrestar’s advice and essentially making every material use the same constants, resulting in far fewer shaders. I run the following code on the glTF resource before adding it to the scene:

// Build a tiny 2x2 solid-color texture, used to fill otherwise-unused map slots.
var create_color_texture = (col)=>{
  /** @type {HTMLCanvasElement} */
  var canvas = document.createElement("CANVAS");
  canvas.width = canvas.height = 2;
  /** @type {CanvasRenderingContext2D } */
  var ctx = canvas.getContext("2d");
  ctx.fillStyle = col;
  ctx.fillRect(0,0,canvas.width, canvas.height);
  var tex = new THREE.CanvasTexture(canvas);
  tex.format = THREE.RGBAFormat;
  tex.wrapS = THREE.ClampToEdgeWrapping;
  tex.wrapT = THREE.ClampToEdgeWrapping;
  return tex;
};
var black_tex =  create_color_texture("black");
var white_tex =  create_color_texture("white");
var normal_tex = create_color_texture("#8080ff");

/** @param {THREE.Color} c */
var is_black = (c)=>{
  return c.r == 0 && c.g == 0 && c.b == 0;
};

var new_materials = {}; // cache of deduplicated materials, keyed by their resolved properties

/** @param {THREE.MeshPhysicalMaterial} m */
var get_material = (m)=>{
  /** @type {THREE.MeshPhysicalMaterialParameters} */
  var props = {
    color: m.color,
    map: m.map || white_tex,
    normalMap: m.normalMap || normal_tex,
    normalScale: m.normalScale,
    emissive: m.emissive,
    emissiveMap: m.emissiveMap || (is_black(m.emissive) ? black_tex : white_tex),
    emissiveIntensity: m.emissiveIntensity,
    roughnessMap: m.roughnessMap || white_tex,
    roughness: m.roughness,
    metalnessMap: m.metalnessMap || white_tex,
    metalness: m.metalness,
    envMap: this.env_map || null,
    envMapIntensity: m.metalness ? 1 : 0,
    side: THREE.DoubleSide,
    transparent: m.transparent,
    opacity: m.opacity,
    vertexColors: true,
    flatShading:false,
  };
  // Hash the resolved properties (textures by uuid) so identical setups share one material.
  var hash = JSON.stringify(Object.values(props).concat(m.transmission || 0).map(v=>{
    if (v && v.uuid) return v.uuid;
    return v;
  }));
  if (!new_materials[hash]) {
    // Only glass (transmission > 0) needs the heavier MeshPhysicalMaterial.
    if (m.transmission) props.transmission = m.transmission;
    var type = m.transmission ? THREE.MeshPhysicalMaterial : THREE.MeshStandardMaterial;
    new_materials[hash] = new type(props);
  }
  return new_materials[hash];
}

// Swap every mesh's material for a deduplicated one, and guarantee a vertex
// color attribute exists so vertexColors:true works for every geometry.
gltf.scenes.forEach(s=>s.traverse((o)=>{
  if (o.material) {
    o.material = get_material(o.material);
  }
  if (o.geometry) {
    /** @type {THREE.BufferGeometry} */
    var geom = o.geometry;
    var attrib = geom.getAttribute('color');
    if (!attrib) {
      geom.setAttribute("color", new THREE.BufferAttribute(new Float32Array(3 * geom.attributes.position.count).fill(1), 3));
    }
  }
}));

On every machine I’ve tested it on since optimizing the shader compilation it’s been initializing much faster, so what’s going on with @dllb I have no idea.

Obviously what I mentioned from the start:

JavaScript has to be compiled before it runs, and all objects loaded and processed, so performance will suffer on a weak PC, which is obviously the main cause here, as the internet connection is quite fast.

The CPU is a Core 2 Duo (old), so it is obviously the exception, not the rule.
I believe not many people still use a Core 2 Duo (I searched for statistics but couldn’t find any free ones).
I keep it in order to keep optimizing the code, as it is still more powerful (per core) than my recent mid-range cellphone. Otherwise I would be spoiled into approving almost anything I code on the first try!

On the second computer which is recent and powerful, the loading times were very short as you saw on the profile I uploaded for you.

OK, that’s good to know.
When you said ‘low-range’ I didn’t realize it was quite that old and low-end.
You’ve been very helpful as it is, but would it be possible to record the low-range PC’s performance again, monitoring the main process? As I said in the other thread, it doesn’t appear to have recorded any function calls relating to the app itself, so it’s difficult to know where to focus my optimization efforts.
I’m currently re-working the generation of the wall stickers, which are currently SVGs converted to meshes on-the-fly.
I suspect your machine was probably struggling during the whole SVGLoader phase, but it would be very helpful to see what portion of time was spent on each phase:

  • converting the sticker SVG to mesh
  • draco decompression
  • GLTF processing
  • scene initialization
  • materials/shaders

Many thanks.

Had an issue with compiling causing delays earlier today, so I upgraded three.js to r158 to use compileAsync. However, this only seems to work properly in Safari; all other browsers throw a three.js warning, "KHR_parallel_shader_compile not supported". Which is weird, as MDN and caniuse both state support in all browsers but Firefox.
Noticed there is a workaround loop in the compileAsync() promise, but that also does not seem to work properly - it instantly resolves.

Is this a known issue or am I doing something wrong?

Typically with WebGL extensions, support from the browser is necessary but not sufficient. You also need support in the GPU, drivers, and operating system (exactly which of these, I’m not sure…). See WebGL Report to learn which extensions your device really supports.
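
You can also query the live context directly (a sketch; getExtension returns null when the device or driver doesn't expose the extension):

const gl = renderer.getContext();
console.log('parallel compile:', gl.getExtension('KHR_parallel_shader_compile') !== null);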

But if the workaround in compileAsync fails, that sounds like a bug; I would encourage filing an issue or a PR on the three.js repo if that is still happening on the newest three.js release!