InstancedBufferGeometry Overhead

Hi,

I am working on a three.js project which shows highly detailed 3D models. These models contain a lot of parts and are pretty much static.

After some research, I came across the concept of instancing, which seemed very interesting.

I have a system that removes duplicate geometries and figures out the relative transform matrix, effectively reducing the size of the model.

I have the InstancedBufferGeometry system working perfectly. However, there are still a lot of geometries (roughly 1500), which combined with the shader make up about 3000 draw calls.

It's currently sitting at roughly 30fps on my desktop, but I need this to be roughly the minimum fps on mobile too. It's still early stages, but I'm starting to wonder if this is the right approach or whether I should be doing things differently.

Is there a downside to having too many instanced objects?

In terms of performance, would it be best to continue this approach and reduce the triangle count of the geos (decimating, for example), or instead to reduce the draw calls by combining the geometries that will use similar materials (and avoiding instancing)?

I hope this makes sense... I only just started to get into realtime development a few months ago and am still getting my head around several concepts. Happy to provide more info if it helps make more sense of this :slightly_smiling_face:

Thanks!
Ant

2 Likes

In my limited experience, instancing works amazingly well for very low poly geometry (grass in my case). I've attempted it with more complex geometry and it seems to be not much more efficient than without instancing, but I'm not sure about your exact case. Give it a try =]

2 Likes

Thanks for your input. Good to know. My geometry is pretty complex at the moment, so I'm wondering if my next focus should be on simplifying it or on something else?

I will likely continue this approach a little further, but I'm a little lost as to where to optimize it. Currently, it draws fine but with a basic shader. I will have to feed envmaps for reflections and use a much more interesting shader to be certain I'm doing what's best.

Currently, the only obvious benefit from this is a smaller total file size due to fewer geometries.

Since everything needs to be loaded only once, the other idea I had was to use the same files, but make each instance a BufferGeometry, and merge them based on common materials. This would reduce the draw calls.

Currently, there are many different instances that will get the same material treatment.

You can load the geometry once and use it on as many separate meshes as you want. This will reduce memory overhead.

Also, if those meshes are non-dynamic you can merge them into a single geometry to save draw calls. However, there are drawbacks to this, as each individual object is no longer frustum culled, which brings additional overhead (can be significant). Same situation with instancing: no frustum culling advantage unless you split the instances/merged geometry into "chunks". I typically use an 8x8 grid to do this within my scene.
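
The chunking idea can be sketched in plain JavaScript with no three.js dependency. The function name `chunkByGrid`, the `bounds` object and the choice of the XZ plane as the ground plane are assumptions for illustration, not anything from the scene described here. Each bucket would then become its own merged or instanced mesh, so the renderer can cull it separately.

```javascript
// Bucket instance positions into a cells x cells grid on the XZ plane.
// positions: array of { x, y, z }; bounds: { minX, maxX, minZ, maxZ }.
function chunkByGrid(positions, bounds, cells = 8) {
  // Fall back to 1 to avoid dividing by zero for degenerate bounds.
  const width = (bounds.maxX - bounds.minX) || 1;
  const depth = (bounds.maxZ - bounds.minZ) || 1;
  const chunks = new Map(); // key "col,row" -> array of positions

  for (const p of positions) {
    // Clamp so points on (or outside) the edges still land in a valid cell.
    const col = Math.min(cells - 1, Math.max(0, Math.floor(((p.x - bounds.minX) / width) * cells)));
    const row = Math.min(cells - 1, Math.max(0, Math.floor(((p.z - bounds.minZ) / depth) * cells)));
    const key = col + ',' + row;
    if (!chunks.has(key)) chunks.set(key, []);
    chunks.get(key).push(p);
  }
  return chunks;
}
```

With an 8x8 grid this gives at most 64 meshes per geometry or material, so it trades a few extra draw calls for culling.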

Results can vary so it requires some case by case testing. If object counts are less than a hundred or so (at least in my app) I do not bother with instancing/merging.

1 Like

Frustum culling isn't something I'm worried about, as there will almost never be any need for it. Think of it like a configurator: the objects are always on screen and we just rotate around them.

Since my app consists of roughly 1500 geometries which will be instanced, as 80% have at least one duplicate (some have as many as 20 duplicates), I guess this is a very particular case and, like you say, it will come down to trial and error.

I will start by attempting to improve the triangle count first (as that seems like the easiest thing to try). If that doesn't make much difference, I will look into the draw calls.

Thanks for the input!

1 Like

If your application is CPU bound, lowering the poly count won't solve your performance problem. I would try to reduce the number of draw calls first. 3000 draw calls is quite a lot and probably the current root cause of your issue.
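
To see how many draw calls a frame actually issues, three.js exposes per-frame counters on `renderer.info.render`. Below is a minimal sketch of reading them; the helper name `readRenderStats` is made up for illustration, and it only assumes an object shaped like a `THREE.WebGLRenderer`.

```javascript
// Read per-frame counters from a THREE.WebGLRenderer.
// Call this right after renderer.render( scene, camera ).
function readRenderStats(renderer) {
  const { calls, triangles } = renderer.info.render;
  return {
    calls,
    triangles,
    // Rough indicator: very few triangles per call hints at draw-call overhead.
    trianglesPerCall: calls > 0 ? triangles / calls : 0,
  };
}

// Usage inside a render loop (sketch):
// renderer.render( scene, camera );
// console.log( readRenderStats( renderer ) );
```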

1 Like

Thanks for the tip! I was under the impression that the number of draw calls could be too much...

However, excuse my ignorance, but how do I know if it's CPU bound? I was under the impression that once loaded, all the calculations were performed in the vertex shader and that that happened on the GPU?

Even though it may depend on many other things, would you say that it may be more performant to use merged BufferGeometries (where I can merge all the different instances with shared materials) instead if that reduces the overall draw calls?

As I said... "Excuse my ignorance". I went too fast with my response.

I just did a quick search (which I should have done before responding) and realized my mistake. The draw call is the command that the CPU sends to the GPU to draw.

I wasn't aware that the CPU would be doing those calls constantly. I see now how the bottleneck could be coming from the CPU. Something new I learned today!

3 Likes

There are plenty of tricks for testing whether it's CPU or GPU bound, but one of the easiest is setting scene.overrideMaterial = new MeshBasicMaterial(). If your FPS improves, it's GPU bound; if it remains the same, it's the CPU.
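
A small sketch of that test follows. `scene.overrideMaterial` and `MeshBasicMaterial` are real three.js names; the helper names and the 10% threshold are assumptions for illustration, and the FPS numbers have to come from your own measurement (a stats panel, for example).

```javascript
// Swap every material in the scene for a cheap unlit one, or restore the originals.
// THREE is passed in so the helper works with any three.js import style.
function setCheapOverride(scene, THREE, enabled) {
  scene.overrideMaterial = enabled ? new THREE.MeshBasicMaterial() : null;
}

// Compare FPS measured with the normal materials against FPS with the override.
// tolerance is an arbitrary cut-off (10% by default), not a three.js value.
function classifyBottleneck(baselineFps, overrideFps, tolerance = 0.1) {
  const gain = (overrideFps - baselineFps) / baselineFps;
  return gain > tolerance ? 'gpu' : 'cpu';
}

// Usage (sketch):
// setCheapOverride( scene, THREE, true );   // measure FPS for a few seconds
// setCheapOverride( scene, THREE, false );  // restore the real materials
// classifyBottleneck( 30, measuredFpsWithOverride );
```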

3 Likes

@looeee
Neat! I will surely put that on my little cheat sheet! Thanks!

1 Like

No worries, I've been making a little cheat sheet of my own over the last while, let me know if you have anything to add:

2 Likes

Nice list! Bookmarked :smiley:

As I progress with this, I will surely let you know if I come across anything worth adding!

1 Like


100000l is my favorite number, maybe 100000l; though? :stuck_out_tongue:

3 Likes

Hey, it's a looong number :stuck_out_tongue:
I fixed that ages ago actually, I just haven't pushed it to live yet.

I'm sure no one here was losing any sleep because of this, but I thought I'd share my findings for reference.

Originally, I was instancing a lot of geometries (roughly 1600) into a varied number of instances... This built a mesh of 4000 geometries (which are all instances of the originals). After the tips I got from you guys, I came to realize that my bottleneck here was the number of draw calls. So, in an attempt to reduce them I did the following:

Instead of using InstancedBufferGeometry, I loaded each instance as a BufferGeometry. All these 'clones' were then merged into one buffer geo (let's call it MergedBufferGeo). Then, I went one step further and found all MergedBufferGeos that were going to share the same material and merged those too. This reduced the draw calls down to 27. A huge difference from 1600 haha.
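
For reference, the baking step behind this kind of merge can be sketched in plain JavaScript. This is not the code used in the post; the function name `bakeAndMerge` is hypothetical, and the matrices are assumed to be 16-element column-major arrays (the same layout as `THREE.Matrix4.elements`). Only positions are handled: normals would need the rotation part applied, and indexed geometry would need its indices offset per copy.

```javascript
// Apply each instance matrix to a copy of the base positions and
// concatenate the results into one Float32Array.
function bakeAndMerge(basePositions, matrices) {
  const merged = new Float32Array(basePositions.length * matrices.length);
  let offset = 0;

  for (const m of matrices) {
    for (let i = 0; i < basePositions.length; i += 3) {
      const x = basePositions[i];
      const y = basePositions[i + 1];
      const z = basePositions[i + 2];
      // Column-major 4x4 times (x, y, z, 1).
      merged[offset++] = m[0] * x + m[4] * y + m[8] * z + m[12];
      merged[offset++] = m[1] * x + m[5] * y + m[9] * z + m[13];
      merged[offset++] = m[2] * x + m[6] * y + m[10] * z + m[14];
    }
  }
  return merged;
}
```

In a real project the merge helpers in three.js's BufferGeometryUtils addon are probably a better starting point than hand-rolled loops; the sketch is only meant to show why the merged buffer grows with every instance.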

The FPS count now is back at the top. Which is amazing.

One thing that worries me, though, is whether this is going to cause a different set of issues. It's performing great (on PC, that is), but I wonder if this approach has a heavier memory footprint?

Anyway, this was a great learning exercise! Thanks all!

1 Like

Hmm... if using instances you should have only one draw call per mesh (unless using multiple materials). And in your situation it sounds like you aren't splitting up the scene into a grid of meshes for frustum culling reasons. There should only be one draw call per geometry in this situation, unless your InstancedBufferGeometry is using the .groups property for multiple materials, for example (each group gets its own draw call).

Yeah, the merging method is definitely far more memory intensive than instancing in my experience. For my grass, for example, merging was using about 600MB, whereas instancing used around 15MB or so.

Instancing takes a bit to set up properly; below is some example JS code that I basically use. After this you need to alter the material shader code as well.
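
A back-of-the-envelope estimate shows where that gap comes from. The sketch below is not meant to reproduce the figures above. It assumes 8 floats per vertex (position, normal, uv) and 10 floats per instance (position, quaternion, scale), and ignores indices and textures; the function name is made up.

```javascript
const BYTES_PER_FLOAT = 4;

// Rough vertex-buffer size for merging versus instancing one geometry.
function estimateBytes(vertexCount, instanceCount, floatsPerVertex = 8, floatsPerInstance = 10) {
  // Merging stores a full copy of the vertex data for every instance.
  const merged = vertexCount * instanceCount * floatsPerVertex * BYTES_PER_FLOAT;
  // Instancing stores the vertex data once plus a small per-instance record.
  const instanced = (vertexCount * floatsPerVertex + instanceCount * floatsPerInstance) * BYTES_PER_FLOAT;
  return { merged, instanced };
}

// Example: a 100-vertex mesh repeated 1000 times.
console.log(estimateBytes(100, 1000));
```

The merged cost scales with vertices times instances, while the instanced cost scales with vertices plus instances, which is why merging hurts most for heavily repeated parts.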

var geo = loadedGeometry; // placeholder: your original loaded BufferGeometry
var positions = []; // placeholder: fill with one THREE.Vector3 per instance
var up = new THREE.Vector3( 0, 1, 0 ); // your "up" vector
var angle = 0; // rotation around "up" in radians; vary per instance if needed

var instancePositions = [];
var instanceQuaternions = [];
var instanceScales = [];

var quaternion = new THREE.Quaternion();

for ( var i = 0, len = positions.length; i < len; i ++ ) {

	// build the rotation for this instance
	quaternion.setFromAxisAngle( up, angle );
	quaternion.normalize();

	// flatten everything into plain arrays, one entry per instance
	instancePositions.push( positions[ i ].x, positions[ i ].y, positions[ i ].z );
	instanceQuaternions.push( quaternion.x, quaternion.y, quaternion.z, quaternion.w );
	instanceScales.push( 1, 1, 1 ); // or whatever scale you want

}

var instancedGeometry = new THREE.InstancedBufferGeometry();

// share the per-vertex data with the original geometry (no copy is made)
instancedGeometry.attributes.position = geo.attributes.position;
instancedGeometry.attributes.uv = geo.attributes.uv;
instancedGeometry.attributes.normal = geo.attributes.normal;
instancedGeometry.index = geo.index; // if needed
instancedGeometry.groups = geo.groups; // if needed

// per-instance data (addAttribute was renamed to setAttribute in later three.js releases)
instancedGeometry.addAttribute( 'instancePosition', new THREE.InstancedBufferAttribute( new Float32Array( instancePositions ), 3 ) );
instancedGeometry.addAttribute( 'instanceQuaternion', new THREE.InstancedBufferAttribute( new Float32Array( instanceQuaternions ), 4 ) );
instancedGeometry.addAttribute( 'instanceScale', new THREE.InstancedBufferAttribute( new Float32Array( instanceScales ), 3 ) );

var mesh = new THREE.Mesh( instancedGeometry, material ); // material: your instancing-aware material

scene.add( mesh );
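
The shader change could look roughly like the sketch below. It assumes a `THREE.ShaderMaterial`, which supplies `position`, `modelViewMatrix` and `projectionMatrix` itself, and it reuses the attribute names from the snippet above. It is not the poster's actual shader, and a lit material would also need its normals rotated. The JavaScript function mirrors the GLSL math on the CPU so the formula can be checked.

```javascript
// Vertex shader source for a THREE.ShaderMaterial (illustrative only).
const instancedVertexShader = `
  attribute vec3 instancePosition;
  attribute vec4 instanceQuaternion;
  attribute vec3 instanceScale;

  // Rotate v by the unit quaternion q.
  vec3 applyQuaternion( vec3 v, vec4 q ) {
    return v + 2.0 * cross( q.xyz, cross( q.xyz, v ) + q.w * v );
  }

  void main() {
    vec3 transformed = applyQuaternion( position * instanceScale, instanceQuaternion ) + instancePosition;
    gl_Position = projectionMatrix * modelViewMatrix * vec4( transformed, 1.0 );
  }
`;

// CPU mirror of the shader: scale, then rotate, then translate.
// v, pos, scale are [x, y, z]; q is [x, y, z, w] and must be normalized.
function applyInstanceTransform(v, pos, q, scale) {
  const x = v[0] * scale[0], y = v[1] * scale[1], z = v[2] * scale[2];
  const [qx, qy, qz, qw] = q;
  // t = cross( q.xyz, v ) + q.w * v
  const tx = qy * z - qz * y + qw * x;
  const ty = qz * x - qx * z + qw * y;
  const tz = qx * y - qy * x + qw * z;
  // result = v + 2 * cross( q.xyz, t ) + pos
  return [
    x + 2 * (qy * tz - qz * ty) + pos[0],
    y + 2 * (qz * tx - qx * tz) + pos[1],
    z + 2 * (qx * ty - qy * tx) + pos[2],
  ];
}
```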

Thanks for the feedback.

I did set up my InstancedBufferGeometries properly (and everything shows up properly). However, my scene is a little complex, so this is becoming a bit of a learning and optimization/performance exercise.

I'm using three.js to visualize a CAD model. This model has a lot of duplicate geo. I have a system that figures out the duplicates and the matrix transformations required to put them in the right place.

The model has been decimated to a certain extent. It can still be decimated further for sure, so it's something I take into consideration too.

To expand on my earlier post. This is an oversimplified example:

One of the geos is a metallic screw bolt, and there are 5 of these. There are another 4 metallic rods and 3 plastic handles (again, an oversimplified example).

In that example, the InstancedBufferGeometry approach would have 1 of each geo, instanced with the shader, and needs 3 draw calls. The merged buffer geo approach has all geos merged by material (so screws and rods are merged too) and reduces the draw calls to 2.

This is how I went from 1500 draw calls to 27 (because I'm only using 27 materials).
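
The arithmetic from the screws/rods/handles example can be written out as a tiny script. The data layout and function names are invented for illustration; the counts are the ones from the example.

```javascript
// The oversimplified model: 5 metal screws, 4 metal rods, 3 plastic handles.
const parts = [
  { geometry: 'screw', material: 'metal', count: 5 },
  { geometry: 'rod', material: 'metal', count: 4 },
  { geometry: 'handle', material: 'plastic', count: 3 },
];

// One mesh per placed part: one draw call each.
function drawCallsNaive(parts) {
  return parts.reduce((total, p) => total + p.count, 0);
}

// Instancing: one draw call per unique geometry/material pair.
function drawCallsInstanced(parts) {
  return new Set(parts.map((p) => p.geometry + '|' + p.material)).size;
}

// Merging by material: one draw call per unique material.
function drawCallsMergedByMaterial(parts) {
  return new Set(parts.map((p) => p.material)).size;
}

console.log(drawCallsNaive(parts), drawCallsInstanced(parts), drawCallsMergedByMaterial(parts));
```

Merging by material gives the lowest count here because it collapses different geometries together, at the cost of storing every copy's vertices.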

Since this is more memory intensive, this is causing another set of problems on lower spec devices (mobile).

So learning while trying to find that sweet spot.

1 Like

Ahh I see. Best of luck =]

Hello looeee. It's great to see your article about three.js, but I am confused about this:

Shadows

  1. If your scene is static, only update the shadow map when something changes, rather than every frame

How do I achieve this? When you set castShadow to false, the shadow map just disappears. Could you give me an example of this? I need your help.

1 Like

Please make a new thread if you want to ask for help on a different topic.