Ridiculous number of boxes

Hey guys, here’s a demo I put together a while back:
http://server1.lazy-kitty.com/tests/instanced-foliage-10mil/

On my hardware it’s clocking about 3ms per frame render, which is roughly 330 fps :running_woman:

As the name implies, the demo generates 10,000,000 cubes (slowly) and populates the scene. The noteworthy part is that you can view all of that at an interactive frame-rate through the magic of culling. How do you cull 10,000,000 objects at an interactive frame-rate? Using a spatial index :slight_smile:

Anyway, it’s just a tech demo. The application of this tech is in my game engine for instanced meshes.

This is an oldie; here’s a topic on GitHub with some further discussion around it:

Related: Dynamic instances

Also check out the work @takahirox has been posting on twitter the last few days:

Looks really cool :slight_smile:

The animation demo runs super-slow though, I wonder why. I use onBeforeCompile to patch transform code into the shaders in my stuff, so it plays well with all materials. Shadows did take a while to get working, though.

There’s another InstancedMesh on npm, maybe add to the mix:

Also I’d like to add that the demo I sent includes intersection detection from a mouse on desktop devices and from a crosshair on mobile devices. One of the reasons I wanted to implement this demo is to show how it is possible to have fast intersection detection (even for instances) using binning algorithms and WebWorkers. AFAIK three.js does not support collision detection for instanced geometries.

Hey @oguzeroglu, I understand what you mean about intersection detection. This demo is essentially the same: the camera is modeled as a frustum, and you have to do an intersection test between that frustum and 10,000,000 objects each frame, so speed is pretty important. I don’t use a worker here, since it’s already fast enough for the demo, but I could see the appeal.
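For anyone following along, the frustum test mentioned above usually boils down to checking a box against six planes. Here is a minimal sketch of that check, not the code from the demo. The `Plane` and `Aabb` shapes, the function name and the cube-shaped stand-in "frustum" are all made up for illustration.

```typescript
// Plane n·p + d = 0 with the normal (nx, ny, nz) pointing into the frustum.
interface Plane { nx: number; ny: number; nz: number; d: number; }

interface Aabb {
  minX: number; minY: number; minZ: number;
  maxX: number; maxY: number; maxZ: number;
}

// Returns false only when the box lies entirely behind at least one plane.
// The test is conservative: a box near a frustum corner can pass even
// though it is actually outside.
function aabbIntersectsFrustum(box: Aabb, planes: readonly Plane[]): boolean {
  for (const p of planes) {
    // Pick the box corner that lies farthest along the plane normal.
    const x = p.nx >= 0 ? box.maxX : box.minX;
    const y = p.ny >= 0 ? box.maxY : box.minY;
    const z = p.nz >= 0 ? box.maxZ : box.minZ;
    if (p.nx * x + p.ny * y + p.nz * z + p.d < 0) return false;
  }
  return true;
}

// Hypothetical stand-in for a frustum: six planes bounding -10..10 on each axis.
const demoPlanes: Plane[] = [
  { nx: 1, ny: 0, nz: 0, d: 10 }, { nx: -1, ny: 0, nz: 0, d: 10 },
  { nx: 0, ny: 1, nz: 0, d: 10 }, { nx: 0, ny: -1, nz: 0, d: 10 },
  { nx: 0, ny: 0, nz: 1, d: 10 }, { nx: 0, ny: 0, nz: -1, d: 10 },
];

// Unit cube whose minimum corner sits at (x, 0, 0).
const cubeAt = (x: number): Aabb =>
  ({ minX: x, minY: 0, minZ: 0, maxX: x + 1, maxY: 1, maxZ: 1 });

console.log(aabbIntersectsFrustum(cubeAt(0), demoPlanes));   // true (inside)
console.log(aabbIntersectsFrustum(cubeAt(9.5), demoPlanes)); // true (straddles a plane)
console.log(aabbIntersectsFrustum(cubeAt(20), demoPlanes));  // false (outside)
```

Doing this for every one of 10,000,000 boxes each frame is the brute-force version; a spatial index is what lets you skip most of those tests.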

Hello Alex.

It is really impressive.

I am working on a similar problem, but instead of standard meshes I am working with instanced meshes.

In order to make my own frustum culling system I have created a web worker that works with the bounding boxes of instances.

Anyway, it would be wonderful to have a non-uglified example to see how you implemented your frustum culling system based on a spatial index.

Best regards

Hello Alex,

I don’t think you would find much value in seeing the code. The demo does make use of instanced meshes. The spatial index is a BVH. I use the same solution for my game, where it’s used for things like trees, flowers and other frequently occurring meshes. That being said, I have provided a bit of code in the GitHub discussion thread:

Here are examples of this solution being used in my game:


My code is not really special in what it does. It’s special in how it does things, i.e. it’s a highly optimized version of a standard solution.
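To give an idea of what the "standard solution" looks like, here is a bare-bones BVH sketch: a median-split build plus a query that skips whole subtrees. This is not the demo's code and it is not optimized at all. Every name in it (`Box`, `BvhNode`, `buildBvh`, `queryBvh`) is made up for illustration, and the query predicate is a simple range check standing in for a real frustum test.

```typescript
type Vec3 = [number, number, number];
interface Box { min: Vec3; max: Vec3; }
// Leaves carry an item index; inner nodes carry two children.
interface BvhNode { box: Box; left?: BvhNode; right?: BvhNode; item?: number; }

function union(a: Box, b: Box): Box {
  return {
    min: [Math.min(a.min[0], b.min[0]), Math.min(a.min[1], b.min[1]), Math.min(a.min[2], b.min[2])],
    max: [Math.max(a.max[0], b.max[0]), Math.max(a.max[1], b.max[1]), Math.max(a.max[2], b.max[2])],
  };
}

// Top-down build: split the items at the median along the longest axis.
function buildBvh(boxes: readonly Box[], ids: number[]): BvhNode {
  if (ids.length === 0) throw new Error("cannot build a BVH from zero items");
  if (ids.length === 1) return { box: boxes[ids[0]], item: ids[0] };
  let bounds = boxes[ids[0]];
  for (const id of ids) bounds = union(bounds, boxes[id]);
  const extents = [0, 1, 2].map((a) => bounds.max[a] - bounds.min[a]);
  const axis = extents.indexOf(Math.max(...extents));
  const center = (id: number) => (boxes[id].min[axis] + boxes[id].max[axis]) / 2;
  const sorted = [...ids].sort((p, q) => center(p) - center(q));
  const mid = sorted.length >> 1;
  return {
    box: bounds,
    left: buildBvh(boxes, sorted.slice(0, mid)),
    right: buildBvh(boxes, sorted.slice(mid)),
  };
}

// Collects items whose boxes pass `overlaps`. When a node's box fails,
// its whole subtree is skipped, which is where the speed-up comes from.
function queryBvh(node: BvhNode, overlaps: (b: Box) => boolean, out: number[]): number[] {
  if (!overlaps(node.box)) return out;
  if (node.item !== undefined) { out.push(node.item); return out; }
  if (node.left) queryBvh(node.left, overlaps, out);
  if (node.right) queryBvh(node.right, overlaps, out);
  return out;
}

// 1000 unit boxes in a row along X; query the slab 10.5 <= x <= 12.5.
const demoBoxes: Box[] = [];
for (let i = 0; i < 1000; i++) demoBoxes.push({ min: [i, 0, 0], max: [i + 1, 1, 1] });
const bvhRoot = buildBvh(demoBoxes, demoBoxes.map((_, i) => i));
const inSlab = (b: Box) => b.max[0] >= 10.5 && b.min[0] <= 12.5;
const hits = queryBvh(bvhRoot, inSlab, []).sort((a, b) => a - b);
console.log(hits); // [10, 11, 12]
```

Swapping `inSlab` for a frustum-vs-box test turns the same traversal into frustum culling.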

best of luck,

Alejandro

Hi Alex.

My scene comes from a really big CATProduct with 100k different assets and about 1M instances in total.

The tree structure of my scene follows the same structure that designers used to make the assembly.

My question is whether you think your solution would work better than mine.

I don’t really understand how a BVH works, nor how to use it in my scene. Does your solution only work with the frustum, or does it also take into account the size of instances, clipping planes and the visibility of scene instances?

Let me explain my approach in more detail, and please tell me if you think I am doing anything wrong.

My idea consists of building the scene from the different assets and then creating the instances, and here comes the first problem: I am talking about loading 100K meshes into memory. In order to load only the visible instances, I compute the bounding boxes of all elements in my tree, both parents and leaves, when loading my assets. For now I only use the leaves, with a brute-force algorithm.

The parameters I take into account are:

  • Instanced buffer attribute visible: passed to the web worker so it knows which instances are visible
  • Camera position and camera matrices: passed to the web worker so it can compute distances between the camera and the instances
  • LOD size of each instance, based on the distance from the camera to the bounding box and on the bounding box of the instance: passed to the web worker so it knows whether the instance is big enough to be visible
  • Clipping planes: passed to the web worker so it knows which instances are not clipped
  • Current frustum of my camera: passed to the web worker so it can compute in real time which instance bounding boxes are in the frustum

All this information is passed to the web worker via typed arrays, so that the web worker can do its computations without interfering with the rendering process.

When I change any of these attributes:

  • Visibility of any instance
  • Clipping planes of the camera
  • Position of my camera

the web worker recomputes which instances are visible:

  • Instances whose instanced attribute visible == 1
  • Instances that are near enough to the camera to be visible (based on the AABB size of the instance)
  • Instances not clipped by the clipping planes of the camera
  • Instances that are in the frustum of my camera

It then sends the result back to the render process, again in a typed array.
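To make the steps above concrete, here is a rough sketch of what such a brute-force pass over typed arrays could look like. It is only an illustration, not my actual worker code: the function name, the buffer layouts (6 floats per AABB, 4 floats per plane) and the size-over-distance threshold are assumptions chosen for the example. The clipping planes and the frustum planes are treated the same way, as one list of planes whose normals point towards the kept side.

```typescript
// Brute-force visibility pass. Assumed layouts (hypothetical):
//   flags:  1 byte per instance, 1 = visible
//   aabbs:  6 floats per instance: minX, minY, minZ, maxX, maxY, maxZ
//   planes: 4 floats per plane: nx, ny, nz, d (normal points to the kept side)
function computeVisible(
  flags: Uint8Array,
  aabbs: Float32Array,
  planes: Float32Array,
  cameraPos: readonly [number, number, number],
  minSizeOverDistance: number,
): Uint32Array {
  const out = new Uint32Array(flags.length);
  let n = 0;
  for (let i = 0; i < flags.length; i++) {
    // 1. Instanced attribute visible == 1
    if (flags[i] !== 1) continue;
    const o = i * 6;
    const minX = aabbs[o], minY = aabbs[o + 1], minZ = aabbs[o + 2];
    const maxX = aabbs[o + 3], maxY = aabbs[o + 4], maxZ = aabbs[o + 5];
    // 2. Big enough for its distance (AABB diagonal vs. distance to the center)
    const dist = Math.hypot(
      (minX + maxX) / 2 - cameraPos[0],
      (minY + maxY) / 2 - cameraPos[1],
      (minZ + maxZ) / 2 - cameraPos[2],
    );
    const size = Math.hypot(maxX - minX, maxY - minY, maxZ - minZ);
    if (size < dist * minSizeOverDistance) continue;
    // 3 and 4. Not rejected by any clipping plane or frustum plane
    let kept = true;
    for (let p = 0; p < planes.length; p += 4) {
      const nx = planes[p], ny = planes[p + 1], nz = planes[p + 2], d = planes[p + 3];
      const x = nx >= 0 ? maxX : minX;
      const y = ny >= 0 ? maxY : minY;
      const z = nz >= 0 ? maxZ : minZ;
      if (nx * x + ny * y + nz * z + d < 0) { kept = false; break; }
    }
    if (kept) out[n++] = i;
  }
  return out.subarray(0, n); // indices of visible instances
}

// Four test instances: visible, hidden by flag, outside the plane x <= 5, too small.
const demoFlags = new Uint8Array([1, 0, 1, 1]);
const demoAabbs = new Float32Array([
  0, 0, 0, 1, 1, 1,
  0, 0, 0, 1, 1, 1,
  8, 0, 0, 9, 1, 1,
  0, 0, 0, 0.01, 0.01, 0.01,
]);
const demoPlaneData = new Float32Array([-1, 0, 0, 5]); // keeps x <= 5
const visibleIds = computeVisible(demoFlags, demoAabbs, demoPlaneData, [0, 0, 10], 0.01);
console.log(Array.from(visibleIds)); // [0]
```

In a worker, a function like this would run inside the message handler, with the resulting index array posted back to the main thread.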

These computations run really fast in the web worker: it passes results back to the main thread about every 40-50ms, fully asynchronously (a parallel process that doesn’t impact the main thread). I am working with a simulated environment of 1M instances.

I think my bottleneck is the process of re-adapting my scene, because I have to overwrite all instance attributes each time something changes. I have to iterate over my scene tree and change values:
> Scene:

           Object3D1
              Mesh
                 BufferGeometry
                     Attributes
                         Instanced attribute visible
                         Instanced attribute matrix (passed as 4 vec3)
                         Instanced attribute normal matrix (passed as 3 vec3)
                         Instanced attribute color
                         Instanced attribute selected
              LineSegments (optional)
                 BufferGeometry
                     Attributes
                         Instanced attribute visible
                         Instanced attribute matrix
                         Instanced attribute normal matrix
                         Instanced attribute color
                         Instanced attribute selected
              Points (optional)
                 BufferGeometry
                     Attributes
                         Instanced attribute visible
                         Instanced attribute matrix
                         Instanced attribute normal matrix
                         Instanced attribute color
                         Instanced attribute selected
           Object3D2
           …
           Object3Dn

I hope this will clarify my problem and also my previous questions.

Best regards

My instancing system does not deal with asset loading at all. I have a relatively small number of assets, so I don’t defer loading them, at least not the meshes, so I don’t think what we’re doing is similar in that respect.

In that example only the camera frustum is used for culling; I’m not sure which clipping planes, other than those of the camera itself, you mean.

I do not use any LOD system in my engine; there is no need for that in my use case.

With respect to performance, what you see is all done on the main thread. Querying all objects and updating the instanced mesh takes about 2.3ms on my hardware. Render takes about 0.7ms.

I’m not sure what you mean when you talk about your bounding boxes. I use AABBs: each one starts from the geometry’s AABB, which is then transformed using the instance’s rotation, scale and translation to get the AABB for that instance. Those AABBs are stored in a Bounding Volume Hierarchy. The hierarchy itself is unoptimized in this demo. I do have a pretty advanced incremental optimizer for it, but it runs somewhat slowly, so I didn’t include it. Running it should improve the performance by ~30% based on my experience, so querying and building might drop to ~1.6ms.
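To illustrate the per-instance AABB step, here is a small sketch of one common way to transform an AABB by an instance matrix without visiting all 8 corners. It is not lifted from my engine. The function name is made up, and the matrix is assumed to be a column-major 4x4 array of 16 numbers (the layout three.js uses for Matrix4.elements).

```typescript
interface Bounds { min: number[]; max: number[]; }

// Transforms a local-space AABB by an affine matrix and returns the
// axis-aligned box that encloses the result.
// Assumption: m is a column-major 4x4, so row r / column c is m[c * 4 + r].
function transformAabb(min: readonly number[], max: readonly number[], m: ArrayLike<number>): Bounds {
  // Start from the translation part of the matrix.
  const outMin = [m[12], m[13], m[14]];
  const outMax = [m[12], m[13], m[14]];
  for (let row = 0; row < 3; row++) {
    for (let col = 0; col < 3; col++) {
      const e = m[col * 4 + row];
      const a = e * min[col];
      const b = e * max[col];
      // Each matrix term pushes the bounds outward by its smaller/larger product.
      outMin[row] += Math.min(a, b);
      outMax[row] += Math.max(a, b);
    }
  }
  return { min: outMin, max: outMax };
}

// Unit cube centered at the origin, rotated 45 degrees around Z,
// scaled uniformly by 2, then translated by (10, 0, 0).
const c45 = Math.cos(Math.PI / 4);
const s45 = Math.sin(Math.PI / 4);
const instanceMatrix = [
  2 * c45, 2 * s45, 0, 0,   // column 0
  -2 * s45, 2 * c45, 0, 0,  // column 1
  0, 0, 2, 0,               // column 2
  10, 0, 0, 1,              // column 3 (translation)
];
const worldBounds = transformAabb([-0.5, -0.5, -0.5], [0.5, 0.5, 0.5], instanceMatrix);
console.log(worldBounds); // X spans about 10 ± 1.414, Y about ± 1.414, Z spans -1..1
```

The resulting box is usually looser than the rotated geometry itself, which is fine for culling because the test only needs to be conservative.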

Then again, this is a technical demo: there is only 1 kind of mesh being used here and the generated scene is purely random, so it’s not necessarily a good representation of how this system would work on a concrete real-life example. It does run well in my game, so you might want to check that out for comparison, but I believe your use case is quite different.

Hope that answers your questions.