Procedural Instanced Forest - High Performance "Real" Trees

Hey everyone!

I built a GPU-friendly procedural forest system for Three.js that renders hundreds of fully 3D trees at real-time framerates. This fills a gap I noticed: there are plenty of examples showing either one beautifully detailed procedural tree, or billboard-based forests for massive scale, but not many showing the happy medium: a good-looking forest with actual tree geometry that still performs well.

Features:

  • Fully procedural L-system-style branching with 5 tree type presets for variety

  • Instanced rendering

  • Vertex-shader LOD culling

  • Distance-based leaf shrinking with bark green-tinting to hide LOD transitions

  • Per-leaf sway animation with distance cutoff

  • Pre-computed per-instance attributes

  • Procedural leaf texture with veins generated at runtime

  • Procedural bark texture with per-tree brightness and hue shifts

  • Surface root spread simulated by flaring out vertices near ground level

  • Configurable: tree count, forest radius, branch levels, leaf density, LOD distances, shadow quality

Performance Approach:

  • Instancing is the foundation — InstancedMesh for both bark cylinders and leaf quads

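  • LOD happens in the vertex shader: distant leaves get gl_Position = vec4(0,0,0,1), which collapses each leaf quad to a single point, and the resulting zero-area triangles are discarded by the GPU before the fragment stage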

  • Sway animation skipped beyond 50 units

  • Bark shifts toward green at distance to compensate for culled leaves, which hides the transition at essentially no extra cost

  • Shadow quality toggle (or disable entirely) for tight budgets

  • Pixel ratio capping on mobile

  • Runs at 60 fps on a mid-range desktop GPU

  • Configurable LOD distances let you trade quality for performance
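To make the distance logic above concrete, here is a rough CPU-side sketch of the per-leaf math. In the demo this kind of logic runs in the vertex shader; the version below is plain JavaScript only so it can be run and tested in isolation. The names leafLod, lodNear, lodFar and barkGreenMix, and the default distances of 60 and 120, are assumptions for illustration and are not taken from the demo. Only the 50-unit sway cutoff comes from the post.

```javascript
// Reference sketch of distance-based leaf LOD (illustrative, not the demo's shader code).
function smoothstep(edge0, edge1, x) {
  const t = Math.min(Math.max((x - edge0) / (edge1 - edge0), 0), 1);
  return t * t * (3 - 2 * t);
}

// lodNear / lodFar are hypothetical names and defaults; swayCutoff = 50 is from the post.
function leafLod(distance, { lodNear = 60, lodFar = 120, swayCutoff = 50 } = {}) {
  // 1 at lodNear and closer, shrinking smoothly to 0 at lodFar and beyond.
  const scale = 1 - smoothstep(lodNear, lodFar, distance);
  return {
    scale,                        // multiply the leaf quad's size by this
    culled: scale <= 0,           // shader equivalent: gl_Position = vec4(0.0, 0.0, 0.0, 1.0)
    sway: distance < swayCutoff,  // skip the sway math for distant leaves
    barkGreenMix: 1 - scale,      // tint bark toward green as the leaves shrink away
  };
}
```

Tying the bark tint to the same factor that shrinks the leaves is one simple way to keep the two effects in step, so the canopy colour fades in on the bark exactly as the leaf geometry fades out.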

The code is heavily commented explaining the “why” behind each optimization.

Live Demo on CodePen →

Update: Custom Per-Tree Culling for Mobile

A friend challenged me to push performance further on mobile devices and low-end GPUs.

The challenge: I wanted to keep the large unified leaf and bark meshes to maintain the lowest possible draw calls (just 2!), but needed a way to skip rendering trees that are outside the camera frustum.

Enter “Custom Per-Tree Culling”

I created a separate demo on CodePen → and left the original clean and untouched, because this is a bit niche.

The approach:

  • Each tree gets a bounding sphere computed at generation time

  • Every frame, I test each tree’s sphere against the camera frustum on the CPU

  • Visible trees have their instances packed to the front of the buffer arrays

  • mesh.count is set to only the visible instance count

  • GPU only processes what’s actually on screen
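The visibility pass described above can be sketched roughly like this. It is a dependency-free illustration, not the demo's code: planes are assumed to be [nx, ny, nz, d] arrays with normals pointing into the frustum, and the tree fields (center, radius, firstInstance, instanceCount) are hypothetical names.

```javascript
// A sphere is outside the frustum if it lies fully behind any one plane.
// Planes are [nx, ny, nz, d]; a point p is inside a plane when dot(n, p) + d >= 0.
function sphereInFrustum(planes, center, radius) {
  for (const [nx, ny, nz, d] of planes) {
    const dist = nx * center[0] + ny * center[1] + nz * center[2] + d;
    if (dist < -radius) return false;
  }
  return true;
}

// trees: [{ center: [x, y, z], radius, firstInstance, instanceCount }] (hypothetical layout).
// Returns the instance ranges of visible trees plus the total visible instance count.
function collectVisible(trees, planes) {
  const ranges = [];
  let count = 0;
  for (const tree of trees) {
    if (!sphereInFrustum(planes, tree.center, tree.radius)) continue;
    ranges.push({ first: tree.firstInstance, count: tree.instanceCount });
    count += tree.instanceCount;
  }
  return { ranges, count }; // count is the value that would be assigned to mesh.count
}
```

In a Three.js scene the same test is available through Frustum.setFromProjectionMatrix and Frustum.intersectsSphere, so the hand-rolled plane math here is only to keep the sketch self-contained.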

The tricky part was keeping all per-instance attributes in sync (matrices, colors, wobble values, sway phases): they all need to be reordered together whenever visibility changes.
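One way to keep the attributes in lockstep is to drive every copy from a single write cursor, so that instance i ends up in the same slot in every buffer. The sketch below assumes each attribute has a full source array, a pre-allocated destination array, and an itemSize (16 for matrices, 3 for colors, 1 for scalar values); the function name and data layout are illustrative, not the demo's.

```javascript
// attributes: [{ src: Float32Array, dst: Float32Array, itemSize }]
// visibleRanges: [{ first, count }] in instance units, e.g. one range per visible tree.
function packAttributes(attributes, visibleRanges) {
  let write = 0; // shared cursor, in instances, used for every attribute
  for (const { first, count } of visibleRanges) {
    for (const { src, dst, itemSize } of attributes) {
      // subarray creates a view, not a copy, so no per-frame array data is allocated
      const chunk = src.subarray(first * itemSize, (first + count) * itemSize);
      dst.set(chunk, write * itemSize);
    }
    write += count;
  }
  return write; // number of instances now packed at the front of each dst
}
```

In Three.js the destination arrays would be the ones backing instanceMatrix and each InstancedBufferAttribute, and each of them needs needsUpdate = true after the copy so the new data is uploaded.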

Key implementation details:

  • Camera movement throttling (only recalculate when camera moves >0.5 units or rotates >0.01 rad)

  • Track previous visibility state to skip rebuilding when nothing changed

  • DynamicDrawUsage on buffers since we’re updating them frequently

  • Pre-allocated typed arrays to avoid GC pressure during reordering
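As a rough illustration of the first two points, the sketch below gates the culling pass on camera movement and then checks whether the visibility flags actually changed. The 0.5 unit and 0.01 rad thresholds are from the list above; the function names are hypothetical, and rotation is approximated by the angle between view directions (which ignores camera roll), a simplification that may differ from what the demo does.

```javascript
// Returns a function that reports whether the camera moved or turned enough to re-cull.
function createCullThrottle(moveThreshold = 0.5, rotThreshold = 0.01) {
  let lastPos = null;
  let lastDir = null;
  // pos: [x, y, z] camera position; dir: unit-length view direction
  return function shouldRecull(pos, dir) {
    if (lastPos !== null) {
      const moved = Math.hypot(pos[0] - lastPos[0], pos[1] - lastPos[1], pos[2] - lastPos[2]);
      const dot = dir[0] * lastDir[0] + dir[1] * lastDir[1] + dir[2] * lastDir[2];
      const angle = Math.acos(Math.min(1, Math.max(-1, dot)));
      if (moved <= moveThreshold && angle <= rotThreshold) return false;
    }
    // Only store the pose when we re-cull, so slow drift still accumulates to a trigger.
    lastPos = pos.slice();
    lastDir = dir.slice();
    return true;
  };
}

// prev / next: pre-allocated Uint8Array flags, one per tree (1 = visible).
// Copies next into prev and returns true if any flag differed.
function visibilityChanged(prev, next) {
  let changed = false;
  for (let i = 0; i < next.length; i++) {
    if (prev[i] !== next[i]) {
      prev[i] = next[i];
      changed = true;
    }
  }
  return changed;
}
```

A frame loop would call shouldRecull first, run the frustum tests only when it returns true, and rebuild the instance buffers only when visibilityChanged also returns true.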

Results: it depends on your use case.

On mid-to-high-end GPUs, custom culling likely isn’t worth it. Modern GPUs chew through the vertex shader LOD culling without breaking a sweat, and you’re just adding CPU overhead. This optimization is primarily for mobile and low-end GPUs that struggle with high instance counts.

It also depends on forest density. Sparse trees spread over a large area? Custom culling wins: you’re only ever seeing a fraction of them. Dense trees packed tightly? You’re probably rendering most of them anyway, so the CPU overhead isn’t paying off.

That said, if you are on constrained hardware:

  • Sweeping overviews / fast camera movement: Stick with native Three.js culling (disable custom). The CPU cost of constantly recalculating visibility during quick turns can hurt, and the vertex shader LOD handles distance culling anyway.

  • First-person / limited FOV / slow movement: Custom per-tree culling shines here. When you’re only seeing 20-30% of the forest at a time, the GPU savings are massive. I went from ~15 fps to a solid 60 fps on my old iPhone in this scenario.

  • Stealth game sneaking through trees: Definitely custom. Camera moves slowly, visibility changes infrequently, and you’re only rendering what’s directly ahead.

  • Fast-paced action with quick 180° turns: Probably native. The sudden visibility changes mean constant buffer rebuilds, and you might be looking at most of the forest anyway.

Toggle “Custom Per-Tree Culling” in the Performance folder to compare for your specific scenario. Note: toggling regenerates the forest since it requires different buffer configurations.

No Christmas lights?

Actually, you might throw some pine trees in there. They can be constructed using very few triangles - just a vertical trunk with slightly cone-shaped horizontal rings of, say, 6-8 branches which increase in size as you near the bottom. If you find a ring-shaped texture, you can use a single texture for each ring. So, maybe 50-100 triangles per tree.

Forgot to add: Excellent job, very few draw calls, etc. It’s hard to find anything that is efficient at drawing large numbers of deciduous trees in three.js.
