I’m working on a screen space reflections pass and one of the things I’m trying to improve at the moment is performance. Right now I’m seeing ~10–15 fps with the Chrome window maximized on a lower-end Surface Laptop 2 with an i5-7200U CPU and integrated graphics. The first thing I want to understand is what’s taking the time so I can focus my efforts, but the performance profiler is giving some confusing results. Here is how the Performance tab looks. Neither the CPU nor the GPU is being fully utilized:
You can see in the summary that there’s a ton of idle time and mostly unused GPU time. It looks like the GPU work isn’t starting until 60+ ms after it’s been queued, and then it only takes ~8–9 ms to run.
When I shrink the window the performance gets better, leading me to believe pixel fill and fragment shading are the issue, which is what I would expect for this effect at the moment. Even some simpler demos run less than optimally on this machine when maximized. I’m a bit thrown by the profiler, though. Any ideas what’s going on? Let me know if I can provide anything else!
I’ve tested your link on my iMac. When using the entire browser window, I get about 11 fps. When shrinking the window, the performance gets gradually better. This is typical for fragment-shader-bound apps.
I get similar results to yours when doing the performance analysis in Chrome and Firefox. I think the idle time comes from the fact that the GPU has such a long processing time per frame, and it seems the actual GPU overhead is not correctly visualized in the dev tools.
I get 60 fps in a maximized window (GTX 1060 here), even when increasing the step count. I use a similar implementation, and I remember getting this issue at some point too, when too much was accidentally being rendered; it later disappeared again, as if there were a memory/cache issue or something. It is costly, especially since you also add more passes; I’m surprised it runs on integrated graphics at all. Maybe you could try breaking it down first to see when it appears?
It definitely is, though I’ve encountered this issue a couple of times before. Not every task is recorded, and the profile doesn’t always show which side is struggling; since he has integrated graphics, it seems different from my case. It would be interesting to see when (and if) the idle disappears when breaking it down, removing passes or commenting out code until it does.
Like I said, I’ve only experienced this alongside a memory issue or when something probably crashed. What also led to issues sometimes were conditional breakpoints around WebGL calls, or console output floods (which is the case here, btw).
I’ve checked your linked sources and it’s the same ^^ I’ll check on it later and let you know. I also use multiple framebuffer attachments with WebGL2 to save passes, rendering roughness to one attachment and discarding non-reflective areas. There is a lot of room for optimization, but a notebook with integrated graphics will be hard; I get about 10 FPS on my 4-year-old phone.
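If it helps, the discard idea looks roughly like this in a WebGL2 fragment shader (just a sketch of the concept, not my actual code; the uniform and output names are made up):

```javascript
// Hypothetical WebGL2 (GLSL ES 3.00) fragment shader, stored as a JS
// string: write roughness to a second color attachment and discard
// fragments too rough to reflect, so later SSR passes skip them.
const ssrMaskFrag = /* glsl */ `#version 300 es
precision highp float;
in vec2 vUv;
uniform sampler2D uRoughnessMap;            // made-up uniform name
layout(location = 0) out vec4 outColor;     // attachment 0: reflection color
layout(location = 1) out vec4 outRoughness; // attachment 1: roughness mask
void main() {
  float r = texture(uRoughnessMap, vUv).r;
  if (r > 0.9) discard; // non-reflective: skip this pixel entirely
  outRoughness = vec4(r, 0.0, 0.0, 1.0);
  outColor = vec4(0.0); // filled in by the ray-march pass
}`;
```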
Did you find any solution to this problem? I’m seeing the exact same thing on an integrated graphics card: lots of idle time between frames. I actually realized it when I tried to measure how long frames took to render:
The odd thing was that although it was obvious that only a few frames per second were being rendered, the time delta was still around 50 ms, which does not make sense. Then, when profiling, I saw the same pattern: short rendering time and lots of idle.
If I run stats.js it shows an FPS that looks about right, around 4, but stats.js only samples the time at the start of every render call.
In principle I’m not so surprised the performance is low, since we are talking about a scene with around 60M triangles, but it would be nice to get a reliable performance measure that also works when not using an animation loop.
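One workaround I can think of: stop the clock only after forcing the GPU to finish, e.g. by reading a pixel back after issuing the draw calls. A minimal sketch (assuming a three.js renderer and its gl context; the helper name is mine, and the clock is injectable so the logic can be tested outside a browser):

```javascript
// Sketch: measure a frame including GPU time by inserting a blocking
// sync point (e.g. gl.readPixels) after issuing the draw calls.
function measureFrame(renderFn, syncFn, now = () => performance.now()) {
  const start = now();
  renderFn(); // issues draw calls; usually returns almost immediately
  syncFn();   // blocks until the GPU has actually finished the work
  return now() - start;
}

// Usage (hypothetical, in the browser):
// const ms = measureFrame(
//   () => renderer.render(scene, camera),
//   () => gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, new Uint8Array(4))
// );
```

The readback stalls the pipeline, so this is a debug-only measurement, not something to leave in the render loop.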
As has been mentioned previously in the thread, I suspect the profiler is just not displaying all of the GPU work that’s stalling the frame.
> I actually realized it when I tried to measure how long frames took to render: The odd thing was that although it was obvious that there was only a few frames per second rendered, the timedelta still was around 50ms which does not make sense.
Rendering on the GPU does not happen synchronously, so measuring how long renderer.render takes will just give you the time needed to issue the draw calls, not the time the GPU takes to render. If you have a lot of complex geometry, it’s very possible that issuing the draw calls takes 50 ms while the GPU takes 250 ms to finish drawing.
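If you want the GPU-side number directly, WebGL2 exposes timer queries through the EXT_disjoint_timer_query_webgl2 extension (support varies by GPU and driver, and it’s often unavailable on integrated chips, so treat this as a sketch with a fallback):

```javascript
// Sketch: wrap a GPU timer query if the extension exists; returns null
// when it doesn't, so callers can fall back to CPU-side timing.
function createGpuTimer(gl) {
  const ext = gl.getExtension('EXT_disjoint_timer_query_webgl2');
  if (!ext) return null;
  const query = gl.createQuery();
  return {
    begin: () => gl.beginQuery(ext.TIME_ELAPSED_EXT, query),
    end: () => gl.endQuery(ext.TIME_ELAPSED_EXT),
    // Poll on a later frame; the raw result is in nanoseconds.
    resultMs: () =>
      gl.getQueryParameter(query, gl.QUERY_RESULT_AVAILABLE)
        ? gl.getQueryParameter(query, gl.QUERY_RESULT) / 1e6
        : null,
  };
}
```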
I never got around to creating a minimal repro case to submit to Chromium, but it would probably be worth doing.
Yes, I kind of suspected that something like this was going on, and that is enough for me. I just need some workaround to measure the actual time to draw the frame, so that I can estimate it when using OrbitControls without an animation loop. It should be possible. I’m just trying to downsample the rendering a bit when it is too heavy.
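For the downsampling, I’m thinking of a simple controller like this: drop the render scale when the measured frame time overshoots the budget, raise it back when there’s headroom (the thresholds and step sizes are guesses to tune per scene):

```javascript
// Sketch of adaptive resolution scaling. frameMs is the measured frame
// time, targetMs the budget (e.g. 33 ms for ~30 fps). Returns the
// render scale for the next frame, clamped to [0.25, 1.0].
function nextRenderScale(scale, frameMs, targetMs = 33) {
  if (frameMs > targetMs * 1.2) return Math.max(0.25, scale - 0.1);  // too slow
  if (frameMs < targetMs * 0.6) return Math.min(1.0, scale + 0.05);  // headroom
  return scale; // close enough to budget: hold steady
}

// Then apply it, e.g. with three.js:
// renderer.setSize(width * scale, height * scale, false);
```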