What happens on the GPU itself isn’t profiled: for expensive GPU-side tasks and shaders you basically just see a huge idle gap with the CPU waiting. The 1–36 ms in the stats refers to the average span recorded, not actual GPU time.
I made a plugin a few days ago to visualize draw call cost. It doesn’t give or estimate timings yet, but it shows a weighting of every mesh being rendered, i.e. how costly each one is compared to the others. However, you might first check your code for obviously expensive operations; especially with particles and massive overdraw you can make GPUs sweat easily.
I’m using textures to compute a particle simulation, so as I increase the texture size the FPS goes down as expected.
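To illustrate why FPS scales with texture size: in a texture-based (GPGPU) particle sim, each texel stores one particle’s state, and every frame a shader updates all texels, ping-ponging between two render targets. Here is a minimal CPU-side sketch of that scheme (the buffer names, the constant-velocity update rule, and the 4×4 size are illustrative, not actual shader code):

```javascript
// CPU-side sketch of the ping-pong scheme a texture-based particle sim uses.
// Each "texel" (RGBA) stores one particle: xyz = position, w unused here.

const texSize = 4;                        // a 4x4 "texture" -> 16 particles
const count = texSize * texSize;

// two buffers play the role of the read/write render targets
let read = new Float32Array(count * 4);
let write = new Float32Array(count * 4);

// init: particle i starts at position (i, 0, 0)
for (let i = 0; i < count; i++) read[i * 4] = i;

// one simulation step: the "fragment shader" runs once per texel,
// reading from `read`, writing to `write`, then the targets swap
function step(dt) {
  for (let i = 0; i < count; i++) {
    const o = i * 4;
    write[o]     = read[o] + 1.0 * dt;    // constant velocity along x
    write[o + 1] = read[o + 1];
    write[o + 2] = read[o + 2];
    write[o + 3] = read[o + 3];
  }
  [read, write] = [write, read];          // ping-pong swap
}

step(0.5);
step(0.5);
console.log(read[0], read[4]); // x of particles 0 and 1 after 1s -> 1 2
```

The point is that per-frame work grows with texSize² (one update per texel), so doubling the texture side quadruples the particle count, which matches the FPS drop described above.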
Can you explain this in more detail? What texture size, and what are you computing? Are you using the GPGPU renderer?