and all of those engines step through the entire world with one function
That is incorrect. Most/all of the good engines identify simulation islands of activity automatically and only simulate “islands” that contain active/moving objects. If you want to pause regions, you can just put the corresponding objects to sleep.
You can have a world of 1 million objects, and only the ones that are near each other and belong to an island containing an actively moving dynamic/static pair will be touched in the simulation pass.
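To make the island idea concrete, here is a rough sketch of how islands can be built from the current contact list with a union-find. It is not how any particular engine does it; `RigidBody`, `activeBodies`, and the contact-pair format are names I made up for illustration, and real engines layer sleep timers and joints on top of this.

```ts
// Hypothetical minimal body state; real engines track far more per body.
interface RigidBody {
  isStatic: boolean;
  awake: boolean;
}

// Union-find lookup with path halving.
function findRoot(parent: number[], i: number): number {
  while (parent[i] !== i) {
    parent[i] = parent[parent[i]];
    i = parent[i];
  }
  return i;
}

// Returns the indices of the bodies that need simulating this step.
function activeBodies(bodies: RigidBody[], contacts: Array<[number, number]>): number[] {
  const parent = bodies.map((_, i) => i);
  for (const [a, b] of contacts) {
    // Static bodies do not merge islands; otherwise the ground would
    // connect everything resting on it into one giant island.
    if (bodies[a].isStatic || bodies[b].isStatic) continue;
    parent[findRoot(parent, a)] = findRoot(parent, b);
  }
  // An island is active if any of its dynamic bodies is awake.
  const activeRoots = new Set<number>();
  bodies.forEach((body, i) => {
    if (!body.isStatic && body.awake) activeRoots.add(findRoot(parent, i));
  });
  const result: number[] = [];
  bodies.forEach((body, i) => {
    if (!body.isStatic && activeRoots.has(findRoot(parent, i))) result.push(i);
  });
  return result;
}

const bodies: RigidBody[] = [
  { isStatic: true, awake: false },  // 0: ground
  { isStatic: false, awake: true },  // 1: moving box
  { isStatic: false, awake: false }, // 2: touching the moving box
  { isStatic: false, awake: false }, // 3: asleep on the ground
  { isStatic: false, awake: false }, // 4: asleep, touching nothing
];
const contacts: Array<[number, number]> = [[0, 1], [1, 2], [0, 3]];
console.log(activeBodies(bodies, contacts)); // logs the indices 1 and 2
```

Bodies 3 and 4 are never touched, no matter how many more of them you add.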
I’m mentioning this because it sounds like you will basically be re-implementing this scheme by hand… which is possible, but if you’re going to the level where you’re simulating full rigid body dynamics, then you’re basically re-implementing a physics engine, but presumably without the benefit of a math PhD. If this is just a learning project, then I guess that’s not the end of the world… but you may learn more by attempting to resolve one of these collisions on paper.
Already, the idea that you’re only resolving a collision after a box shapecast intersection happens is a bit of a red flag… since then it’s likely that you are attempting to resolve an interpenetration condition after the fact, without the corresponding contact manifold data structures that are commonly used to prevent interpenetration during the normal simulation phase. (Unless you are expanding your shapecast collision box by the length of the per-frame movement vector, in which case you do stand a chance of detecting the collision before it happens, and thus you can record it in a list of potential contact points which can be tracked over time.)
(This is what most physics engines do.)
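A minimal sketch of that box expansion, assuming axis-aligned boxes and a per-frame movement vector `delta` (velocity times timestep). `Vec3`, `AABB`, `sweptBounds`, and `overlaps` are hypothetical helpers, not any engine’s API:

```ts
interface Vec3 { x: number; y: number; z: number; }
interface AABB { min: Vec3; max: Vec3; }

// Grow a box so it covers both where it is now and where it will be
// after moving by `delta` this frame.
function sweptBounds(box: AABB, delta: Vec3): AABB {
  return {
    min: {
      x: box.min.x + Math.min(delta.x, 0),
      y: box.min.y + Math.min(delta.y, 0),
      z: box.min.z + Math.min(delta.z, 0),
    },
    max: {
      x: box.max.x + Math.max(delta.x, 0),
      y: box.max.y + Math.max(delta.y, 0),
      z: box.max.z + Math.max(delta.z, 0),
    },
  };
}

function overlaps(a: AABB, b: AABB): boolean {
  return a.min.x <= b.max.x && a.max.x >= b.min.x
      && a.min.y <= b.max.y && a.max.y >= b.min.y
      && a.min.z <= b.max.z && a.max.z >= b.min.z;
}

// A thin floor, and a box that moves 3 units down in a single frame.
const floor: AABB = { min: { x: -5, y: 0, z: -5 }, max: { x: 5, y: 0.125, z: 5 } };
const box: AABB = { min: { x: 0, y: 1, z: 0 }, max: { x: 1, y: 2, z: 1 } };
const delta: Vec3 = { x: 0, y: -3, z: 0 };
const boxAfterMove: AABB = {
  min: { x: 0, y: -2, z: 0 },
  max: { x: 1, y: -1, z: 1 },
};

console.log(overlaps(box, floor));                     // false: not touching yet
console.log(overlaps(boxAfterMove, floor));            // false: already tunneled through
console.log(overlaps(sweptBounds(box, delta), floor)); // true: caught before it happens
```

The expanded box only tells you that a contact is possible this frame. You still need a narrower test to find the actual contact points worth tracking.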
Then there is the fact that what you described sounds easy… for box->box collision, but these boxes also still have to collide with the rest of the world in a meaningful way… like falling onto/orienting towards other geometry, not to mention stacking of objects… So… a box… falls on another box. Who gets out of the way? The box that is moving? Perhaps it moves halfway back and the other box is moved halfway forward… then you’ve created a new interpenetration, as the second box is now penetrating the ground. If that interpenetration puts that box’s center point past the geometry of the ground, then it falls through the ground (which you’ve probably seen in many, many games before). Or perhaps the second box was blocking a door, but the collision response has now moved it past the door, even though it shouldn’t have been able to fit through it. You can solve these problems by keeping a list of contact points per dynamic object, and not allowing them to violate these point constraints during resolution… but it’s nontrivial.
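Here is the “split the difference” problem reduced to one vertical axis, so you can see the numbers. It is a toy sketch of the naive response described above, not a recommended approach; `Extent`, `resolveBySplitting`, and `GROUND_TOP` are made up for the example.

```ts
// Vertical extent of a box. Hypothetical 1D stand-in for a full 3D shape.
interface Extent { min: number; max: number; }

const GROUND_TOP = 0; // assumed ground plane at y = 0

// Naive response: split the overlap evenly between the two boxes.
function resolveBySplitting(upper: Extent, lower: Extent): void {
  const overlap = lower.max - upper.min;
  if (overlap <= 0) return; // not penetrating
  const half = overlap / 2;
  upper.min += half; upper.max += half; // falling box moves halfway back
  lower.min -= half; lower.max -= half; // resting box moves halfway down
}

// How far a box has sunk below the ground plane.
function groundPenetration(box: Extent): number {
  return Math.max(0, GROUND_TOP - box.min);
}

const resting: Extent = { min: 0, max: 1 };     // sitting on the ground
const falling: Extent = { min: 0.5, max: 1.5 }; // sunk 0.5 into the resting box

resolveBySplitting(falling, resting);
console.log(falling.min - resting.max);  // 0: the two boxes no longer overlap
console.log(groundPenetration(resting)); // 0.25: but the lower box is now in the ground
```

Fixing that second penetration pushes the lower box back up into the falling one, and round you go. That is the loop a contact-constraint solver is built to avoid.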
The big 4 engines I mentioned have had entire teams of engineers optimizing these interactions, in some cases for decades.
They are also usually written in languages like C and compiled to wasm, and thus can achieve better performance than hand-rolled JS in most cases.
Using a physics engine also enables other techniques like continuous collision detection for fast-moving objects (along with a whole host of other functionality that you won’t know you need until you need it… like friction, motors, constraints like hinges and springs, raycasting, and broadphase collision checks).
Re: running server side… the main 4 engines I listed are also pretty platform agnostic.
They have wasm and/or C, C++, or Rust variants.
Check out the Jolt physics repo for a fuller overview of the benefits (especially the features section).
In summary:
If you’re trying to build something more complex than billiards or a sliding cubes puzzle… you probably want a real physics engine.