For metaverse mass adoption (think Ready Player One), which technology do you think will have an edge in the coming years?
WebGPU has the advantage of running locally on the device, so you save resources server-side, but you are limited by the device's CPU/RAM/GPU/… (although maybe the next generation of VR headsets will be very powerful).
Pixel streaming, on the other hand, has the advantage of using a top-tier system and GPU in the cloud to render ultra-realistic experiences and stream them to the client, which could be anything from a smartphone to a tablet to a TV; it just needs a good connection (which is likely with the diffusion of 5G). The major disadvantage is the cost: since you have to allocate rendering resources on the server for each client session, it would be basically impossible to offer this service for free, even in the future I guess.
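To make the trade-off concrete, here's a minimal sketch of how a client could prefer local WebGPU rendering and fall back to streaming; `startPixelStream` is a hypothetical placeholder for whatever streaming client (WebRTC, etc.) a service would actually use:

```js
// Hedged sketch: prefer local WebGPU rendering, fall back to streaming.
// `startPixelStream` is a hypothetical placeholder, not a real API.
async function pickRenderPath(canvas) {
  if (!navigator.gpu) return startPixelStream(canvas);   // no WebGPU support
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) return startPixelStream(canvas);         // no usable GPU
  const device = await adapter.requestDevice();
  const context = canvas.getContext('webgpu');
  context.configure({
    device,
    format: navigator.gpu.getPreferredCanvasFormat(),
  });
  return { device, context };                            // render locally
}
```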
I have no clue which one will be the winner in the long run. What do you think?
If the metaverse is about immersion, then a tablet or TV isn't a valid target device.
If the metaverse is a replacement for web content (something like Fortnite, but vendor neutral), then streaming might make sense.
I think AR devices (that can flip to VR when needed) will become the dominant format. The headset will just become the screen. The compute will be an upgradable puck connected to the headset. Think a phone, but without the screen. This way, you can sell two devices that have different hardware options and capabilities:
Lightweight headset for daily casual use - ear-pod version
Heavier headset for full immersion and sound - audio headphone version
Lightweight compute for daily mobile use - video calls, chats, and streaming media consumption
Midweight compute for business and the metaverse, with longer battery life
Heavyweight compute for intense realistic gaming, movies, plays, and art appreciation
Having both the compute and screen in the same place (around the head) will be temporary.
I think telepresence will be a thing. Imagine being present in first person when we set foot on the moon again or Mars.
Thanks for the interesting points! Yeah, I think the core of the discussion is local computing vs. remote computing. I can foresee a battle between the new decentralization trend (web3) and cloud computing, where a massive number of vCPUs, vGPUs, and other resources are available for clients to use remotely from their devices. Although there are interesting projects to make cloud computing decentralized (e.g. https://internetcomputer.org/), I think cloud computing is by nature, and will remain, mostly centralized; only big tech and institutions have the resources to build that kind of infrastructure.
Probably there is and will always be space for both, although it must be noted that an AR/VR headset that just needs to stream content from a remote service, without the need for intensive local computation, can be built relatively easily even now, while for locally-capable photorealistic VR headsets we will need to wait a while.
I personally think that pixel streaming is an economic dead end, as it greatly increases the cost (in CPU/GPU and bandwidth) per user. The major benefit of pixel streaming is that you get higher-quality graphics on lower-spec machines. For that to be meaningful, I think you'd need a few things to be true:
1. Higher-quality graphics would have to be a major driver of metaverse adoption.
I would argue that this is simply not true. The most successful metaverse and metaverse-adjacent products (Roblox, Sandbox, Decentraland, Second Life, Minecraft) generally don't compete on high-fidelity graphics.
2. Devices capable of running 3D worlds would have to be very expensive.
This is also not the case: even a smartphone that is several years old is capable of running interactive 3D content.
3. It would have to be economically viable to build a business around pixel streaming.
This seems to be far from the case at present. Paying for all those GPUs and all that bandwidth is expensive, and will severely undercut the profitability of any such enterprise. This is compounded by the fact that one of the best cases for pixel streaming is the ability to support low-spec devices; however, people using low-spec devices likely also have less money to spend within virtual worlds. And sure, money isn't everything. But if you want to support the requisite number of engineers/artists/etc. needed to build a metaverse, it needs to be economically viable.
Also, pixel streaming has been touted as the next big thing for gaming for over a decade and yet hasn’t really materialized. Granted, that doesn’t mean that it never will, but if it can’t be successful within the world of games, I think it’s unlikely to be successful for virtual worlds.
Let's say you manage to build a successful virtual world, and you suddenly have a million users: what does that do to your server costs?
If you build around WebGPU / local clients, your costs probably don’t increase too badly, and you can afford to keep growing the community and adding users.
If you build around pixel streaming, you have a significant cost for every connected client, which could mean that you might run out of funding or have to throttle new account creation.
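To put rough numbers on it (every constant here is an assumption, just to show how the two models scale):

```js
// Back-of-the-envelope comparison; all figures are assumptions.
const users = 1_000_000;
const hoursPerUserPerMonth = 20;

// Pixel streaming: every connected hour occupies a dedicated GPU slice.
const gpuCostPerHour = 0.5; // assumed $/hour for a cloud GPU slice
const streamingCost = users * hoursPerUserPerMonth * gpuCostPerHour;
// => $10,000,000/month, scaling linearly with connected hours.

// Local rendering: the server only syncs world state (positions, events).
const gbPerUserPerHour = 0.01; // assumed ~10 MB/h of state traffic
const costPerGB = 0.05;        // assumed $/GB of egress
const localCost = users * hoursPerUserPerMonth * gbPerUserPerHour * costPerGB;
// => $10,000/month, three orders of magnitude cheaper.

console.log({ streamingCost, localCost });
```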
I put very little hope in streaming. By the time we improve streaming, we will probably have improved graphics as well; I don't think the two will ever converge.
There is no need for a backend server; each device is a processing unit for the game. The metaverse isn't just about wearing goggles, it can also be a tablet- or mobile-based 3D virtual experience. Check out three.js + cannon: if the game level assets are designed well, it can be a good basis for a metaverse experience. To create a three.js metaverse we have to collaborate on such a project.
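To make that concrete, here's roughly the minimal three.js + cannon-es pairing (cannon-es being the maintained fork of cannon): a physics body drives a rendered mesh, which is the basic building block of that kind of locally rendered 3D world:

```js
import * as THREE from 'three';
import * as CANNON from 'cannon-es';

// Scene, camera, renderer.
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(75, innerWidth / innerHeight, 0.1, 100);
camera.position.set(0, 2, 5);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

// Physics world with gravity.
const world = new CANNON.World({ gravity: new CANNON.Vec3(0, -9.82, 0) });

// A dynamic box that falls under gravity...
const boxBody = new CANNON.Body({
  mass: 1,
  shape: new CANNON.Box(new CANNON.Vec3(0.5, 0.5, 0.5)),
  position: new CANNON.Vec3(0, 4, 0),
});
world.addBody(boxBody);
const boxMesh = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshNormalMaterial(),
);
scene.add(boxMesh);

// ...onto a static ground plane.
const ground = new CANNON.Body({ mass: 0, shape: new CANNON.Plane() });
ground.quaternion.setFromEuler(-Math.PI / 2, 0, 0); // plane faces up
world.addBody(ground);

renderer.setAnimationLoop(() => {
  world.fixedStep();                         // advance the physics sim
  boxMesh.position.copy(boxBody.position);   // sync mesh to physics body
  boxMesh.quaternion.copy(boxBody.quaternion);
  renderer.render(scene, camera);
});
```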
Really, are we this blind?
The technology has been there for decades. Metaverses already exist: MMORPGs and many other virtual universes are already a thing, gathering millions of users.
It's just that people are frustrated because it's not their project, not their vision.
They don't want to use these metaverses; they want to rule them.
Developers can wait and debate for a hundred years about some utopia to happen;
the real wall preventing them from achieving something is their ego.
But you're really asking which tool is better: a chisel or a screwdriver?
Both approaches will co-exist. If you need high fidelity on a low-powered device, pixel streaming is the only option. The costs, as you mention, dictate that this is for applications with a limited number of users, or users willing to pay a lot for the service.
Locally rendered solutions are more appropriate for applications with mass appeal, and mobile GPUs only ever get bigger and faster. I don't see WebGPU having much impact in the next couple of years.
For example, the three.js examples don't run on Chrome Canary on my Pixel 7.
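For anyone wanting to check what their own device exposes before blaming the demos, a quick console snippet; it only reports availability, nothing more:

```js
// Run in the devtools console: is WebGPU exposed, and is there an adapter?
if ('gpu' in navigator) {
  const adapter = await navigator.gpu.requestAdapter();
  console.log(adapter ? [...adapter.features] : 'WebGPU exposed, but no adapter');
} else {
  console.log('WebGPU not exposed at all');
}
```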
This rather absurd conversation got me thinking: is there any research, for gaming or some other multi-client application, into a rendering method where the scene is rendered once and a localized view of it is somehow streamed to all clients?
Render one "master scene" on the server, something like a volumetric render that covers all angles and surfaces. Then you'd stream a personal "cropped" view of it to each connected client based on their camera position, maybe as rasterized image data, but more ideally as some vector data with rendering instructions, for client-side rendering that is as efficient as possible.
A bit far-fetched, in all seriousness. There's not much you can prerender from every angle in conventional graphical scenes that would give you any perf advantage. Maybe just some volumetric smoke simulations, where rendering isn't the bottleneck anyway, so you could just send the whole computed volume for the client to render.
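The closest existing analogue I can think of is 360°/panorama streaming: render the scene once into a cube map at a fixed point, and let each client pull its own view direction out of it. A minimal three.js sketch of the idea, with the obvious caveat that the fixed capture point is exactly what makes it break down for free-moving clients:

```js
import * as THREE from 'three';

const renderer = new THREE.WebGLRenderer();
const scene = new THREE.Scene();
// ... populate the "master scene" ...

// One render per frame covers all angles from a single capture point.
const cubeTarget = new THREE.WebGLCubeRenderTarget(1024);
const cubeCamera = new THREE.CubeCamera(0.1, 1000, cubeTarget);
scene.add(cubeCamera);
cubeCamera.update(renderer, scene); // the server-side "render once" step

// Each client then picks its own "cropped" view out of the shared cube map
// by orienting its camera; valid only at (or very near) the capture point.
const clientScene = new THREE.Scene();
clientScene.background = cubeTarget.texture;
const clientCamera = new THREE.PerspectiveCamera(75, 16 / 9, 0.1, 1000);
clientCamera.lookAt(1, 0, 0); // per-client view direction
renderer.render(clientScene, clientCamera);
```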