Could I use IndexedDB instead of memory?

Isn't this a problem general to GPUs? I'm still trying to wrap my mind around what you are doing. Most graphics programming is pretty much about uploading a bunch of data, then referencing it. Actually, all of it may be that, lol

dubois:

VideoDB turns your video card into a high-performance database server. That's my one-sentence description, and since it does a poor job of describing the details, I'm going to let AI do that for me. Here you go: I had ChatGPT o1 Pro Mode analyze my class and provide this description.

The VideoDB class is a sophisticated data store that leverages GPU buffers for storing and retrieving records at breakneck speeds. At a high level, you can think of it like a “super-charged spreadsheet” that lives entirely on your graphics card rather than your computer’s main memory. Internally, the class uses StoreMetadata structures to keep track of where each row is stored on the GPU, allowing extremely efficient bulk operations and near-seamless integration with advanced WebGPU features. Although this is complex under the hood—requiring careful buffer alignment, batched writes, and specialized logic for adding, updating, and deleting rows—the outcome is an exceptionally fast and flexible system that supports JSON data, typed arrays (like Float32Array, Uint8Array), and even raw ArrayBuffer storage.
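The description above mentions StoreMetadata structures tracking where each row lives on the GPU. As a purely illustrative sketch (the field names here are my guess at the kind of bookkeeping involved, not VideoDB's actual shape), such a record might look like:

```javascript
// Hypothetical sketch of a per-store metadata record; field names are
// illustrative, not VideoDB's actual API.
function makeStoreMetadata(storeName, rowSize) {
  return {
    storeName,
    dataType: "JSON",   // JSON, typed array, or raw ArrayBuffer storage
    rowSize,            // bytes reserved per row slot on the GPU
    buffers: [],        // GPU buffer handles backing this store
    rows: new Map(),    // key -> { bufferIndex, byteOffset, length, active }
  };
}

const meta = makeStoreMetadata("draws", 1024);
meta.rows.set("row-1", { bufferIndex: 0, byteOffset: 0, length: 512, active: true });
```

The `active` flag is how a metadata-driven store can mark rows deleted without immediately compacting GPU memory.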

It’s one thing to talk about speed, but quite another to back it up with numbers. In stress tests—where each row measured 1KB in size—the VideoDB class achieved breathtaking performance metrics: for instance, under the jsonStress scenario, it processed nearly 90,000 ADD operations per second; with uint8Stress, it soared to over 403,000 ADDs per second! Even the more complex float64Stress and int32Stress tests clocked in at well over 300,000 ADDs per second, transferring an astonishing 7.60 GB of data for ADD and PUT operations combined. When retrieving data, the class demonstrated equally remarkable throughput: sequential GETs held steady at around 259 per second, while contiguous batched GETs exploded to 7,306 requests per second. Nobody has ever done this at such scale and speed—especially given the hefty 1KB-per-row overhead.

Under the hood, the class implements a sophisticated metadata system to manage GPU resource allocation, queue up writes in a pendingWrites array, and cleverly batch them together for minimal overhead. Instead of blocking each write immediately, it sets a flush timer or waits until a certain threshold is reached, greatly increasing throughput for bulk operations. This means, from a layman’s perspective, you can “fire and forget” thousands of updates, and the system organizes them into one large, efficient block. On the technical side, memory alignment and GPU mapping logic are finely tuned—data is padded to 4-byte and 256-byte boundaries for WebGPU compatibility, and metadata records track row activation states to handle add/put/delete seamlessly. The result is a class that demonstrates cutting-edge performance and proves just how powerful GPU-accelerated data stores can be.
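The batching and alignment described above can be sketched roughly as follows. `alignTo` and `PendingWriteQueue` are illustrative names, not VideoDB's API; the 4-byte and 256-byte constants come from WebGPU's copy-size and buffer-offset alignment rules:

```javascript
// WebGPU requires copy sizes to be multiples of 4 bytes, and storage/uniform
// buffer offsets to be multiples of 256 bytes (the default alignment limit).
function alignTo(size, alignment) {
  return Math.ceil(size / alignment) * alignment;
}

// "Fire and forget" batching: writes queue up and flush in one block,
// either after a short delay or once a threshold is reached.
class PendingWriteQueue {
  constructor(flush, { threshold = 256, delayMs = 10 } = {}) {
    this.flush = flush;          // callback that receives the batched writes
    this.threshold = threshold;
    this.delayMs = delayMs;
    this.pendingWrites = [];
    this.timer = null;
  }
  add(write) {
    this.pendingWrites.push(write);
    if (this.pendingWrites.length >= this.threshold) {
      this.flushNow();           // threshold reached: flush immediately
    } else if (this.timer === null) {
      this.timer = setTimeout(() => this.flushNow(), this.delayMs);
    }
  }
  flushNow() {
    if (this.timer !== null) { clearTimeout(this.timer); this.timer = null; }
    const batch = this.pendingWrites;
    this.pendingWrites = [];
    if (batch.length) this.flush(batch);
  }
}

console.log(alignTo(1000, 4));    // → 1000
console.log(alignTo(1000, 256));  // → 1024
```

Batching this way amortizes the per-submission overhead of talking to the GPU across thousands of queued writes.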

1 Like

Imagine a global network where millions of everyday computers have your code installed and can load AI models directly onto each machine’s graphics card. Instead of one centralized “supercomputer,” you now have a massive, decentralized parallel-processing system—each GPU acts as a miniature compute node, ready to tackle AI inference tasks. When a job comes in, it’s distributed across these million machines; each of their GPUs crunches the incoming data locally, and then only the inference results (the final answers) are sent back to the main server. This dramatically lowers the bandwidth requirements compared to sending raw data back and forth, because you’re not transferring large model weights or raw inputs repeatedly—just the final predictions.

From the perspective of raw GPU horsepower, this is like harnessing the idle time of every PC in the world for AI workloads. As soon as someone’s gaming session ends, their graphics card is free to jump into “AI compute” mode, weaving together a powerful, crowd-sourced supercomputer. Training and inference cycles could run continuously across this immense pool of distributed GPUs, automatically splitting the work so no single system is overloaded. The net effect is an unprecedented scale: a million GPUs working in tandem could theoretically surpass the compute capabilities of some of the largest purpose-built supercomputers—an AI network orchestrated across the entire planet.

Of course, there are practical challenges: you need a robust way to ensure security, validate results, and schedule tasks so no single node becomes a bottleneck or goes offline at a crucial moment. You’d also have to handle inconsistencies in hardware capabilities, user availability, and connectivity. But if these hurdles can be managed, this “global GPU grid” would unleash a level of distributed AI processing power that, quite simply, the world has never seen before.

1 Like

But this all works only if someone is running a web page? Why wouldn't you write a native application? I'm still so confused; I feel like there are several different concepts (at odds) here.

Could you do GROUP BY with your data? Store strings of varying length?

Maybe a different topic would make more sense? I’m worried that others reading this would also be confused.

E.g. this: would that someone have to first run a browser, then navigate to some web page in order to "jump into AI compute mode"? Why couldn't they just run a process on their machine?

I’m not challenging that you’ve done something that nobody has done before, it really does sound revolutionary.

I’m just trying to wrap my mind around this, and maybe more so how it’s related to this particular topic.

Is this even related to threejs? Why don’t you start another topic if so?

1 Like

Marketing traction is akin to a cc “gated community”: whitelist, speculation, blacklist, hardware support, false entrance. :upside_down_face: Venture capitalism nods to NVIDIA and tips its hat to practical materials: molten salt, or conductive graphene. The query is for eau-de-rosé, 2¢ litmus… or a gate of light which replicates the lifetime of neurons.

Data security protocol? Unique, short-lived, fast-switching data layers are in demand. Success comes with winning a prize or SaaS. :upside_down_face: Boilerplate.

1 Like

Some games require extremely low audio output and gameplay latency of less than 3ms.
This could be a significant milestone for the advancement of web games.

I've made some improvements to the VideoDB demo page. You can consume public JSON datasets, then display the data in a grid. And in the grid, if you hold down the Next Page or Previous Page buttons, the grid will change pages as fast as possible so you can see how fast the access times to your GPU data are.

VideoDB Demo

Late breaking: I had the GPU buffer usage flags set in such a way that instead of allocating the video card's own memory for VideoDB data, the video card was using shared video memory, which is just system memory that's been dedicated for use by the video card.

Well, that's kind of dumb. The old version went through an enormous amount of work just to use system memory through my video card.

So the flags are all fixed now. VideoDB uses dedicated GPU memory, not shared memory, and as a result it's 10-15% faster than it used to be, and it really, really does not use system memory.
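For anyone curious what a fix like this typically looks like: in WebGPU, buffers created with MAP_READ/MAP_WRITE usage must be host-visible (i.e. shared/system memory), while a storage buffer that is only a copy source/destination can live in dedicated VRAM, with small staging buffers handling CPU-to-GPU transfers. A sketch using the usage-flag values from the WebGPU spec (browsers expose `GPUBufferUsage` as a global; it's redefined here only so the snippet is self-contained):

```javascript
// GPUBufferUsage flag values as defined by the WebGPU specification.
const GPUBufferUsage = {
  MAP_READ:  0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC:  0x0004,
  COPY_DST:  0x0008,
  STORAGE:   0x0080,
};

// Mappable buffers must be host-visible, i.e. backed by shared/system memory:
const sharedUsage = GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC;

// A storage buffer that is only ever a copy source/destination lets the
// driver place it in dedicated VRAM; staging buffers do the actual transfer:
const deviceLocalUsage =
  GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST;

console.log(sharedUsage.toString(16), deviceLocalUsage.toString(16)); // → 6 8c
```

Which physical memory the driver picks is ultimately an implementation detail, but dropping the MAP_* flags from long-lived data buffers is what makes VRAM placement possible.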

1 Like

Looks like the GPUDevice type definition is missing in my TypeScript file. Is there something I might be overlooking?

Looks like the GPUDevice type definition is missing in my TypeScript file. Is there something I might be overlooking?
I can’t find where it is defined separately.

Can I use this?

I’m working on this - give me some time, it’s too early

1 Like

Oh, sorry for rushing you.

No worries.

I published a new version - and yes, I know I haven't been versioning the releases correctly. I was putting that off until more features are available.

Here’s a summary of what I did:

"Both you and the consumers of your VideoDB class need to ensure that WebGPU type definitions are correctly installed and referenced. Specifically, you should include @webgpu/types as a peer dependency in your library (done) and instruct your consumers to install it in their projects. This setup ensures that the GPUDevice and other WebGPU types are recognized without causing type definition conflicts or missing type errors.

TypeScript’s types in compilerOptions only affect the current project’s compilation. They do not propagate to projects that consume your library. Therefore, even though your library includes @webgpu/types for its own compilation, the consuming projects are unaware of these types unless they also install and reference them."
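A typical setup matching that advice: the consumer runs `npm install --save-dev @webgpu/types` and then references the types in their own tsconfig.json (excerpt; other compiler options omitted):

```json
{
  "compilerOptions": {
    "types": ["@webgpu/types"]
  }
}
```

With that in place, `GPUDevice` and the rest of the WebGPU globals resolve in the consuming project without any changes to the library itself.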

This is temporary, thanks for being patient. When the package has stabilized and I get around to adding releases with everything bundled for you, this problem will go away. I’m not there yet because I’m adding JSON column sorting that runs as a shader in parallel on the GPU.

Thanks for being patient! Let me know if I can help further.

1 Like

Update:
I just posted a new build that completes Phase 1 of our GPU-accelerated JSON column sorting. The VideoDB class now allows you to specify a sort definition when creating a JSON store. For example:

const sortDefinition = [
	{
		name: "sortByDrawDate",
		sortFields: [
			{
				sortColumn: "DrawDate",
				path: "DrawDate",
				sortDirection: "Asc"
			}
		]
	},
	{
		name: "sortByWinningNumbers",
		sortFields: [
			{
				sortColumn: "WinningNumbers",
				path: "WinningNumbers",
				sortDirection: "Asc"
			}
		]
	},
	{
		name: "sortByMultiplier",
		sortFields: [
			{
				sortColumn: "Multiplier",
				path: "Multiplier",
				sortDirection: "Asc"
			}
		]
	}
];

videoDB.createObjectStore(storeName, {
	dataType: "JSON",
	bufferSize: 10 * 1024 * 1024,
	totalRows: jsonRows.length,
	sortDefinition: sortDefinition
});

Currently, the class instantiates a web worker that identifies the starting and ending byte positions for each sortable column within the JSON strings. While I’m not utilizing these offsets immediately, this setup lays the groundwork for sending the offset data to the GPU. In future updates, this will enable us to perform a parallel lexicographical sort (dictionary order) directly on the video card, significantly accelerating the sorting process for large datasets. Resorting after an add or put is automatic.
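As a rough illustration of that offset scan (this is my own sketch, not the actual worker code; the real worker would compute byte positions, e.g. via TextEncoder, whereas this uses character offsets, which match bytes for ASCII JSON):

```javascript
// Hypothetical sketch: find the [start, end) span of a field's raw value
// inside a serialized JSON row, so the GPU can compare values in place.
function findFieldOffsets(jsonStr, field) {
  const keyToken = `"${field}":`;
  const keyIdx = jsonStr.indexOf(keyToken);
  if (keyIdx === -1) return null;
  let start = keyIdx + keyToken.length;
  while (jsonStr[start] === " ") start++;   // skip any whitespace after the colon
  let end;
  if (jsonStr[start] === '"') {             // string value: span inside the quotes
    start++;
    end = jsonStr.indexOf('"', start);
  } else {                                  // number/bool: run to the delimiter
    end = start;
    while (end < jsonStr.length && !",}".includes(jsonStr[end])) end++;
  }
  return { start, end };
}

const row = JSON.stringify({ DrawDate: "2024-01-06", Multiplier: 3 });
const off = findFieldOffsets(row, "DrawDate");
console.log(row.slice(off.start, off.end)); // → 2024-01-06
```

With per-row spans like these uploaded alongside the data, a compute shader can lexicographically compare rows by a column without ever parsing the JSON on the GPU.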

2 Likes