Unambiguous model definition when loading and saving a model

Hello!

I would first like to thank everyone who has ever helped me. I am building my own learning project, and at this stage I need help from people who have experience with this. I hope to get an assessment of what I should do, or at least which direction to take. Thank you in advance to everyone reading this. :heart:

I’m posting here because, while working with Three.js, I’ve run into issues with loading, analyzing and storing 3D models. My main goal is to understand which approach to model identification and storage is best in my case, and how to implement it correctly.

The gist of the problem
Users can upload GLB/GLTF files (or folders of files) to a canvas where the models are rendered via GLTFLoader.

I want to implement a system where, when a user loads a model onto the canvas, they are offered the option to save it. If they accept, the model can later be opened on the canvas from storage, without the original files, when the user comes back.

For this, I think the system needs to do the following (correct me if I’m wrong):

Unambiguously determine whether a given model has been loaded before
Avoid storing identical models more than once
Allow each user to modify the model on the client (nothing new), while keeping the original unchanged

Let me start by saying that I have looked through some similar discussions, such as this discussion.

So I have some questions, partly about three.js:
1) Unique model identification
When a model (either a single .glb file or a set of files) is loaded via GLTFLoader, a scene object is obtained. Initially I thought of using its UUID, but this identifier changes on every page reload, so it is not suitable for static identification.


I am considering writing a generated identifier to the model’s userData field before loading it on the client. However, a problem arises: if the same file is uploaded by different users, the identifiers written to userData may differ.


And here’s a possible discussion on how to identify models link

Question: when processing the model in GLTFLoader (or before that), how can I obtain or assign a static identifier that uniquely identifies the model regardless of the user, so that a database record (for example, in the table where a user stores a link to the model) can be linked to an already stored model?

2) Storing changes
Which is better:

Modify the original GLB file and save a new copy whenever the user makes modifications?
Store the original model unchanged in cloud storage (e.g. S3) and store all changes (position, color, materials, userData, etc.) in the database?

3) Handling file folders
What is the best way to handle the situation when a user uploads not a single GLB/GLTF file, but a folder with multiple files? Options:

Archive the contents of the folder to ZIP and store as a single object
Process each file separately and build relationships between them in a database
Maybe another approach?
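
For the "process each file separately" option, one possible shape is a small manifest per upload. This is only a sketch: `FileEntry` and `canonicalManifest` are hypothetical names, and the per-file `hash` is assumed to be computed elsewhere (for example as a hex SHA-256 of the file bytes).

```typescript
// Hypothetical manifest entry for one file of a multi-file upload.
interface FileEntry {
  path: string; // path relative to the uploaded folder
  hash: string; // content hash of this file, computed elsewhere
  size: number; // size in bytes
}

/** Canonical, order-independent text form of a set of files. */
function canonicalManifest(entries: FileEntry[]): string {
  return entries
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .map((e) => `${e.path}\t${e.size}\t${e.hash}`)
    .join("\n");
}
```

Hashing the returned string would give one identifier for the whole folder, while the per-file hashes still allow individual textures to be de-duplicated.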

4) Proposed architecture
I propose the following structure:

Models {
  modelID: I don’t know what this is yet,
  storageURL: "link to the file in S3/other storage",
  originalFilename: "original filename"
}

UserModels {
  userID: "userID",
  modelID: "model ID from the Models table",
  modifications: { /* user modifications */ }
}
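
The two tables could be typed roughly as follows. This is a sketch under one assumption that the post itself leaves open: `modelID` is taken to be a content hash of the stored file. `ModelRecord`, `UserModelRecord` and `linkUserToModel` are hypothetical names, and the array stands in for a real database table.

```typescript
// Assumption: modelID is a content hash (e.g. hex SHA-256 of the stored file).
interface ModelRecord {
  modelID: string;
  storageURL: string;
  originalFilename: string;
}

interface UserModelRecord {
  userID: string;
  modelID: string; // references ModelRecord.modelID
  modifications: Record<string, unknown>;
}

/** Link a user to a model, creating the link only once per (user, model) pair. */
function linkUserToModel(
  links: UserModelRecord[],
  userID: string,
  modelID: string
): UserModelRecord {
  const existing = links.find((l) => l.userID === userID && l.modelID === modelID);
  if (existing) return existing;
  const link: UserModelRecord = { userID, modelID, modifications: {} };
  links.push(link);
  return link;
}
```

With this layout, two users who upload the same file share one `ModelRecord` and each get their own `UserModelRecord`.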

Maybe I’m wrong.

Thus, my task boils down to the following: when a 3D model is loaded (be it a single file or a set of files), it should be displayed on the canvas, and it should be possible to store it efficiently and restore it later, with unambiguous identification to avoid duplicating data if the model is already in the system.

I would be grateful for your advice and recommendations on how to organize the process of model identification and storage using Three.js.

The only way to de-dupe that I can think of is to hash the files involved. There are no other stable unique identifiers.

If you use binary gltf (.glb) this may be easier because textures can be embedded in the single .glb file… thus you’d only have to de-dupe at a single file level.
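
A minimal sketch of hashing a file's bytes, assuming the Web Crypto API (`crypto.subtle`) is available, as it is in current browsers. `sha256Hex` is a hypothetical helper name, not part of three.js.

```typescript
// Hypothetical helper: hex-encoded SHA-256 of a file's bytes via Web Crypto.
async function sha256Hex(data: ArrayBuffer): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", data);
  // Convert each byte of the digest to two hex characters.
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, "0")
  ).join("");
}
```

In a drop handler this might be called as `sha256Hex(await file.arrayBuffer())`. The same bytes give the same string for every user, which is what a stable ID needs.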

Re: opening multiple files: you just have to make your loader aware of multiple files and prioritize finding the gltf/glb first. You may then also have to use the LoadingManager.resolveURL construct (configured via setURLModifier) to redirect texture load requests from your glTF loader to whatever texture blobs are in your user upload queue.
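
The two steps can be sketched as plain string helpers, independent of three.js. `findRootFile` and `normalizeAssetPath` are hypothetical names; the second one is meant to be called from inside a URL modifier callback, and it assumes the upload map is keyed by folder-relative paths.

```typescript
/** Pick the root model file, looking for .glb/.gltf before anything else. */
function findRootFile(paths: string[]): string | undefined {
  return paths.find((p) => /\.(glb|gltf)$/i.test(p));
}

/**
 * Turn a URL requested by the loader into a key of the upload map.
 * `baseUrl` is the directory part of the root file's URL and `rootPath`
 * is the folder prefix used by the keys of the upload map.
 */
function normalizeAssetPath(
  requested: string,
  baseUrl: string,
  rootPath: string
): string {
  const relative = decodeURI(requested)
    .replace(baseUrl, "")
    .replace(/^(\.?\/)/, ""); // drop a leading "./" or "/"
  return rootPath + relative;
}
```

If the normalized key exists in the upload map, the modifier would return a blob URL for that file, otherwise the requested URL unchanged.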

Whether you decide to store uploads as zip files is a design choice. If you are using compressed/optimized GLBs, the compression ratios may not be worth the hassle.
For unoptimized assets or text-format glTF it might be a big win, but you may lose some de-duplication granularity and end up storing duplicate textures, in exchange for the model and its files remaining a portable unit.

Re: storing the original asset and then your modifications as diffs: it is a nice idea, but it introduces a whole lot of complexity. The value it gets you is a modification history, and perhaps the ability to re-apply edits to an externally modified asset, though the logistics of that also sound really complex, for example when a change applies to a model part that no longer exists. It seems somewhat dubious and error prone.
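
To make the "part that no longer exists" problem concrete, here is a small sketch of applying stored overrides to a node tree. Everything here is hypothetical: `NodeLike` is a minimal stand-in intended to match the `name`, `position` and `children` fields of a three.js Object3D, and overrides are assumed to be keyed by node name, which glTF does not guarantee to be unique.

```typescript
// Minimal stand-in for a scene node (name, position, children).
interface NodeLike {
  name: string;
  position: { x: number; y: number; z: number };
  children: NodeLike[];
}

interface NodeOverride {
  position?: { x: number; y: number; z: number };
}

/** Apply overrides keyed by node name; return the names that matched no node. */
function applyOverrides(
  root: NodeLike,
  overrides: Record<string, NodeOverride>
): string[] {
  const unmatched = new Set(Object.keys(overrides));
  const visit = (node: NodeLike): void => {
    if (Object.prototype.hasOwnProperty.call(overrides, node.name)) {
      unmatched.delete(node.name);
      const o = overrides[node.name];
      if (o.position) {
        node.position.x = o.position.x;
        node.position.y = o.position.y;
        node.position.z = o.position.z;
      }
    }
    node.children.forEach(visit);
  };
  visit(root);
  return Array.from(unmatched);
}
```

The returned list is the error-prone part: the application has to decide what to do with edits whose target is gone.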

I would get the basics working…

Allow loading of .glb (binary gltf) or multi part .gltf + textures…

Export them via GLTFExporter as binary glTF ({ binary: true } in the exporter options)…
Get the hash of the glb. (or optionally a zip of it, plus whatever metadata you have)
Store it if there isn’t a match. (deal with collisions?)
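
The last two steps could look roughly like this. It is a sketch only: `ModelStore` is a hypothetical in-memory stand-in for the database plus object storage, and the `upload` callback is assumed to return the storage URL of the newly stored file.

```typescript
interface StoredModel {
  hash: string;
  byteLength: number;
  storageURL: string;
}

class ModelStore {
  private byHash = new Map<string, StoredModel>();

  /** Return the existing record when the hash is known, otherwise store a new one. */
  saveIfNew(
    hash: string,
    byteLength: number,
    upload: () => string
  ): { record: StoredModel; created: boolean } {
    const existing = this.byHash.get(hash);
    if (existing) {
      // Cheap sanity check against a (very unlikely) hash collision.
      if (existing.byteLength !== byteLength) {
        throw new Error(`hash collision suspected for ${hash}`);
      }
      return { record: existing, created: false };
    }
    const record: StoredModel = { hash, byteLength, storageURL: upload() };
    this.byHash.set(hash, record);
    return { record, created: true };
  }
}
```

The upload only runs when the hash is new, so identical files are stored once no matter how many users drop them.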

That by itself gets you close to Sketchfab’s level of storage functionality, I think.


Appreciate your reply.

You know, I have so much to ask because I’m getting more and more confused about this topic. I hope you will understand my situation correctly and we’ll figure it out.

An important point: I’m not aiming for a perfect or very complex solution.
My goal is for user A to be able to resume viewing a model they viewed earlier. For example, they first simply dragged and dropped the files; later they no longer have those files, but their profile has a history (a regular card with a resume button) from which they can open the model. As I see it, apart from the hash you described, there is no obvious way to uniquely identify the model so that, when the user drags files onto the canvas and wants to save them, I can determine whether the same model is already in their profile.

I’d like to clarify a few things.
1)

I’ve looked into a lot of things. If you are familiar with this topic, can you tell me if this is what I am looking for?
GitHub Talk

As I understand it, these might work in my situation as a way to implement an ID (I mean at runtime, when I drag and drop files onto the client, possibly with some server-side logic to create the ID):
1) GitHub - KhronosGroup/glTF-External-Reference: glTF Experience Format (glXF)
2) glTF/extensions/2.0/Khronos/KHR_xmp_json_ld/README.md at d8b075e918e6ba52a87a85232ee118fa7b87a23c · KhronosGroup/glTF · GitHub

Or is your solution the most correct one?

2)

Regarding this question, I have been helped by @donmccurdy. I borrowed and slightly modified the original idea of mapping blobs to normalized paths for GLTFLoader processing, and also for custom materials, which is not much different.

It looks like this

import { useEffect, useState, useMemo, useCallback, useRef } from 'react'
import { LoaderUtils, Cache, LoadingManager, REVISION } from 'three'
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js'
import {
	FBXLoader,
	DRACOLoader,
	KTX2Loader,
} from 'three/examples/jsm/Addons.js'
import { useLoader, useThree } from '@react-three/fiber'
import { useMyContext } from '../../MyContext'
import { WebIO } from '@gltf-transform/core'
import { ALL_EXTENSIONS } from '@gltf-transform/extensions'
import { metalRough } from '@gltf-transform/functions'
import { MeshoptDecoder, MeshoptEncoder } from 'meshoptimizer'
// Creates a modal window for models with a KTX2 texture error (just UI with 2 buttons: OK → true, Cancel → false)
import createDialog from './ModalDialog'

interface LoaderProps {
	url: string
	rootPath: string
	assetMap: Map<string, File>
	fileType: string
	rootFile: File | string
}

const useModelLoader = ({
	url,
	rootPath,
	assetMap,
	fileType,
	rootFile,
}: LoaderProps): any => {
	const { gl, scene } = useThree()
	// Function for traverse materials(like in viewer.js file (DonMcCurdy))
	const traverseMaterials = useCallback(
		(object: any, callback: (mat: any) => void) => {
			object.traverse((node: any) => {
				if (!node.geometry) return
				const materials = Array.isArray(node.material)
					? node.material
					: [node.material]
				materials.forEach(callback)
			})
		},
		[]
	)

	const cleanup = useCallback(
		(model: any) => {
			if (!model) return

			scene.remove(model)

			model.traverse((node: any) => {
				if (node.geometry) {
					node.geometry.dispose()
				}
			})

			if (model.animations) {
				model.animations.forEach((animation: any) => {
					if (animation.clip) {
						animation.clip.dispose()
					}
				})
			}

			traverseMaterials(model, (material: any) => {
				if (material.dispose) {
					material.dispose()
				}
				for (const key in material) {
					if (
						key !== 'envMap' &&
						material[key] &&
						material[key].isTexture &&
						material[key].dispose
					) {
						material[key].dispose()
					}
				}
			})
		},
		[scene, traverseMaterials]
	)

	const { setIsViewerVisible } = useMyContext() as any
	const [content, setContent] = useState<any>({
		scene: null,
		clips: null,
	})
	const [errorOccurred, setErrorOccurred] = useState(false)

	const MANAGER = useMemo(() => new LoadingManager(), [])
	const THREE_PATH = `https://unpkg.com/three@0.${REVISION}.x`

	const DRACO_LOADER = useMemo(
		() =>
			new DRACOLoader(MANAGER).setDecoderPath(
				`${THREE_PATH}/examples/jsm/libs/draco/gltf/`
			),
		[MANAGER]
	)

	const KTX2_LOADER = useMemo(
		() =>
			new KTX2Loader(MANAGER).setTranscoderPath(
				`${THREE_PATH}/examples/jsm/libs/basis/`
			),
		[MANAGER]
	)

	// Function for load model with KTX2 texture
	const loadModel = useCallback(
		async (url: string, loader: any, blobURLs: string[]) => {
			try {
				const io = new WebIO({ credentials: 'include' })
					.registerExtensions(ALL_EXTENSIONS)
					.registerDependencies({
						'meshopt.decoder': MeshoptDecoder,
						'meshopt.encoder': MeshoptEncoder,
					})

				console.log('Initializing WebIO:', io)

				const gltf_trans = await io.read(url)

				// Show the dialog and wait for the user's decision (to load custom KTX2 textures or not)
				const continueLoading = await createDialog()

				if (!continueLoading) {
					setIsViewerVisible(false)
					blobURLs.forEach((url: any) => URL.revokeObjectURL(url))
					return null // Return null to indicate that loading was canceled
				}

				// like in example=============================
				await gltf_trans.transform(metalRough())

				const glb: any = await io.writeBinary(gltf_trans)

				if (loader) {
					return new Promise((resolve, reject) => {
						loader.parse(
							glb.buffer,
							'',
							(gltf: any) => {
								resolve({
									scene: gltf.scene,
									clips: gltf.animations || [],
								})
							},
							(error: any) => {
								reject(error)
							}
						)
					})
				}
				//==============================================
			} catch (error) {
				console.log('Error loading model:', error)
				setErrorOccurred(true)
			}
		},
		[setIsViewerVisible]
	)

	// Function for load model with fallback loader(like in viewer.js file (DonMcCurdy))
	const loadWithFallbackLoader = useCallback(
		async (url: string, otherLoader: any, fileType: string) => {
			return new Promise((resolve, reject) => {
				otherLoader.load(
					url,
					(object: any) => {
						console.log('Loaded model with fallback loader:', object)
						let loadedScene = null
						let clips = null

						if (fileType === 'gltf/glb') {
							loadedScene = object.scene || object.scenes[0]
							clips = object.animations || []
						} else if (fileType === 'fbx') {
							loadedScene = object
							clips = object.animations || []
						}

						if (!loadedScene) {
							reject(new Error('No scene found in the model'))
							return
						}

						resolve({ scene: loadedScene, clips })
					},
					undefined,
					(error: any) => {
						console.error('Error loading model with fallback:', error)
						reject(error)
					}
				)
			})
		},
		[]
	)
	const isLoadingStarted = useRef(false)
	useEffect(() => {
		if (!url) return

		const blobURLs: string[] = []

		MANAGER.setURLModifier((someUrl: string) => {
			const baseURL = LoaderUtils.extractUrlBase(url)
			const normalizedURL =
				rootPath +
				decodeURI(someUrl)
					.replace(baseURL, '')
					.replace(/^(\.?\/)/, '')

			if (assetMap.has(normalizedURL)) {
				const blob = assetMap.get(normalizedURL)
				const blobURL = URL.createObjectURL(blob!)
				blobURLs.push(blobURL)
				return blobURL
			}
			return someUrl
		})

		let loader: GLTFLoader | FBXLoader | null = null
		let otherLoader: GLTFLoader | FBXLoader | null = null

		if (fileType === 'gltf/glb') {
			loader = new GLTFLoader(MANAGER).setMeshoptDecoder(MeshoptDecoder)
			otherLoader = new GLTFLoader(MANAGER)
				.setCrossOrigin('anonymous')
				.setDRACOLoader(DRACO_LOADER)
				.setKTX2Loader(KTX2_LOADER.detectSupport(gl))
				.setMeshoptDecoder(MeshoptDecoder)

			MANAGER.onLoad = () => {
				console.log('All textures loaded successfully')
			}

			MANAGER.onError = url => {
				console.error('Error loading texture:', url)
			}
		} else if (fileType === 'fbx') {
			otherLoader = new FBXLoader(MANAGER).setCrossOrigin('anonymous')
		}

		// Scene loaded by this effect run; `content` from state would be stale inside the cleanup closure
		let loadedScene: any = null

		const loadModelAsync = async () => {
			try {
				let result: any = null
				try {
					// First try to load the model with KTX2 textures
					result = await loadModel(url, loader, blobURLs)
				} catch (error) {
					console.error('Primary loader failed, trying fallback:', error)
				}
				if (!result) {
					// Reached when the primary loader threw or returned nothing
					result = await loadWithFallbackLoader(url, otherLoader, fileType)
				}
				loadedScene = result.scene
				setContent(result)
			} catch (fallbackError) {
				console.error('All loaders failed:', fallbackError)
			} finally {
				// Clean up resources
				blobURLs.forEach(url => URL.revokeObjectURL(url))
			}
		}

		loadModelAsync().finally(() => {
			isLoadingStarted.current = false
		})

		return () => {
			// Clean up resources again in case the effect is torn down mid-load
			blobURLs.forEach(url => URL.revokeObjectURL(url))
			// Note: these loaders are memoized, so they may not be reusable after dispose()
			DRACO_LOADER.dispose()
			KTX2_LOADER.dispose()
			useLoader.clear(GLTFLoader, url)
			if (loadedScene) cleanup(loadedScene)
			Cache.clear()
		}
	}, [url])

	return content
}

export default useModelLoader

I’m also thinking of running a server-side NodeIO (gltf-transform) to further simplify textures, which is one of the reasons why I want to store models (even in the worst case).

But, returning to the main goal:
Regarding this quote: as I showed above in the picture of the glTF folder with files, I do not know whether the current model loading code, which works for now for users who drop folders, will also work when I fetch the files back from storage (either a .glb or a zip), with the DB holding, as shown above, at least the identifier (possibly defined as a hash) and a URL linking to the storage. Although, in theory, how would the cases differ, whether a file or set of files comes from the client or almost the same set comes from somewhere else?
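
One way to make the two cases look the same to the loader is a small adapter that turns a stored file set into the same inputs a drag-and-drop produces. This is a sketch with hypothetical names (`StoredFile`, `LoaderInput`, `toLoaderInput`); only the field names `rootFile`, `rootPath`, `fileType` and `assetMap` and the `'gltf/glb'` / `'fbx'` values are taken from the hook above. The payload type is left generic so it can be a File, a Blob, or raw bytes.

```typescript
interface StoredFile<T> {
  path: string; // path inside the stored folder or unzipped archive
  data: T;
}

interface LoaderInput<T> {
  rootFile: T;
  rootPath: string;
  fileType: string;
  assetMap: Map<string, T>;
}

/** Build loader inputs from files fetched back from storage. */
function toLoaderInput<T>(files: StoredFile<T>[]): LoaderInput<T> {
  const root = files.find((f) => /\.(glb|gltf|fbx)$/i.test(f.path));
  if (!root) throw new Error("no .glb, .gltf or .fbx file in the stored set");
  const slash = root.path.lastIndexOf("/");
  const assetMap = new Map<string, T>();
  for (const f of files) assetMap.set(f.path, f.data);
  return {
    rootFile: root.data,
    rootPath: slash >= 0 ? root.path.slice(0, slash + 1) : "",
    fileType: /\.fbx$/i.test(root.path) ? "fbx" : "gltf/glb",
    assetMap,
  };
}
```

If the stored set keeps the same relative paths as the original upload, the URL modifier in the hook should not need to know where the files came from.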

3)

If we are talking about assets/modifications that I would store, for example, in the UserModels table in the DB, then I think this is a completely different question, much more complicated than the previous ones. At the very least, I will probably need to track the current changes to the model and somehow bundle them per model so that they are applied correctly on subsequent loads.
Unfortunately, I am still far from fully understanding how this part could be implemented.

In general
I would like to focus on the following simple flow:

dropping a model on the client (which already works with the code above)
→ identifying it (at least some way to determine whether it matches an already stored model)
→ saving a single .glb file in storage or, what I really want, a glTF folder with its files (apparently as a zip or as a converted .glb)
→ creating a corresponding entry in the DB (for now maybe just an ID and the URL of the data in storage)
→ showing the user something like a history in their profile, where pressing a button resumes the saved model on the canvas (that is, taking the URL from storage, unzipping the zip or taking the .glb as needed, and bringing it to the canvas; I hope the loading code I posted will more or less be able to do this, and I’ll also see how .resolveURL can help me)
→ displaying the model (in this case I will make sure the save button no longer appears for the current model, unless I implement a way to save changes, so that I can store not only the initial model but also its modifications)

Finally, I will briefly describe my understanding of your proposal.

You are saying that the files received at the very beginning should be exported via GLTFExporter (if the original is not a .glb) to turn them into a single binary file, and that this file should then be hashed; or, as the option you mentioned, a zip file is made instead of a .glb and hashed. Then I query the database to check whether that hash already exists (clearly I will not handle the case where the same textures are embedded in two different .glb files). If it does not, I simply save this version in storage and create a record with the hash as its ID (apparently). And this does not take any metadata into account.

Do you really need such a complex system to detect duplicate files?

I’ve asked ChatGPT to estimate a full monthly budget based on the following:

  • Storage: 1 TB
  • Traffic (egress): 10 TB
  • Requests: 100 million HTTP GETs
  • Region: Primarily North America + Europe
  • Preference: Cheapest solid option

:money_with_wings: Estimated Monthly Costs

| Provider | Storage (1 TB) | Traffic (10 TB) | 100M Requests | Total | Notes |
|---|---|---|---|---|---|
| Bunny.net | $10.00 | $100.00 | $1.00 | $111.00 | Fast, built-in storage |
| Cloudflare R2 | $15.00 | $0.00 | Free | $15.00 | If all traffic via CF CDN |
| Backblaze B2 + CF | $5.00 | $0.00 | Free | $5.00 | Lowest cost, needs setup |
| Amazon S3 + CloudFront | $23.00 | $85.00 | $4.00 | $112.00 | Enterprise-grade |
| KeyCDN | $10.00 | $40.00 | $1.00 | $51.00 | Simple pricing |

The simplest, affordable setup:

  • $10.00 storage (1 TB)
  • $0.01/GB traffic → $100.00 for 10 TB (10,000 GB)
  • $0.01 per million requests → $1.00 for 100M
  • Total: ~$111.00/mo

:package: Estimated glTF Files Fit in 1 TB

| glTF Type | Avg Size | # in 1 TB |
|---|---|---|
| Low-poly (no textures) | 500 KB | ~2M |
| Tiny textures | 1 MB | ~1M |
| Compressed textures | 3 MB | ~333K |
| Medium textures | 5 MB | ~200K |
| Full textures | 10 MB | ~100K |
| Animated/full scene | 25 MB | ~40K |

:clamp: Estimated compressed glTF (.glb) in 1 TB

| Asset Type | Avg Size | # in 1 TB |
|---|---|---|
| Low-poly (Draco) | 100–300 KB | ~3.3M–10M |
| Stylized (small textures) | 500 KB | ~2M |
| Mid-poly (Draco+WebP) | 1–2 MB | ~500K–1M |
| Optimized detail (2K maps) | 3–5 MB | ~200K–333K |
| Characters/scenes | 8–10 MB | ~100K–125K |
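
The file counts in both tables are plain division of the storage budget by the average file size. A small sketch, assuming decimal units (1 TB = 10^9 KB); `filesThatFit` is a hypothetical helper name.

```typescript
/** Approximate number of files of a given average size that fit in a storage budget. */
function filesThatFit(avgSizeKB: number, storageTB: number = 1): number {
  const KB_PER_TB = 1e9; // decimal units: 1 TB = 10^9 KB
  return Math.floor((storageTB * KB_PER_TB) / avgSizeKB);
}
```

For example, `filesThatFit(500)` gives 2,000,000 for the 500 KB rows and `filesThatFit(25000)` gives 40,000 for the 25 MB row.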

Yes, thank you for your reply. I understand that the costs may be small, but this is an educational project, so I was primarily interested in how such a feature could be implemented rather than in its benefits.

You may enjoy this cautionary primer:
