Scaling up, followed by truncation

I know this has been briefly discussed before - in the context of floating-point imprecision issues for Boolean operations in gkjohnson/three-bvh-csg

It’s not caused by three-bvh-csg. It is caused by the way floating-point numbers are stored on conventional digital computers (forget Quantum, etc!!!).

The idea is: If I know the min/max range of floats used in my program, I might be able to scale up all these floats (by some power of 10) and truncate the results to produce integers, and thereafter work in this integer world for my critical calculations. The issues to consider are (a) the numerical accuracy required and (b) whether the resulting integer range can be addressed with the number of bits I have for integer representation.
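A minimal sketch of that idea in JavaScript (all names are my own). One caveat worth noting: bare truncation can be off by one when the product lands just below an integer (e.g. `0.29 * 1e6` evaluates to `289999.99999999994`), so rounding is used here instead.

```javascript
// Fixed-point sketch: scale by a power of 10, round to an integer,
// do arithmetic in the integer domain, then scale back down.

const SCALE = 1e6; // 6 decimal digits of precision (an assumed choice)

function toFixed(x) {
  // Math.round, not Math.trunc: truncation can drop to the integer
  // below when the scaled value is e.g. 289999.99999999994
  return Math.round(x * SCALE);
}

function fromFixed(i) {
  return i / SCALE;
}

// (b) range check: JS numbers represent integers exactly up to 2^53 - 1
function rangeFits(min, max) {
  return Math.abs(min * SCALE) <= Number.MAX_SAFE_INTEGER &&
         Math.abs(max * SCALE) <= Number.MAX_SAFE_INTEGER;
}

// Example of (a): addition is exact in the integer domain
const sum = fromFixed(toFixed(0.1) + toFixed(0.2)); // avoids 0.30000000000000004
```
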

Does anyone have experience in doing stuff like this? Would like to hear, please.


Yeah! I just dealt with some data the other day that was storing latitude and longitude in a 32-bit integer format.
Funny you mentioned “quantum” because what you’re describing is also called “quantizing”. :smiley: Mapping some arbitrary float to a range of integers and back again.
Textures do this to map 0 to 1 rgb values to bytes.
It can be used like a form of lossy compression… where you store the min/max of some set of numbers, and then store the numbers in some unsigned integer range… and then

let decoded = (value / 65535) * (max - min) + min; // value stored as a uint16

The gltf meshopt compression tool uses this exact technique to “compress” meshes by crunching them down into unsigned integer formats, then using the meshes parent transform to scale/translate them back into the original dimensions.
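Here's a sketch of that min/max quantization round trip into a uint16 range, in the spirit of the meshopt approach described above (not its actual code; function names are my own):

```javascript
// Quantize floats into uint16, remembering the min/max needed to decode.
function quantize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const scale = max > min ? 65535 / (max - min) : 0;
  const q = new Uint16Array(values.length);
  for (let i = 0; i < values.length; i++) {
    q[i] = Math.round((values[i] - min) * scale);
  }
  return { q, min, max }; // min/max must travel with the data
}

// Map each uint16 back into the original [min, max] range.
function dequantize({ q, min, max }) {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) {
    out[i] = (q[i] / 65535) * (max - min) + min;
  }
  return out;
}
```

The round trip is lossy, but the error is bounded by half a quantization step, i.e. `(max - min) / 65535 / 2` — which is why this works well for mesh positions that are later rescaled by the parent transform.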
