Generating EdgesGeometry Using Integer Hashes

To address the heavy processing time involved in generating EdgesGeometry for large 3D models, I attempted to improve the performance. By revising the use of hash strings to integer hashes and using Map, I achieved up to 5x faster processing. I’d like to share the details of this improvement and ask for your input and suggestions.

Problem Encountered

In my use case, generating EdgesGeometry for a large model took approximately 2500ms. To reduce this processing time, I investigated how EdgesGeometry works.

Investigation

Using Chrome’s performance tools, I analyzed the processing time of the EdgesGeometry constructor.

String Comparison Causes High CPU Usage

The investigation revealed that hash string comparisons were consuming more CPU resources than expected.

    // skip degenerate triangles
    if (hashes[0] === hashes[1] || hashes[1] === hashes[2] || hashes[2] === hashes[0]) {
        continue;
    }

Similarly, significant time was also spent on retrieving edge data using hash strings as keys.

    if (reverseHash in edgeData && edgeData[reverseHash]) {

Consideration of Improvement Ideas

Based on these findings, I considered the following improvement ideas:

  • Shorten hash strings using Base36 encoding.
  • Use integer values for hashes to avoid string comparisons.

Using Base36 encoding made the hash strings shorter and reduced some of the load. However, the hash generation process itself still took considerable time, resulting in limited overall improvement.

    hashes[0] = `${Math.round(a.x * precision).toString(36)},${Math.round(a.y * precision).toString(36)},${Math.round(a.z * precision).toString(36)}`;

Switching to integer hashes provided significant performance gains. Accessing an object with bracket notation required converting keys to strings, which negated the performance advantage. To avoid this, I changed edgeData to use a Map.

Modified Code and Demo

The modified code with integer hashes is available here:

I also created a demo comparing the performance with EdgesGeometry:

https://masatomakino.github.io/colorable-merged-model/demo/demo_FastEdgesGeometory.html

You can open the console to see the time taken to generate the edges.

Performance Improvement

With the modified code, edge generation became approximately 2 to 5 times faster. The impact varies by browser, with Chrome showing a 5x improvement, while Safari showed around 2x. In my use case, edge generation time was reduced from 2500ms to around 500ms, which is a satisfying result.

No Visual Changes

There are no visual changes, so users won’t be affected. On the demo page, you can adjust the distance between two objects to verify that the edges match by setting the distance to zero.

Change in Line Segment Storage Order

The modified code may result in a different storage order for line segments. This is because the order of for...in loops over an object is not guaranteed.

Questions

I’d like to ask for your feedback on this improvement and any other suggestions for further optimization.

Why Are Hashes Strings?

Why does the EdgesGeometry constructor currently use hash strings? Based on my investigation, it seems that integers can’t be used as keys in objects, and Map support was previously incomplete, but are there other reasons? If there are development histories or policies I’m unaware of, I’d appreciate learning about them.

Hash Generation Method

The current hash generation method I used involves multiplying each value by a small prime number and summing them. Although hash collisions remain a possibility, I considered this method sufficient for the EdgesGeometry use case. If there is a better approach, I’d be grateful for your suggestions.

Please share your thoughts. I’m looking forward to any insights or alternative ideas you may have.

If something could go wrong, it will wait for the best moment to manifest its full glory. This is how the Universe works.

Prime numbers in hashes make sense with integer arithmetic and modulo. With floats they do not provide any practical benefit. For example, your choice of factors produce the same hash value for these two normally looking vertices: (1.0, 1.0, 1.0) and (0.2, 3.0, -0.2)

It might be (slightly) better to use some of the transcendental numbers like pi and e, … because in this case you need quite unusual coordinates to get matching hashes. Some transcendentals, like sqrt(2), are not good, because it is trivial to get a coordinate with such value in procedurally generated geometries.

Also, it might be good to allow the user to set custom factors, to be used when the hardcoded factors fail for some very specific custom geometry.

Alternatively, you could add double checks - a first and very fast check is hash1=hash2, and only when they are equal, then make a more sophisticated check.

2 Likes

@PavelBoytchev

Thank you for the advice!

I understood that the method in the first post easily leads to hash collisions. Therefore, I modified the hash function by referring to the HybridTaus function.

Additionally, I added an option for users to set an arbitrary seed value for the hash function.

Regarding double-checking, I did not adopt it because the processing time became enormous, and the performance worsened compared to string hashing.

Through this work, I learned that a hash function that generates finite output from free input cannot avoid collisions. Therefore, my answer to my own question is that integer hashing cannot be a substitute for string hashing.

Thank you once again for your valuable insights.

2 Likes