Is there anything in JS to detect surfaces, like what ARKit or ARCore do?

Is there anything in JavaScript that can let me detect surfaces, so that I can render objects on them for AR experiences?


Since this question is not directly related to three.js, you might have more success asking it on Stack Overflow.


I figured that of all the groups of WebGL programmers on the web, someone in the three.js community would be most likely to know.

Plus I want to render stuff in Three, so there will be intricacies related to that, but maybe those would be separate questions. On the other hand, maybe some people would be interested to know about this for use in their Three projects?


Okay, sounds valid :blush:

The idea of an augmented-reality API is to provide the possibility to spatially register virtual objects in the real world. I don’t think you can emulate this with a simple video stream and some JS code, at least not if you want to work without markers, on arbitrary surfaces.

There is the WebXR spec that is available in Chrome Canary and should be coming soon to Chrome on mobile.



It doesn’t quite work the same as ARCore/ARKit: instead of detecting planes, you use the Hit Test API to raycast against the world for your anchors.
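For anyone curious what that looks like in code, here is a minimal sketch of the hit-test flow as it appears in the draft WebXR Hit Test spec. The API is behind flags and still changing, so method names like `requestHitTestSource` may differ by the time you read this; `positionFromMatrix` is just a hypothetical helper for pulling the translation out of the pose.

```javascript
// Sketch of the draft WebXR Hit Test flow (subject to change while flagged).
// The session function is not invoked here; it needs a browser with
// WebXR + hit-test support.
async function startHitTestSession() {
  const session = await navigator.xr.requestSession('immersive-ar', {
    requiredFeatures: ['hit-test'],
  });
  const viewerSpace = await session.requestReferenceSpace('viewer');
  const refSpace = await session.requestReferenceSpace('local');
  const hitTestSource = await session.requestHitTestSource({ space: viewerSpace });

  session.requestAnimationFrame(function onFrame(time, frame) {
    const hits = frame.getHitTestResults(hitTestSource);
    if (hits.length > 0) {
      const pose = hits[0].getPose(refSpace);
      // pose.transform.matrix is a column-major 4x4 float array.
      console.log('anchor position:', positionFromMatrix(pose.transform.matrix));
    }
    session.requestAnimationFrame(onFrame);
  });
}

// Pure helper: the translation of a column-major 4x4 matrix
// lives at indices 12, 13, 14.
function positionFromMatrix(m) {
  return { x: m[12], y: m[13], z: m[14] };
}
```

You would typically copy that position into a `THREE.Object3D`'s `position` to place your anchor at the hit point.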

As it is in Canary at the moment, it can potentially break with any update, so I usually prevent auto-updates once I get a working version. Also, it’s unlikely to be supported on iOS, because Apple is pushing AR Quick Look instead.


Why not? Would it be too slow in JS?

Interesting! So if I wanted to detect a surface, I’d have to do a raycast sweep across the whole view and generate planes from those points? (It seems wasteful if the underlying engine already has that info.)

I don’t know the exact motivations behind why it’s using hit tests rather than planes. It works more like it detects the mesh of the environment and casts against that. I believe they don’t give you direct access to the mesh for security reasons, which may limit the number of hit tests you can perform as well.

The spec/system is still being worked on, so this has the potential to change over time.

What use case do you specifically want planes for? The hit test works well for placing items in a room but planes would be better if you were substituting carpets/tiles/paint etc…

It would defeat the purpose if we can’t reliably render things within the boundaries of a surface (e.g. water falling off the edge of a table).

I am looking to use web tech (three.js) to create a demo over the next couple of weeks that reliably detects surfaces and places objects. From the videos I’ve seen, it looks like that’s possible, so I’ll tinker with it.

So far I’m having no luck getting any WebXR demos working in Chrome for Android. I’ve enabled every flag that has “WebXR” in it. Maybe there have been API updates and the demos are outdated, but I haven’t checked the console for errors yet.

Any thoughts on WebXRForArCore? It looks like it hasn’t been updated much lately.

It’s not ideal, but you can get the user to point out the boundaries. As far as I’m aware, you can’t just detect a surface yet.

I use Chrome Canary with the WebXR and WebXR HitTest flags turned on, and WebVR turned off.
Chrome for Android is still a little way off.

WebXRForArCore was one of the first implementations, and I don’t think it’s been maintained as actively.


I think so. All the stuff mentioned in the ARCore documentation, like motion tracking, environmental understanding, and light estimation, is not something you want to implement in JS for performance reasons.


3D reconstruction from real-time stereoscopic images using the GPU. Hmmmm, maybe it can be adapted for WebGL. :thinking:

@calrk Thanks, Chrome Canary worked!

Hmm, here’s a much newer GPU-based 3D reconstruction paper. It looks like we could do it in a worker with OffscreenCanvas by transferring the webcam feed over.

Hey @trusktr did you end up being able to detect surfaces with three.js?

In the meanwhile, there is an official example for the WebXR Hit Test API.


Didn’t get to try it yet.


Somewhat related: I’ve looked into PTAM to achieve this. More precisely, into this GitHub - williammc/ptam_plus: PTAM (Parallel Tracking and Mapping) re-released under GPLv3.
And compiled it to JS using Emscripten. Now I would need to write JS bindings and see if it works.
There is also the start of a JS implementation here: GitHub - hitsthings/ptamjs: WIP!! Doesn't work yet (WIP and not functional)
Would be great to push forward on this :slight_smile:


Hey I found this, it’s pretty interesting, but I’m not sure how they achieve it:

I tried to search around with the “SLAM” keyword, however what I found is beyond my knowledge :stuck_out_tongue_winking_eye: just post it here to see if someone can give a clue…

Yes! Well, sort of.
(Keep in mind I’m definitely not a pro in the field.)
But yes, there are tools that detect certain features from congruent pixel colors and certain movements (OpenCV.js and BoofCV are good places to start), but there’s no one overall solution, as far as I can tell from my long searching. You’ve got to bridge the components together to fit your specific needs.

I think all that’s really needed is:

1. Sobel or Canny edge detection.
2. Hough line detection.
3. Optic flow or motion-detection capabilities: find and reinforce lines in your field of view, then group and track them. If they move consistently, reinforce that group and get a motion vector for it.
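Step 1 of that pipeline can be sketched in plain JS without OpenCV. Here is a minimal Sobel edge-magnitude pass over a grayscale buffer (`sobelMagnitude` is a made-up helper name; a real pipeline would read pixels from a canvas `ImageData`):

```javascript
// Sobel edge magnitude over a grayscale image stored as a flat row-major array.
// Border pixels are left at 0 for simplicity.
function sobelMagnitude(gray, width, height) {
  const out = new Float32Array(width * height);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      // Neighbour lookup relative to the current pixel.
      const p = (dx, dy) => gray[(y + dy) * width + (x + dx)];
      // Horizontal gradient: kernel [-1 0 1; -2 0 2; -1 0 1].
      const gx = -p(-1, -1) - 2 * p(-1, 0) - p(-1, 1)
               +  p(1, -1) + 2 * p(1, 0) + p(1, 1);
      // Vertical gradient: the same kernel transposed.
      const gy = -p(-1, -1) - 2 * p(0, -1) - p(1, -1)
               +  p(-1, 1) + 2 * p(0, 1) + p(1, 1);
      out[y * width + x] = Math.hypot(gx, gy);
    }
  }
  return out;
}
```

Thresholding the resulting magnitudes gives the edge map you would then feed into Hough line detection (step 2).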

Generally speaking, as you move in a direction, objects closer to you will cross your field of vision more quickly than objects far away from you.

I’m still thinking there’s a method to resolve planes quicker than by tracking line-group movements…
I’m not sure of the method to derive distance from motion parallax.
I believe the methods I’ve outlined are called “structure from motion” and “dense planar SLAM”.
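On deriving distance from motion parallax: under a simplified pinhole model where the camera translates sideways by a known amount, it reduces to the same relation as stereo disparity, Z = f·T/d. A toy sketch (the function name and numbers are made up; a real structure-from-motion tracker is far more involved):

```javascript
// Depth from motion parallax under a pinhole model:
// the camera translates sideways by baselineM metres between two frames,
// and a world point's image shifts by disparityPx pixels. With the focal
// length focalPx expressed in pixels:  Z = f * T / d.
function depthFromParallax(focalPx, baselineM, disparityPx) {
  if (disparityPx === 0) return Infinity; // no shift => point at infinity
  return (focalPx * baselineM) / disparityPx;
}
```

For example, with a 500 px focal length and a 0.1 m sideways move, a point whose image shifts 10 px is 5 m away, while one that shifts 50 px is only 1 m away, which is exactly the "closer objects cross your view faster" effect described above.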

Microsoft’s KinectFusion uses the same methods;

I really think the depth mapping only helped refine the small details of structures, while the overall realization of the environment is created by joining the above-mentioned methods.

I’m working on combining three scripts right now regarding this matter, and found this post while researching some snags.
I’d love to share what I’m working on; if you can give me a day-ish, I’ll post what I have for you.

I can’t believe there isn’t a JS solution for all this.
Yeah, it feels a bit heavy and slow to process, but still.


Will be awesome to see! I think it would be interesting to port a solution to AssemblyScript (TypeScript that compiles to WebAssembly).

I am so sorry. I’ve been so busy with things, and every time I sit down to progress my coding I find myself studying more and more efficient options that I’d rather build off of now instead of refining later. But I do want to add more detail to my post.
I love three.js, but it hasn’t offered me many out-of-the-box (or close to it) solutions to match ARCore’s abilities… but:

1. It has been very useful in building “holoportation” point clouds from the colored video depth-map streams I receive of users. Add some octree optimization and I’m happy. (I still need to pose-estimate rigs and fuse them, for each video/angle I receive, to get a full 360° point cloud of the users.) But three.js handles that great!
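Building a point cloud from a depth-map stream, as described above, mostly comes down to unprojecting each depth pixel through the camera intrinsics. A minimal sketch (`fx`/`fy`/`cx`/`cy` are assumed pinhole intrinsics; a real version would write into a `THREE.BufferGeometry` position attribute rather than an array of objects):

```javascript
// Unproject a depth map into camera-space 3D points via a pinhole model.
// depth: flat row-major array of metres; fx/fy: focal lengths in pixels;
// cx/cy: principal point in pixels. Invalid (<= 0) readings are skipped.
function depthToPoints(depth, width, height, fx, fy, cx, cy) {
  const points = [];
  for (let v = 0; v < height; v++) {
    for (let u = 0; u < width; u++) {
      const z = depth[v * width + u];
      if (z <= 0) continue; // skip holes in the depth map
      points.push({
        x: ((u - cx) * z) / fx, // back-project through the intrinsics
        y: ((v - cy) * z) / fy,
        z,
      });
    }
  }
  return points;
}
```

Color each point from the corresponding video pixel and you have the per-frame cloud; fusing the clouds from multiple angles is the separate registration problem mentioned above.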

2. I prefer three.js by itself for handling stereo VR through the VRButton, but A-Frame gives me clear GPS data, AR tags, and a little more structure, and using Super Hands along with it pretty much covers physics and manipulation of objects. I find A-Frame glitchy though, and a little slow, and it hinders my access to three.js at times. AND I had to create my own power-of-two canvas video shader in three.js to achieve see-through AR for A-Frame.

3. (Oh boy. Gulp.) So, ARCore uses motion/point tracking (optic flow?) and measures how fast it sees pixels move across the screen in relation to each other. If you move sideways while looking ahead, objects FAR away appear to move SLOWER than ones that are CLOSER; this is called motion parallax. I believe it finds and groups points that appear to be moving at similar rates, then builds depth positions for each group according to the different rates.
It also gets (1) an overall average pixel speed on screen and (2) a flow direction, which it checks against (3) accelerometer and (4) gyro data.
By themselves the four are not accurate enough, but when they are used together (aka sensor fusion) you can begin to plot your XYZ location pretty accurately and get the angle your camera is facing (pose estimation).
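The sensor-fusion step above can be illustrated with the classic one-axis complementary filter: trust the integrated gyro rate in the short term, and bleed in the accelerometer's gravity-derived angle to cancel the gyro's drift. (A toy sketch only; ARCore's actual filtering is considerably more sophisticated.)

```javascript
// One-axis complementary filter.
// prevAngle: last fused angle (rad); gyroRate: angular velocity (rad/s);
// accelAngle: absolute angle derived from the accelerometer's gravity
// vector (rad); dt: timestep (s). alpha near 1 trusts the gyro short-term
// while the (1 - alpha) accelerometer term slowly corrects drift.
function complementaryFilter(prevAngle, gyroRate, accelAngle, dt, alpha = 0.98) {
  return alpha * (prevAngle + gyroRate * dt) + (1 - alpha) * accelAngle;
}
```

In a browser you would feed this from the `devicemotion` event (rotation rate plus gravity) each frame.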

three.js offers some vector and matrix math that can be useful in working all that out, BUT…
OpenCV is pretty much the standard for point/motion tracking. It’s focused on C++, but they also have some Java and JS functions.
AND THEN (new to me) there’s JSFeat, focused purely on JS computer vision.
I’ll just jump right in here and let you check out this example. Be sure to hit train_pattern; if I’m not mistaken, you’re given a reference frame (big green square) of your location when you trained (moving the actual camera around should bring it into view).
I do think that can be considered a solid first step for JS-based simultaneous localization and mapping (SLAM), all in one little script. (I’ve looked forever for this. Now I want to combine that with optimized Canny edge detection, because it appears a bit slow when working with the full video frame.)

It also has a RANSAC motion-estimator function and all other kinds of goodies.
Really, if you have fairly stable z-depth readings, plus XYZ and pose estimation with a dependable reference frame in space like that… I do believe you could do most of what ARCore does.
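The RANSAC idea mentioned above is: fit a model to minimal samples and keep the one that the most points agree with. Here is a deterministic toy variant for line fitting (it tries every point pair instead of random samples so the output is reproducible; real RANSAC, including JSFeat's, samples randomly for speed):

```javascript
// Consensus line fit in the spirit of RANSAC: every pair of points
// proposes a candidate line y = m*x + b, and the line with the most
// inliers (points within `tol` of it) wins. Outliers get outvoted.
function consensusLine(points, tol = 0.5) {
  let best = { inliers: 0, m: 0, b: 0 };
  for (let i = 0; i < points.length; i++) {
    for (let j = i + 1; j < points.length; j++) {
      const p = points[i];
      const q = points[j];
      if (p.x === q.x) continue; // skip vertical candidates in this toy version
      const m = (q.y - p.y) / (q.x - p.x);
      const b = p.y - m * p.x;
      const inliers = points.filter(
        (r) => Math.abs(r.y - (m * r.x + b)) < tol
      ).length;
      if (inliers > best.inliers) best = { inliers, m, b };
    }
  }
  return best;
}
```

The same consensus principle is what makes motion estimation robust: a few badly tracked points simply lose the vote.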

4. Environmental understanding. ARCore attempts to find things like planar surfaces and squares, such as a wall or a table top, and I suspect object detection too. Say it recognizes a chair or a table: it will roughly attempt to put a corresponding basic model in that place, so that it can interact with it virtually. Like, it could place a virtual pot of gold on your table and say “Look!!! On the table. It’s gold!!!” Then POOF!!, the gold’s gone, and you could be amazed by ARCore’s meanness and question whether it should remain on your phone after that…
… because instead you could just use this free web-based JSFeat equivalent without the hassle and the tricks.

5. It’s aware of lighting, angles, and shadows, and attempts to rebuild an HDRI skymap light equivalent. How? Not sure. Probably like reversed “texture baking”? Maybe it checks for the brightest pixels in 360° within the scene, then builds another pot of gold, takes a guess at lighting it, and sees if the shadows and angles match up as predicted. I’m not sure, but I’m guessing we’re OK for now on that.

In conclusion:

point tracking
motion tracking
accel/gyro data
ORB tracking/training
Canny edge detection
optimize, optimize, optimize

JSFeat (OpenCV?)
and of course
three.js (it’s always been the glue that binds for me. <3)
(…oh, and everything will disappear once it’s out of view, unless you stitch scans together with point registration/ICP and store that point-cloud data somewhere quickly and freely accessible)

But baby steps.
I don’t normally post in forums. Like, ever… ever.
I don’t know if I’m the bull in the china shop today, or if I hurt feelings or broke any belongings. I’m really just a noob with a big passion. I dreamt of this stuff when I was 5, and 40 years later I’m realizing it’s finally here, and I CAN actually play a part in it.

and ! with JS
and THREE.js

thank you for tolerating me.
One last question…

I see you pretty much every day as I search the net for answers about three.js, A-Frame, p5.js, OpenCV, and… the weather.
And you’re always answering questions. Like, ALL of them.
Even mine,
because someone else asked it and you answered.

So… like, who are you exactly?
Is this a casual or a professional passion?
Do you own Mr.doob?
Is anybody FORCING you to answer these questions against your will?

(And some day I will share my achievements here, soon. I’m shy, and still workin’.)
