I am very happy that I got the opportunity to work on a website that is gesture-based.
I have a few inspirations for this: link
I visited a lot of websites and googled it; Wikipedia and GitHub also didn't help much. There is not much information available, as these technologies are still at a nascent stage.
I think I will have to use some JS for this project:
- gesture.js (our custom JavaScript code)
- reveal.js (framework for the slideshow)
My questions are: how do gestures generate events, and how does my JavaScript interact with my webcam? Do I have to use some API or algorithm?
I am not asking for code; I am just asking about the mechanism, or some links providing vital info will do. I seriously believe that if the accuracy of this technology can be improved, it can do wonders in the near future.
To enable gestural interactions in a web app, you can use navigator.getUserMedia() to get video from your local webcam, periodically put video frame data into a canvas element and then analyse changes between frames.
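As a minimal sketch of that capture loop (using the newer promise-based navigator.mediaDevices.getUserMedia(), which has replaced the older navigator.getUserMedia(); the 100 ms interval and variable names are just illustrative):

    // Grab the webcam video and periodically copy frames into a canvas
    // so their pixels can be read and compared.
    const video = document.createElement('video');
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');

    navigator.mediaDevices.getUserMedia({ video: true }).then(function (stream) {
      video.srcObject = stream;
      video.play();
    });

    setInterval(function () {
      if (video.videoWidth === 0) { return; }                 // video not ready yet
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      ctx.drawImage(video, 0, 0);
      const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
      // frame.data is an RGBA byte array you can compare with the previous frame
    }, 100);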
There are several JavaScript gesture libraries and demos available (including a nice slide controller). For face/head tracking you can use libraries like headtrackr.js: example at simpl.info/headtrackr.
I'm playing a little bit with this at the moment, so from what I understand, the most basic technique is:
- you request access to the user's webcam to capture video.
- when permission is given, you create a canvas into which you put the video.
- you apply a filter (black and white) to the video.
- you place some control points in the canvas frame (small areas in which all the pixel colors are registered)
- you attach a function that runs for each frame (for the purpose of this explanation, I'll only demonstrate left-right gestures)
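A rough sketch of those setup steps could look like this; the helper names (toGrayscale, controlPoints, onEachFrame) are mine, not from any library, and the control-point positions are arbitrary:

    // Black-and-white filter: average the RGB channels of a frame in place.
    function toGrayscale(imageData) {
      const d = imageData.data;                               // RGBA bytes
      for (let i = 0; i < d.length; i += 4) {
        const gray = (d[i] + d[i + 1] + d[i + 2]) / 3;
        d[i] = d[i + 1] = d[i + 2] = gray;
      }
      return imageData;
    }

    // Control points: small areas of the frame whose pixels we will watch.
    const controlPoints = [
      { x: 40, y: 120 },
      { x: 160, y: 120 },
      { x: 280, y: 120 }
    ];

    // Attach a function that runs for every frame.
    function onEachFrame(callback) {
      (function loop() {
        callback();
        requestAnimationFrame(loop);
      })();
    }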
At each frame:
- If the frame is the first one (F0), continue
- If not, we subtract the previous frame's pixels (F(n-1)) from the current frame's (Fn)
- if there was no movement between Fn and F(n-1), all the pixels will be black
- if there was, you will see the difference Delta = Fn - F(n-1) as white pixels
- Then you can test your control points to see which areas are lit up, and store them
( ** )x = DeltaN
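Here is one way the per-frame subtraction could be sketched (my own helper names again; the movement threshold of 30 is arbitrary, only the red channel is compared since the frames are already grayscale, and a real version would sample the whole control-point area rather than a single pixel):

    let previousFrame = null;                                 // F(n-1)

    function frameDelta(currentFrame) {                       // currentFrame = Fn, an ImageData
      if (previousFrame === null) {                           // first frame F0: nothing to compare yet
        previousFrame = currentFrame;
        return null;
      }
      const delta = new Uint8ClampedArray(currentFrame.data.length);
      for (let i = 0; i < currentFrame.data.length; i += 4) {
        const moved = Math.abs(currentFrame.data[i] - previousFrame.data[i]) > 30;
        delta[i] = delta[i + 1] = delta[i + 2] = moved ? 255 : 0;   // white = movement
        delta[i + 3] = 255;
      }
      previousFrame = currentFrame;
      return delta;
    }

    // Test which control points are lit up in a delta and return them.
    function litControlPoints(delta, width) {
      return controlPoints.filter(function (point) {
        return delta[(point.y * width + point.x) * 4] === 255;
      });
    }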
Repeat the same operations until you have two or more Delta variables; then subtract the control points of Delta(N-1) from the control points of DeltaN and you'll have a vector:
- ( **)x = DeltaN
- ( ** )x = Delta(N-1)
- ( +2 )x = DeltaN - Delta(N-1)
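A small sketch of how two consecutive deltas could be turned into such a vector (it reuses litControlPoints() from the sketch above and, like the explanation, only looks at left-right motion):

    function motionVector(litNow, litBefore) {                // lit points of DeltaN and Delta(N-1)
      if (litNow.length === 0 || litBefore.length === 0) {
        return { x: 0, y: 0 };                                // not enough information yet
      }
      const avgX = function (points) {
        return points.reduce(function (sum, p) { return sum + p.x; }, 0) / points.length;
      };
      return {
        x: avgX(litNow) - avgX(litBefore),                    // the sign gives the horizontal direction
        y: 0
      };
    }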
You can now test whether the vector is positive or negative, or test whether its values exceed some threshold of your choosing: for example, if it is positive on x and its value is > 5, trigger an event, then listen for it:

    $(document).trigger('MyPlugin/MoveLeft', values);
    $(document).on('MyPlugin/MoveLeft', doSomething);
You can greatly improve the precision by caching the vectors, or by adding them up and only triggering an event once the accumulated vector reaches a sensible value.
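For example, the accumulation idea could be sketched like this (the threshold of 20 is an arbitrary "sensible value", and the MoveRight event name is my own addition alongside the MoveLeft one used above):

    let accumulated = { x: 0, y: 0 };

    function reportMotion(vector) {
      accumulated.x += vector.x;
      if (Math.abs(accumulated.x) > 20) {                     // only fire once the value is sensible
        const name = accumulated.x > 0 ? 'MyPlugin/MoveLeft'  // same convention as above
                                       : 'MyPlugin/MoveRight';
        $(document).trigger(name, [accumulated]);
        accumulated = { x: 0, y: 0 };                         // reset after firing
      }
    }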
You can also expect a shape in your first subtractions and try to map a "hand" or a "box", then listen for changes in the shape's coordinates; but remember the gestures are in 3D while the analysis is 2D, so the same shape can change while moving.
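A very rough sketch of the "box" idea: find the bounding box of the white (moving) pixels in a delta, then watch how that box's coordinates change from frame to frame:

    function boundingBox(delta, width, height) {
      let minX = width, minY = height, maxX = -1, maxY = -1;
      for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
          if (delta[(y * width + x) * 4] === 255) {           // white pixel = movement
            if (x < minX) { minX = x; }
            if (x > maxX) { maxX = x; }
            if (y < minY) { minY = y; }
            if (y > maxY) { maxY = y; }
          }
        }
      }
      if (maxX === -1) { return null; }                       // nothing moved in this delta
      return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
    }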
Here's a more precise explanation. Hope my explanation helped.