Computer vision in the browser





 














C++


Python

Java...

W3C





AreWeFastYet.com







CCV


- C language

http://libccv.org/


Commonly used for face and object detection

 
Very well known
Uses the C language
Wrappers exist in other languages

openCV.org


Python wrapper for OpenCV






1. Using the Canvas element, you can analyze images in the browser


 <canvas></canvas>
 
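$('#example').mousemove(function(e) {
    // Mouse position relative to the canvas (findPos gives the canvas's page offset)
    var pos = findPos(this);
    var x = e.pageX - pos.x;
    var y = e.pageY - pos.y;
    var coord = "x=" + x + ", y=" + y;
    var c = this.getContext('2d');

    // MAGIC HAPPENS HERE: read the RGBA values of the single pixel under the cursor
    var p = c.getImageData(x, y, 1, 1).data;

    // Left-pad to six hex digits in case rgbToHex returns fewer
    var hex = "#" + ("000000" + rgbToHex(p[0], p[1], p[2])).slice(-6);
    $('#status').html(coord + "<br>" + hex);
});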
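The handler above calls two helpers, `findPos` and `rgbToHex`, that the slide does not define. A minimal sketch of what they might look like, assuming `findPos` returns the element's page offset as `{x, y}` and `rgbToHex` returns an unpadded hex string (which would explain why the handler pads with zeros):

```javascript
// Hypothetical helpers: the slide uses these names but does not show their bodies.

// Walk up the offsetParent chain to find the element's position on the page.
function findPos(el) {
  var x = 0;
  var y = 0;
  while (el) {
    x += el.offsetLeft || 0;
    y += el.offsetTop || 0;
    el = el.offsetParent;
  }
  return { x: x, y: y };
}

// Pack three 0-255 channel values into one number and print it in base 16.
// The result is not zero-padded, so the caller has to pad it to six digits.
function rgbToHex(r, g, b) {
  if (r > 255 || g > 255 || b > 255) {
    throw new Error("Invalid color component");
  }
  return ((r << 16) | (g << 8) | b).toString(16);
}
```

With these in place, a pure blue pixel would yield `"ff"` from `rgbToHex`, which the handler turns into `#0000ff`.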

Filters.grayscale = function(pixels, args) {
  var d = pixels.data;
  for (var i=0; i<d.length; i+=4) {
    var r = d[i];
    var g = d[i+1];
    var b = d[i+2];
    // CIE luminance for the RGB
    // The human eye is bad at seeing red and blue, so we de-emphasize them.
    var v = 0.2126*r + 0.7152*g + 0.0722*b;
    d[i] = d[i+1] = d[i+2] = v; // alpha (d[i+3]) is left unchanged
  }
  return pixels;
};
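The filter works on an object shaped like the `ImageData` that `getImageData` returns, so applying it to a canvas is a read, filter, write-back round trip. A rough sketch, assuming `Filters` is a plain namespace object (the slide does not show its definition) and using a hypothetical `applyGrayscale` helper:

```javascript
// Assumption: Filters is a plain namespace object; its definition is not on the slide.
var Filters = {};

// Same CIE luminance weighting as the slide's filter.
Filters.grayscale = function (pixels) {
  var d = pixels.data;
  for (var i = 0; i < d.length; i += 4) {
    var v = 0.2126 * d[i] + 0.7152 * d[i + 1] + 0.0722 * d[i + 2];
    d[i] = d[i + 1] = d[i + 2] = v; // alpha at d[i + 3] is left untouched
  }
  return pixels;
};

// Hypothetical helper: read the canvas pixels, filter them, and draw them back.
function applyGrayscale(canvas) {
  var ctx = canvas.getContext('2d');
  var pixels = ctx.getImageData(0, 0, canvas.width, canvas.height);
  ctx.putImageData(Filters.grayscale(pixels), 0, 0);
}
```

Because the filter mutates the pixel buffer in place, no second buffer is allocated; keep a copy of the original `ImageData` if the color version is needed later.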
                                        


http://harthur.github.io/kittydar/

Kittydar takes an image (canvas) and tells you the locations of all the cats in the image:

var cats = kittydar.detectCats(canvas);

console.log("there are", cats.length, "cats in this photo");
console.log(cats[0]);

// { x: 30, y: 200, width: 140, height: 140 }
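Since each detection is a plain rectangle, the results can be drawn straight back onto the canvas. A small sketch, assuming every result has the `{ x, y, width, height }` shape shown above; `outlineCats` is a hypothetical helper, not part of kittydar:

```javascript
// Hypothetical helper: outline each detection rectangle on a 2D canvas context.
// Assumes each rect has the { x, y, width, height } shape from the example output.
function outlineCats(ctx, cats) {
  ctx.strokeStyle = 'red';
  ctx.lineWidth = 2;
  cats.forEach(function (cat) {
    ctx.strokeRect(cat.x, cat.y, cat.width, cat.height);
  });
  return cats.length;
}
```

It could be called as `outlineCats(canvas.getContext('2d'), kittydar.detectCats(canvas))`.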



2. Using WebRTC and getUserMedia(), you can manipulate real-time images
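The usual pattern is to stream the webcam into a `<video>` element and then copy individual frames onto a canvas, where they can be analyzed like any still image. A minimal sketch, assuming the current promise-based `navigator.mediaDevices.getUserMedia` (browsers at the time of this talk exposed a prefixed callback version instead); `startCamera` and `grabFrame` are hypothetical helper names:

```javascript
// Ask for the webcam and pipe the stream into a <video> element.
// mediaDevices can be passed in for testing; in a browser it falls back
// to navigator.mediaDevices.
function startCamera(video, mediaDevices) {
  mediaDevices = mediaDevices || navigator.mediaDevices;
  return mediaDevices.getUserMedia({ video: true }).then(function (stream) {
    video.srcObject = stream;
    return video.play();
  });
}

// Copy the current video frame onto a canvas and return its pixels for analysis.
function grabFrame(video, canvas) {
  var ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  return ctx.getImageData(0, 0, canvas.width, canvas.height);
}
```

Calling `grabFrame` inside a `requestAnimationFrame` loop would give a steady supply of frames to hand to a filter or detector.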





jsfeat


http://inspirit.github.io/jsfeat/




http://auduno.github.io/clmtrackr/clm_image.html

<canvas id="inputCanvas" width="320" height="240" style="display:none"></canvas>
<video id="inputVideo" autoplay loop></video>
<script type="text/javascript">
  var videoInput = document.getElementById('inputVideo');
  var canvasInput = document.getElementById('inputCanvas');
  var htracker = new headtrackr.Tracker();
  htracker.init(videoInput, canvasInput);
  htracker.start();
</script>







Kittydar.js
clmtrackr.js
ConvNetJS

Cool things

- Photos
	- face detection https://github.com/davidsandberg/facenet
	- real time https://github.com/cmusatyalab/openface
- Sound
	- sound detection http://projects.csail.mit.edu/soundnet/
- Video
	- scene detection http://pyscenedetect.readthedocs.io/en/latest/examples/usage-example/
	- real time http://pjreddie.com/darknet/yolo/
- Maps
	- driving http://www.cs.princeton.edu/~aseff/mapnet/
- Great resources
	- TensorFlow for Poets - https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0
	- Deep Learning book
	- Pygurus
- Pointers
	- AWS GPU units
	- Awesome AMI - https://deepdetect.com/
	- Executing stuff on AWS Spot instances

Computer Vision in the Browser

By Leonard Bogdonoff

Presented on 4/16/14 for NYC HTML5 app meetup at Conde Nast.
