Computer vision in the browser

C++

Python

Java...

AreWeFastYet.com

CCV

- C language

http://libccv.org/

Commonly used for facial and object detection

Very well known

Uses the C language

Wrappers exist in other languages

openCV.org

Python wrapper for OpenCV

1. Using the Canvas element, you can analyze images in the browser

 <canvas></canvas>

$('#example').mousemove(function(e) {
    var pos = findPos(this);
    var x = e.pageX - pos.x;
    var y = e.pageY - pos.y;
    var coord = "x=" + x + ", y=" + y;
    var c = this.getContext('2d');

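    // MAGIC HAPPENS HERE: read the single pixel under the cursor as [r, g, b, a]
    var p = c.getImageData(x, y, 1, 1).data;

    // Pad to six hex digits so that, for example, "ff" becomes "#0000ff"
    var hex = "#" + ("000000" + rgbToHex(p[0], p[1], p[2])).slice(-6);
    $('#status').html(coord + "<br>" + hex);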
});
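
The handler above calls `rgbToHex`, which the slide does not define. Below is a minimal sketch of what it might look like, together with a hypothetical `pixelAt` helper that shows how the flat RGBA array returned by `getImageData` is indexed. Both functions and the synthetic `sample` image are illustrative assumptions, not the speaker's code.

```javascript
// Hypothetical helper: pack three 0-255 channels into a hex string.
// Leading zeros are not padded here; the handler above pads with
// ("000000" + ...).slice(-6).
function rgbToHex(r, g, b) {
  if (r > 255 || g > 255 || b > 255) {
    throw new Error("Invalid color component");
  }
  return ((r << 16) | (g << 8) | b).toString(16);
}

// Hypothetical helper: read one pixel from an ImageData-like object
// ({ width, height, data }). The data array holds four bytes per pixel
// (red, green, blue, alpha), stored row by row.
function pixelAt(imageData, x, y) {
  var i = (y * imageData.width + x) * 4;
  var d = imageData.data;
  return { r: d[i], g: d[i + 1], b: d[i + 2], a: d[i + 3] };
}

// Synthetic 2x1 image: one red pixel followed by one blue pixel.
var sample = {
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 255])
};
var p = pixelAt(sample, 1, 0);
var hex = "#" + ("000000" + rgbToHex(p.r, p.g, p.b)).slice(-6);
console.log(hex); // "#0000ff"
```

In the browser, `sample` would be replaced by the object returned from `getContext('2d').getImageData(...)`, which has the same `width`, `height`, and `data` fields.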

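// Namespace for the image filters (created here if it does not exist yet)
var Filters = Filters || {};

Filters.grayscale = function(pixels, args) {
  var d = pixels.data;
  // Each pixel occupies four consecutive bytes: red, green, blue, alpha
  for (var i=0; i<d.length; i+=4) {
    var r = d[i];
    var g = d[i+1];
    var b = d[i+2];
    // CIE luminance for the RGB
    // The human eye is bad at seeing red and blue, so we de-emphasize them.
    var v = 0.2126*r + 0.7152*g + 0.0722*b;
    d[i] = d[i+1] = d[i+2] = v;
  }
  return pixels;
};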
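
To see what the luminance weights do in practice, the same formula can be run on a tiny synthetic image. This is a sketch under assumptions: `toGrayscale` is a standalone copy of the formula above, and `swatch` is a hand-built stand-in for the `ImageData` object a canvas would return.

```javascript
// Standalone copy of the luminance formula, so this sketch runs
// without the Filters object or a canvas.
function toGrayscale(pixels) {
  var d = pixels.data;
  for (var i = 0; i < d.length; i += 4) {
    var v = 0.2126 * d[i] + 0.7152 * d[i + 1] + 0.0722 * d[i + 2];
    d[i] = d[i + 1] = d[i + 2] = v; // alpha (d[i + 3]) is left untouched
  }
  return pixels;
}

// Synthetic 3x1 image: pure red, pure green, pure blue (all opaque).
var swatch = {
  width: 3,
  height: 1,
  data: new Uint8ClampedArray([
    255, 0, 0, 255,
    0, 255, 0, 255,
    0, 0, 255, 255
  ])
};

toGrayscale(swatch);

// Uint8ClampedArray rounds each stored value to the nearest integer.
console.log(swatch.data[0], swatch.data[4], swatch.data[8]); // 54 182 18
```

Green ends up far brighter than red or blue, which is the de-emphasis the comment in the filter describes. On a real canvas the result would be written back with `putImageData`.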

http://harthur.github.io/kittydar/

Kittydar takes an image (canvas) and tells you the locations of all the cats in the image:

var cats = kittydar.detectCats(canvas);

console.log("there are", cats.length, "cats in this photo");
console.log(cats[0]);

// { x: 30, y: 200, width: 140, height: 140 }
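
Because each detection is a plain rectangle, the results can be drawn straight back onto the canvas. The sketch below assumes the `{ x, y, width, height }` shape shown above; `outlineDetections` is a hypothetical helper name, not part of Kittydar.

```javascript
// Hypothetical helper: outline each detection on a 2D canvas context.
// `rects` is assumed to be an array of { x, y, width, height } objects.
// Returns the number of rectangles drawn.
function outlineDetections(ctx, rects, color) {
  ctx.strokeStyle = color || "red";
  ctx.lineWidth = 2;
  rects.forEach(function (r) {
    ctx.strokeRect(r.x, r.y, r.width, r.height);
  });
  return rects.length;
}

// Browser usage (assumes `canvas` already holds the photo):
//   var cats = kittydar.detectCats(canvas);
//   outlineDetections(canvas.getContext('2d'), cats);
```

Taking the context as a parameter keeps the helper independent of any particular detector, so the same function could outline faces or other objects that are reported as rectangles.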

2. Using WebRTC and getUserMedia() you can manipulate real-time images

jsfeat

http://inspirit.github.io/jsfeat/

http://auduno.github.io/clmtrackr/clm_image.html
<canvas id="inputCanvas" width="320" height="240" style="display:none"></canvas>
<video id="inputVideo" autoplay loop></video>
<script type="text/javascript">
  var videoInput = document.getElementById('inputVideo');
  var canvasInput = document.getElementById('inputCanvas');

  var htracker = new headtrackr.Tracker();
  htracker.init(videoInput, canvasInput);
  htracker.start();
</script>
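
Libraries like the one above handle the camera for you. For reference, here is a rough sketch of the underlying steps using the standard `navigator.mediaDevices.getUserMedia` API: attach the camera stream to a `<video>` element, then copy the current frame onto a canvas to get at its pixels. The function names `startCamera` and `grabFrame` are hypothetical, and the optional `mediaDevices` parameter exists only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: ask for camera access and attach the stream to a
// <video> element. Returns a promise that resolves with the stream.
// `mediaDevices` defaults to the browser's navigator.mediaDevices.
function startCamera(video, mediaDevices) {
  var devices = mediaDevices || navigator.mediaDevices;
  return devices
    .getUserMedia({ video: true, audio: false })
    .then(function (stream) {
      video.srcObject = stream; // the autoplay attribute starts playback
      return stream;
    });
}

// Hypothetical helper: copy the current video frame onto a canvas and
// return its pixels as an ImageData object (flat RGBA array).
function grabFrame(video, canvas) {
  var ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  return ctx.getImageData(0, 0, canvas.width, canvas.height);
}

// Browser usage (element ids taken from the markup above):
//   var video = document.getElementById('inputVideo');
//   var canvas = document.getElementById('inputCanvas');
//   startCamera(video).then(function () {
//     var frame = grabFrame(video, canvas); // analyze frame.data here
//   });
```

Calling `grabFrame` repeatedly, for example from `requestAnimationFrame`, gives a stream of frames that can be fed to any of the per-pixel techniques from part 1.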

Kittydar.js

clmtrackr.js

ConvNet.js

Cool things

- Photos
	- face detection https://github.com/davidsandberg/facenet
	- real time https://github.com/cmusatyalab/openface
- Sound
	- sound detection http://projects.csail.mit.edu/soundnet/
- Video
	- scene detection http://pyscenedetect.readthedocs.io/en/latest/examples/usage-example/
	- real time http://pjreddie.com/darknet/yolo/
- Maps
	- driving http://www.cs.princeton.edu/~aseff/mapnet/
- Great resources
	- TensorFlow for Poets - https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0
	- Deep Learning book
	- Pygurus
- Pointers
	- AWS GPU units
	- Awesome AMI - https://deepdetect.com/
	- Executing stuff on AWS Spot instances

Computer Vision in the Browser

By Leonard Bogdonoff

Presented on 4/16/14 for NYC HTML5 app meetup at Conde Nast.
