Michael Holroyd
Computer Graphics Ph.D.
CS6501 - Feb 2015
Could try to triangulate every pixel in a camera...
What if you don't know where the cameras are located?
5 unknowns for each camera
(x,y,z) position and (θ,φ) rotation
called the camera extrinsics
What if you don't know how 3D points (x,y,z) map to
2D points (x,y) in the image?
5 more unknowns for each camera:
focal point (f), principal point (cx,cy),
radial distortion (polynomial approximation with 3 parameters)
called the camera intrinsics
So the entire problem boils down to finding corresponding points in multiple images.
What makes a "good feature"?
How to compare two similar features?
Pixel-by-pixel metrics only work for EXACT matches
Many feature descriptors have been proposed to achieve invariance to scale, rotation, lighting, etc.
Could probably be its own talk...
Compute histograms of gradients and rotate everything into a consistent frame. Slow, but first "effective" descriptor.
Approximate gradient responses with fast integer filters.
Can precompute an integral image and compute
filter response in O(1) time.
Faster and works about as well as SIFT.
Make a bunch of pseudo-random pixel comparisons and create a big binary vector.
SIFT/SURF is a very big descriptor (128 or 64 floats typically). BRIEF descriptor gets comparable performance with 16 bytes.
Also much faster to compare two features.
Many chips have a built-in XOR + bitcount instruction.
2010 - BRIEF descriptor was first on the scene, but did not handle rotation invariance.
2011 - ORB (Oriented FAST and Rotated BRIEF) showed how to handle rotation invariance and might be considered state-of-the-art in realtime-ish descriptors.
2011 - BRISK (Binary Robust Invariant Scalable Keypoints) adds scale invariance by considering pyramid of images at different scales. Probably the best descriptor out right now carefully tested.
2012 - FREAK (Fast Retina Keypoint) uses a creative sampling pattern loosely based off spatial encoding in the human retina.
2013+ Attention has turned toward machine learning approaches to feature descriptors, but no real success yet.
In other words, come up with a clever acronym and design yet another feature descriptor.
Auto-differentiation: take all instances of float, and replace them with type Jet that tracks the differentiation via chain rule.
open Dropbox/2015_01_10_Stadhuis/
By Michael Holroyd
Brief introduction to the concepts behind the structure-from-algorithm for finding camera pose and sparse 3D geometry.