Image recognition and camera positioning with OpenCV.
A tourist guide application
Francesco Nazzaro
f.nazzaro@bopen.eu
We have to implement a human ability!
The images to compare can be distorted and oriented in different ways!
We have to detect individual features.
The detection must be scale invariant.
C - corner
David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, Volume 60 Issue 2, November 2004, Pages 91-110
OpenCV and IPython Notebook
%pylab inline
import cv2
imshow(image, cmap='gray')
# cv2.SIFT() in OpenCV 2.4; cv2.SIFT_create() from OpenCV 4.4
sift = cv2.SIFT_create()
key_points, descriptors = sift.detectAndCompute(image, None)
imshow(cv2.drawKeypoints(image, key_points, None))
key_points_lib, descriptors_lib = sift.detectAndCompute(lib_image, None)
imshow(cv2.drawKeypoints(lib_image, key_points_lib, None))
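# Assumed FLANN settings (KD-tree index), as in the OpenCV feature matching tutorial
index_params = dict(algorithm=1, trees=5)  # 1 = FLANN_INDEX_KDTREE
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(descriptors, descriptors_lib, k=2)
# Lowe's ratio test: keep a match only if it is clearly better than the second best
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
cv2.drawMatches(image, key_points, lib_image, key_points_lib, good, None)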
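The ratio test used to build the list of good matches can be tried without OpenCV. The following is a minimal sketch that works on plain (best, second-best) distance pairs; the function name `ratio_test` and the sample distances are hypothetical and only illustrate the filter.

```python
def ratio_test(distance_pairs, ratio=0.7):
    """Return the indices of the pairs that pass Lowe's ratio test.

    Each pair holds the distance to the best and to the second-best
    candidate descriptor for one key point.
    """
    good = []
    for index, (best, second) in enumerate(distance_pairs):
        # Keep the match only if the best candidate is clearly closer
        # than the runner-up; ambiguous matches are discarded.
        if best < ratio * second:
            good.append(index)
    return good


# Hypothetical descriptor distances for four key points.
pairs = [(10.0, 100.0), (50.0, 60.0), (30.0, 45.0), (69.0, 100.0)]
print(ratio_test(pairs))
```

The second pair is rejected because its two candidates are almost equally distant, which is the typical signature of a repeated or non-distinctive feature.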
156 matches
35 matches
93 matches
21 matches
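Assuming the four counts above come from comparing one photo against four candidate library images, recognition reduces to picking the candidate with the most good matches. A minimal sketch follows; the image names and the `min_matches` threshold are assumptions made for illustration.

```python
def best_library_match(match_counts, min_matches=10):
    """Return the library image with the most good matches.

    Returns None when there are no candidates or when even the best
    candidate has fewer than `min_matches` good matches.
    """
    if not match_counts:
        return None
    name = max(match_counts, key=match_counts.get)
    return name if match_counts[name] >= min_matches else None


# Counts taken from the slides; the keys are hypothetical names.
counts = {"picture_a": 156, "picture_b": 35, "picture_c": 93, "picture_d": 21}
print(best_library_match(counts))
```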
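We have two cameras $a$ and $b$, looking at a point $P$ in a plane.
The projections of $P$ in $a$ and $b$ are respectively $p_a$ and $p_b$, related up to scale by $p_a = K_a H_{ba} K_b^{-1} p_b$, with $K_a$ and $K_b$ the intrinsic matrices of the two cameras.
Where the homography matrix is $H_{ba} = R - \frac{t n^T}{d}$, with $R$ and $t$ the rotation and translation between the two cameras, $n$ the normal of the plane and $d$ its distance from the camera.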
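In practice a homography is a 3x3 matrix applied to pixel coordinates written in homogeneous form. The sketch below shows that mapping with NumPy only; the helper `apply_homography` and the example matrix are hypothetical, and the function is meant to mirror what `cv2.perspectiveTransform` does for 2D points.

```python
import numpy as np


def apply_homography(H, points):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    points = np.asarray(points, dtype=float)
    # Append a 1 to every point: (x, y) -> (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous.dot(np.asarray(H, dtype=float).T)
    # Divide by the third coordinate to go back to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical homography: scale by 2, then shift by (10, 5).
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 1.0]])
corners = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(apply_homography(H, corners))
```

The final division is what makes the transformation projective: when the last row of the matrix is not (0, 0, 1), parallel lines in one view no longer stay parallel in the other.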
Calibration is performed with the chessboard method.
Photograph a chessboard from different angles, find its corners, and estimate the distortion that forces them to lie on straight lines.
cv2.findChessboardCorners()
cv2.calibrateCamera()
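The two calls above can be combined roughly as follows. This is a sketch under stated assumptions, not the code used for the talk: the function names, the 9x6 inner-corner pattern and the unit square size are all hypothetical, and the input is assumed to be a list of grayscale chessboard photos.

```python
import numpy as np


def chessboard_object_points(pattern_size, square_size=1.0):
    """3D coordinates of the inner chessboard corners on the plane z = 0."""
    cols, rows = pattern_size
    grid = np.zeros((rows * cols, 3), np.float32)
    grid[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2) * square_size
    return grid


def calibrate(gray_images, pattern_size=(9, 6), square_size=1.0):
    """Estimate the camera matrix and distortion coefficients."""
    import cv2  # imported here so the helper above also works without OpenCV

    object_points, image_points = [], []
    for gray in gray_images:
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            object_points.append(
                chessboard_object_points(pattern_size, square_size))
            image_points.append(corners)
    if not image_points:
        raise ValueError("no chessboard detected in the given images")
    height, width = gray_images[0].shape[:2]
    _, mtx, dist, _, _ = cv2.calibrateCamera(
        object_points, image_points, (width, height), None, None)
    return mtx, dist
```

The same ideal grid is paired with the corners detected in every photo; the calibration then finds the intrinsic parameters that best explain all the views at once.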
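# Matched key point coordinates in the photo (query) and in the library image (train)
image_matches = np.float32([key_points[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
lib_matches = np.float32([key_points_lib[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
# Homography from library image to photo, so the library corners can be projected into the photo
M, _ = cv2.findHomography(lib_matches, image_matches, cv2.RANSAC, 5.0)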
h, w = lib_image.shape
pic_pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
distorted = cv2.perspectiveTransform(pic_pts, M)
plot(distorted[:, 0, 0], distorted[:, 0, 1], marker='.', c='red')
cv2.drawMatches(image, key_points, lib_image, key_points_lib, good, None, **draw_params)
# hsize: real height of the picture, in the unit wanted for the camera position
wsize = float(hsize) / h * w
objp = np.array(
((0., 0., 0.), (0., hsize, 0.), (wsize, hsize, 0.), (wsize, 0., 0.)), dtype=np.float32
).reshape(4, 1, 3)
_, rvecs, tvecs = cv2.solvePnP(objp, distorted, mtx, dist)
R, _ = cv2.Rodrigues(rvecs)
translation = R.T.dot(-tvecs)
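The last two lines turn the pose returned by `cv2.solvePnP` into the camera position in the coordinate system of the picture. The NumPy-only sketch below reproduces that step so it can be checked on a known pose; the helper names are hypothetical, and `rodrigues` is an illustrative stand-in for `cv2.Rodrigues` that returns only the rotation matrix.

```python
import numpy as np


def rodrigues(rvec):
    """Convert a rotation vector (axis * angle) into a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    # Cross-product matrix of the unit rotation axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K.dot(K)


def camera_position(rvec, tvec):
    """Camera centre in object coordinates: C = -R^T t."""
    R = rodrigues(rvec)
    return R.T.dot(-np.asarray(tvec, dtype=float).reshape(3, 1))


# Known pose: camera at (1, 2, 3), rotated by 90 degrees around z.
rvec = np.array([0.0, 0.0, np.pi / 2])
centre = np.array([1.0, 2.0, 3.0])
tvec = -rodrigues(rvec).dot(centre)
print(camera_position(rvec, tvec).ravel())
```

The pose maps object coordinates X to camera coordinates R X + t, so the camera centre is the point that lands on the origin, which gives C = -R^T t.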
mtx and dist are the camera intrinsic parameters.
It is a tourist guide application for Google Glass.
It plays media content based on your location.
Co-financed with the contribution of POR/FESR Regione Lazio
2007 – 2013, Asse I, Avviso Pubblico Insieme x Vincere
Thank you!