Image recognition and camera positioning with OpenCV.

A tourist guide application

Francesco Nazzaro

 

 

f.nazzaro@bopen.eu

Image recognition issues:

  • We have to implement a human ability!
  • The images to compare can be distorted and oriented in different ways!
  • We have to detect individual features.
  • The detection must be scale invariant.

Google Images: why not?


Google Goggles doesn't have a public API!


Feature detection

[Figure: three example patches, A - flat surface, B - edge, C - corner]

  • Corners have an orientation!
  • We need a corner detection algorithm!
  • The algorithm must be scale invariant!

Scale-Invariant Feature Transform (SIFT) algorithm by Lowe

  1. A Difference of Gaussians (DoG) is passed over the image to detect scale-invariant blobs; blobs are the extrema of the DoG distribution (see the sketch below).
  2. An algorithm like Harris' corner detector is used to leave out edges and keep corners.
  3. An orientation is assigned to each feature to achieve rotation invariance.
  4. For each keypoint a descriptor vector is created.
  5. Keypoints are matched through a nearest-neighbour algorithm.

David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, 60(2), November 2004, pp. 91-110.
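As a rough illustration of step 1, a DoG response can be computed by blurring at two nearby scales and subtracting (a minimal sketch; the file name and sigma values are illustrative, not from the talk):

import cv2
import numpy as np

# Difference of Gaussians: blur at two nearby scales and subtract
image = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)
blur_fine = cv2.GaussianBlur(image, (0, 0), 1.6)
blur_coarse = cv2.GaussianBlur(image, (0, 0), 1.6 * 2 ** 0.5)
dog = blur_coarse - blur_fine  # candidate blobs are the extrema of this map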

Image recognition with OpenCV and IPython Notebook

%pylab inline
import cv2

Import a black-and-white image (the file name is illustrative):

image = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)
imshow(image, cmap='gray')

Extract keypoints and descriptors

sift = cv2.SIFT()  # OpenCV 2.4 API; in OpenCV 3 this is cv2.xfeatures2d.SIFT_create()
key_points, descriptors = sift.detectAndCompute(image, None)
imshow(cv2.drawKeypoints(image, key_points))

Same process for library image

key_points_lib, descriptors_lib = sift.detectAndCompute(lib_image, None)
imshow(cv2.drawKeypoints(lib_image, key_points_lib))

Match descriptors

FLANN_INDEX_KDTREE = 1  # typical FLANN kd-tree settings
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)

flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(descriptors, descriptors_lib, k=2)
# Lowe's ratio test: keep a match only if it is clearly better than the runner-up
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
imshow(cv2.drawMatches(image, key_points, lib_image, key_points_lib, good, None))

Results:

  • SIFT algorithm: recognized picture, 156 matches; not recognized, 35 matches.
  • SURF algorithm: recognized picture, 93 matches; not recognized, 21 matches.
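The SURF comparison can be run through the same feature-detector interface (a minimal sketch using the OpenCV 2.4 API; the Hessian threshold of 400 is an illustrative choice, not the talk's setting):

surf = cv2.SURF(400)  # Hessian threshold, illustrative value
key_points, descriptors = surf.detectAndCompute(image, None)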

Pose estimation

Pose estimation with homography

We have two cameras a and b, looking at a point P on a plane. The projections of P in a and b are respectively p^a and p^b:

p^a = K_a \cdot M_{ba} \cdot K_b^{-1} \cdot p^b

where the homography matrix M_{ba} is

M_{ba} = R - \frac{t n^T}{d}

  • R is the rotation matrix between a and b.
  • t is the translation vector.
  • n and d are the normal vector of the plane and the distance to the plane respectively.
  • K_a and K_b are the cameras' intrinsic parameter matrices.
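As a side note not shown in the talk, OpenCV 3 can decompose such a homography directly; a minimal sketch, assuming M is the estimated homography and mtx the intrinsic matrix from the calibration step:

# Returns up to four candidate (R, t, n) solutions; physical constraints
# (e.g. points must lie in front of both cameras) select the right one
n_solutions, rotations, translations, normals = cv2.decomposeHomographyMat(M, mtx)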

We have to compute the camera intrinsic parameter matrices.

Calibration is performed through the chessboard method: we photograph a chessboard from different angles, search for its corners, and force them to lie on straight lines by estimating the lens distortion.

We can compute the intrinsic parameters of the camera with the OpenCV functions cv2.findChessboardCorners and cv2.calibrateCamera; starting from these parameters we can apply a warp to the image, as in the sketch below.
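A minimal calibration sketch (the board size and file names are illustrative assumptions, not the talk's setup):

import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners per chessboard row and column (illustrative)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []  # 3D corners and their detected 2D projections
for name in glob.glob('chessboard_*.jpg'):  # hypothetical file names
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# mtx: intrinsic parameter matrix, dist: distortion coefficients
_, mtx, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
undistorted = cv2.undistort(image, mtx, dist)  # apply the warp to an image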

Locating the picture in the image

image_matches = np.array([key_points[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
lib_matches = np.array([key_points_lib[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Homography mapping library-picture coordinates to photo coordinates;
# RANSAC with a 5-pixel reprojection threshold discards outlier matches
M, _ = cv2.findHomography(lib_matches, image_matches, cv2.RANSAC, 5.0)

# Project the library picture's corners into the photographed image
h, w = lib_image.shape
pic_pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
distorted = cv2.perspectiveTransform(pic_pts, M)

plot(distorted[:, 0, 0], distorted[:, 0, 1], marker='.', c='red')
draw_params = dict(matchColor=(0, 255, 0), flags=2)  # typical drawing options
imshow(cv2.drawMatches(image, key_points, lib_image, key_points_lib, good, None, **draw_params))

Now we can compute the position of the picture in the image.

When the projected outline lands in the wrong place, the picture positioning is wrong: a false positive!
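One possible sanity check, an assumption of ours rather than the talk's method, is to reject matches whose projected outline is not a convex quadrilateral:

# Reject the match if the projected picture outline is not convex
if not cv2.isContourConvex(distorted.astype(np.float32)):
    print('False positive: discarding this match')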

# Real-world size of the picture: hsize is its known height (the value here is
# illustrative; any consistent unit works), wsize follows from the aspect ratio
hsize = 1.0
wsize = hsize / h * w
# 3D coordinates of the picture's corners, in the same order as 'distorted'
objp = np.array(
    ((0., 0., 0.), (0., hsize, 0.), (wsize, hsize, 0.), (wsize, 0., 0.)), dtype=np.float32
).reshape(4, 1, 3)
_, rvecs, tvecs = cv2.solvePnP(objp, distorted, mtx, dist)
R, _ = cv2.Rodrigues(rvecs)  # rotation vector to rotation matrix
translation = R.T.dot(-tvecs)  # camera position in the picture's reference frame

To compute the position of the observer we have to extract rotation and translation from the homography

M_{ba} = R - \frac{t n^T}{d}

mtx and dist are the camera intrinsic parameter matrix and distortion coefficients obtained from the calibration step.

3D

Let's start with a picture of the Arch of Constantine, taken from the right.

Image recognition doesn't work: there are too many differences between the images.

We need 3 images in the library, taken from the left, the right and the center. The algorithm then recognizes the right-hand one.

We used image recognition in a Google Glass application.

It is a tourist guide application for Google Glass. It plays media content based on your location.

Co-financed by POR/FESR Regione Lazio 2007-2013, Asse I, Avviso Pubblico "Insieme x Vincere".

Image recognition is used for:

  • advanced localization
  • presenting additional information about what you are looking at

Thank you!

Francesco Nazzaro

 

 

f.nazzaro@bopen.eu

Any questions?

Image recognition and camera positioning with OpenCV. A tourist guide application - Europython 2015

By Francesco Nazzaro


OpenCV Python bindings provide several ready-to-use tools for camera calibration, image recognition and camera position estimation. This talk shows how to recognize a picture from a library of known paintings, and how to compute the camera position with respect to the recognized picture using OpenCV and numpy. This is applied to a tourist guide application for Google Glass through the recognition of the paintings exhibited in a museum.
