3D Object Recognition By Using Google Tango Project And Creating Virtual World
Virtual Cafe

Chatchawan Yoojuie (Wan)
Natthakul Boonmee (Benz)
Kanin Kunapermsiri (Top)

Dr. Kwankamol Nongpong
Senior Project
Semester 2/2016
Introduction

Tango enables applications such as:
- An indoor navigator (without using GPS)
- Accurate measurement tools
- Augmented reality games

Goal
- 3D object detection
- Creating a realistic augmented reality application

Note: The environment and the detected objects will be unmovable.
Tools

- Google Tango platform: for capturing images and position
- PCL (Point Cloud Library): for 3D image processing
- Unity: for rendering the virtual world and virtual objects
Point cloud (supported by Google Tango)
- A point cloud is a set of points in a 3D coordinate system
- It is represented as a 3D image, and each point contains x, y and z values

Unity
- A game engine which provides all the necessary tools
- Coding in C#
- Used for rendering 3D objects and the virtual world

Device
- An Android phone that supports the Google Tango platform
- Equipped with an IR sensor for capturing point clouds
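The point clouds above are stored on disk in PCL's .PCD format later in the project. As an illustration, here is a minimal sketch of serializing x, y, z points into the ASCII variant of that format (the helper name is ours, not part of the project):

```python
def ascii_pcd(points):
    """Serialize (x, y, z) tuples into a minimal ASCII .PCD (v0.7) string."""
    n = len(points)
    header = [
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z",            # each point stores x, y and z values
        "SIZE 4 4 4",              # 4-byte fields
        "TYPE F F F",              # floats
        "COUNT 1 1 1",
        "WIDTH %d" % n,
        "HEIGHT 1",                # height 1 = unorganized cloud
        "VIEWPOINT 0 0 0 1 0 0 0",
        "POINTS %d" % n,
        "DATA ascii",
    ]
    body = ["%g %g %g" % p for p in points]
    return "\n".join(header + body) + "\n"
```

Writing this string to a file with a `.pcd` extension yields a cloud that PCL tools can load.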
Framework

The design framework can be divided into two parts:
1. Area Mapping (Creating a room)
2. Training Dataset and Object Recognition
Area learning:
- Make the application remember the room by scanning around it
- Save the result into an ADF file
Room measurement:
- Measure the actual room size by marking the corners and computing the distances between them
- Save the result into an XML file
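The corner-marking step can be sketched as follows. The XML layout shown here is an assumption (the slides do not specify the actual schema); the distances between consecutive corners give the wall lengths:

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical layout for the saved corner file; the tag and attribute
# names are assumptions for illustration only.
CORNERS_XML = """
<room>
  <corner x="0.0" y="0.0" z="0.0"/>
  <corner x="4.0" y="0.0" z="0.0"/>
  <corner x="4.0" y="0.0" z="3.0"/>
  <corner x="0.0" y="0.0" z="3.0"/>
</room>
"""

def load_corners(xml_text):
    """Read the marked corner positions back from the XML text."""
    root = ET.fromstring(xml_text)
    return [(float(c.get("x")), float(c.get("y")), float(c.get("z")))
            for c in root.iter("corner")]

def wall_lengths(corners):
    """Distance between consecutive corners, closing the loop at the end."""
    return [math.dist(a, b)
            for a, b in zip(corners, corners[1:] + corners[:1])]
```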
Rendering the room:
- Load the ADF file saved in the area learning part, along with the XML file that contains the vertices representing the corners of the room
- Use Unity to render this data into a virtual room
Part 2: Training Dataset and Object Recognition
Training and recognition steps:
1. Create the datasets that will be used in the matching step
2. Recognize the object against the dataset
3. Estimate the 6DoF pose of the detected object
4. Display the object model in Unity

The centroid is the point obtained by calculating the mean value of all points in the cloud; it is the "center of mass" of the cloud.
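The centroid computation described above is just a mean over the points; a minimal sketch (equivalent in spirit to PCL's `compute3DCentroid`):

```python
def centroid(points):
    """Center of mass of a point cloud: the mean of x, y and z over all points."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    return (sx / n, sy / n, sz / n)
```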
Six degrees of freedom (6DoF) refers to the pose of an object in 3D space, described by a translation and a rotation.
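A 6DoF pose can thus be represented as a translation vector plus a rotation quaternion. The sketch below applies such a pose to a point; the `(w, x, y, z)` quaternion layout is an assumption for illustration:

```python
def cross(a, b):
    """3D cross product of two vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z).
    Uses the identity v' = v + 2 u x (u x v + w v), with u = (x, y, z)."""
    w, u = q[0], q[1:]
    t = cross(u, v)
    t = (t[0] + w*v[0], t[1] + w*v[1], t[2] + w*v[2])
    d = cross(u, t)
    return (v[0] + 2*d[0], v[1] + 2*d[1], v[2] + 2*d[2])

def apply_pose(translation, rotation, point):
    """A 6DoF pose = rotation (quaternion) followed by translation (vector)."""
    r = quat_rotate(rotation, point)
    return (r[0] + translation[0], r[1] + translation[1], r[2] + translation[2])
```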
The ground truth records information about the object in the Unity coordinate system. It contains:
1. Translation of the device (vector format)
2. Rotation of the device (quaternion format)
3. Translation of the object (vector format)
4. Rotation of the object (quaternion format)

Example ground-truth file:
0 0 0                          (device translation)
-0.259 0.001 -0.004 -0.966     (device rotation)
-0.027 -0.082 0.0754           (object translation)
0 0.966 -0.259 0               (object rotation)
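A ground-truth file in this 4-line layout (device translation, device rotation, object translation, object rotation) can be read with a small parser; the function and key names are ours:

```python
def parse_ground_truth(text):
    """Parse a 4-line ground-truth file: device translation (3 values),
    device rotation (4), object translation (3), object rotation (4)."""
    rows = [tuple(float(v) for v in line.split())
            for line in text.strip().splitlines()]
    dt, dr, ot, orot = rows
    assert len(dt) == 3 and len(dr) == 4 and len(ot) == 3 and len(orot) == 4
    return {"device_translation": dt, "device_rotation": dr,
            "object_translation": ot, "object_rotation": orot}
```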
A descriptor is the result of feature extraction: it encodes information about the point cloud. Basically, there are two types of descriptors in PCL:
1. Local - computed for individual points
2. Global - computed for the whole cluster that represents an object
Descriptor used: VFH (Viewpoint Feature Histogram), a global descriptor.
Capture setup: a tripod with a pan-tilt head.
Structure of the dataset:
1. Object snapshot (.PCD)
2. Descriptor (.PCD)
3. Ground truth (.TXT)
Object Recognition
- Capture a point cloud and send it to the server via a socket
- Then follow the global pipeline
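The capture-and-send step might look like the sketch below. The length-prefixed float32 wire format is an assumption for illustration, not the project's actual protocol:

```python
import socket
import struct

def send_point_cloud(sock, points):
    """Send a point cloud as a count prefix followed by float32 (x, y, z) triples."""
    payload = b"".join(struct.pack("<3f", *p) for p in points)
    sock.sendall(struct.pack("<I", len(points)) + payload)

def recv_point_cloud(sock):
    """Receive a point cloud framed by send_point_cloud."""
    def read_exact(n):
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed")
            buf += chunk
        return buf
    (count,) = struct.unpack("<I", read_exact(4))
    data = read_exact(count * 12)  # 3 float32 values = 12 bytes per point
    return [struct.unpack_from("<3f", data, i * 12) for i in range(count)]
```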
The global pipeline contains four steps.

Step 1 - Segmentation: perform segmentation on the cloud in order to retrieve all possible clusters on the plane surface.
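The segmentation idea can be approximated in a few lines: find the dominant plane with RANSAC and treat the remaining points as candidate object clusters. This is a simplified stand-in for PCL's plane segmentation, not the actual implementation:

```python
import random

def fit_plane(p1, p2, p3):
    """Plane through three points: normal n = (p2-p1) x (p3-p1), d = -n . p1."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    n = (uy*vz - uz*vy, uz*vx - ux*vz, ux*vy - uy*vx)
    d = -(n[0]*p1[0] + n[1]*p1[1] + n[2]*p1[2])
    return n, d

def ransac_plane(points, threshold=0.02, iters=200, seed=0):
    """Tiny RANSAC: return (plane inliers, remaining points).
    The remaining points are the candidate object clusters."""
    rng = random.Random(seed)
    best = ([], [])
    for _ in range(iters):
        n, d = fit_plane(*rng.sample(points, 3))
        norm = (n[0]**2 + n[1]**2 + n[2]**2) ** 0.5
        if norm == 0:
            continue  # sampled points were collinear
        inliers, rest = [], []
        for p in points:
            dist = abs(n[0]*p[0] + n[1]*p[1] + n[2]*p[2] + d) / norm
            (inliers if dist < threshold else rest).append(p)
        if len(inliers) > len(best[0]):
            best = (inliers, rest)
    return best
```

In the real pipeline, PCL's Euclidean cluster extraction would then split the remaining points into individual clusters.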
Step 2 - Description: for every cluster that survived the segmentation step, compute a global descriptor.
Step 3 - Matching: use the descriptors to perform a search for their nearest neighbors in the database.
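The matching step can be sketched as a brute-force nearest-neighbor search over the stored descriptor histograms (PCL would normally use a kd-tree; the distance function here is an illustrative choice):

```python
def chi2(a, b):
    """Chi-squared distance between two descriptor histograms."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def nearest_neighbor(query, database):
    """Return the (name, descriptor) entry whose histogram is closest to the query."""
    return min(database, key=lambda entry: chi2(query, entry[1]))
```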
The output of the global pipeline is sent back to the device in XML format. Some calculation is then needed to extract the pose estimation of the object and display it in Unity.
These are the 5 pieces of information extracted from the output:
(1) DR = Unity ground truth of D: device rotation, in quaternion format
(2) OR = Unity ground truth of D: object rotation, in quaternion format
(3) ICP = ICP transformation from S to D, in matrix format
(4) SC = centroid of S, in vector format
(5) DCO = database centroid offset: the offset between SC (4) and the Unity ground-truth object centroid (6), in vector format
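One of the calculations involved is applying the 4x4 ICP transformation matrix (3) to a point such as the snapshot centroid SC (4); a minimal sketch:

```python
def apply_transform(T, p):
    """Apply a 4x4 homogeneous transformation matrix (e.g. the ICP result)
    to a 3D point, returning the transformed (x, y, z)."""
    x, y, z = p
    return tuple(T[r][0]*x + T[r][1]*y + T[r][2]*z + T[r][3] for r in range(3))
```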
[Diagram: the quantities (1)-(6) shown for the snapshot S and the device D, mapped between the Unity coordinate system and the PCL coordinate system]
Finally, use Unity to render the detected object according to the data extracted above.
Flow
Evaluation

Testing how well the system can recover the correct pose.
A single white rectangular box is used for both training and testing.
[Figure slides: training dataset; testing dataset; total of tested scenes; captured at training]
Dataset 1: captured at 0.9 metre
Dataset 2: captured at 1.5 metres

The performance of dataset 2 drops significantly compared to dataset 1, because:
- The threshold value used for matching is too large
- The quality and detail of the point cloud change with distance

The distance to the object at the training stage has a huge impact on the accuracy of the recognition system. Performance at 0.5 metre is slightly lower than at 1.0 metre.
Challenges

Improvement