Thinking Slowly
Suppose I wanted to teach a robot to sort a deck of cards. One approach would be to come up with an algorithm for sorting cards and then tell the robot to follow the recipe. A good sorting recipe should work on piles of different sizes, stop when it is done, and complete the task efficiently.
But would it be possible to train a robot to do this task just by giving it a reward for succeeding, without ever telling it what the task is or how to carry it out?
To keep things simple, let's imagine a very small world consisting of two cards and four "cells": two where the cards are laid out and two where the cards can be moved in order to exchange places. There are 12 arrangements with the cards in different cells and four where they end up in the same cell ("collisions").
[Figure: grids showing the arrangements of cards A and B in the four cells, with panels labeled "Arm Up", "Arm Down", and "Collisions".]
So we have 16 card states and 8 arm states (the arm over any of the four cells, either up or down). Combined, that gives 16 x 8 = 128 states.
One of these is the start state and one is the end state.
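The counts above can be checked by brute-force enumeration. This is a minimal sketch; numbering the cells 0 to 3 and the variable names are assumptions made for illustration only.

```python
from itertools import product

CELLS = range(4)                 # four cells, numbered 0..3 (assumed numbering)
ARM_HEIGHTS = ("up", "down")

# Each card can sit in any of the four cells: (cell of A, cell of B).
card_states = list(product(CELLS, CELLS))
distinct = [s for s in card_states if s[0] != s[1]]     # cards in different cells
collisions = [s for s in card_states if s[0] == s[1]]   # cards in the same cell

# The arm is over one of the four cells and is either up or down.
arm_states = list(product(CELLS, ARM_HEIGHTS))

# A full state of the world pairs a card state with an arm state.
all_states = list(product(card_states, arm_states))

print(len(distinct), len(collisions))                       # 12 4
print(len(card_states), len(arm_states), len(all_states))   # 16 8 128
```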
The system moves from state to state via one of 6 possible actions:
UP
DOWN
EAST
WEST
NORTH
SOUTH
[Figure: the start state (B | A) and the end state (A | B), followed by a step-by-step path from one to the other: down, north, up, east, south, down, west, up, north, down, east, south, up.]
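The six actions can be modeled as a small state-transition function. The sketch below rests on several assumptions that the slides do not spell out: the four cells form a 2x2 grid with the cards starting on the south row, the arm starts raised over the left-hand card, a lowered arm drags any card in its cell when it moves, and a move off the grid does nothing. Under those assumptions, the thirteen-step path in the figure does turn B | A into A | B.

```python
# Cells are (row, col) on a 2x2 grid; row 0 is the spare (north) row and
# row 1 is the row where the cards start. This layout is an assumption.
MOVES = {"NORTH": (-1, 0), "SOUTH": (1, 0), "EAST": (0, 1), "WEST": (0, -1)}

def step(state, action):
    """Return the state reached from `state` by one action."""
    card_a, card_b, arm, arm_down = state
    if action == "DOWN":
        return (card_a, card_b, arm, True)
    if action == "UP":
        return (card_a, card_b, arm, False)
    d_row, d_col = MOVES[action]
    new_arm = (arm[0] + d_row, arm[1] + d_col)
    if not (0 <= new_arm[0] < 2 and 0 <= new_arm[1] < 2):
        return state                      # assumed: off-grid moves are ignored
    if arm_down:                          # assumed: a lowered arm drags cards
        if card_a == arm:
            card_a = new_arm
        if card_b == arm:
            card_b = new_arm
    return (card_a, card_b, new_arm, arm_down)

# Start: B | A on the south row, arm raised over the left cell (assumed).
start = ((1, 1), (1, 0), (1, 0), False)   # (cell of A, cell of B, arm, arm_down)

path = ["DOWN", "NORTH", "UP", "EAST", "SOUTH", "DOWN", "WEST",
        "UP", "NORTH", "DOWN", "EAST", "SOUTH", "UP"]

final = start
for action in path:
    final = step(final, action)

print(final[0], final[1])   # (1, 0) (1, 1)  -> A on the left, B on the right
```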
Instruct a Robot to Find the Largest Card
The robot can only:
Point at the cards as card1, card2, etc.
Read the value of a card.
Remember things.
Compare things.
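A recipe built from just these four abilities might look like the sketch below. The function name and the use of a Python list to stand in for the row of cards are assumptions for illustration; each line is annotated with the ability it uses.

```python
def find_largest(cards):
    """Return (position, value) of the largest card in a non-empty row."""
    if not cards:
        raise ValueError("there are no cards to look at")
    best_position = 0                  # point at card1 and remember where it is
    best_value = cards[0]              # read its value and remember it
    for position in range(1, len(cards)):   # point at card2, card3, ...
        value = cards[position]        # read the value of this card
        if value > best_value:         # compare it with the remembered value
            best_position = position   # remember the new position
            best_value = value         # remember the new largest value
    return best_position, best_value

print(find_largest([7, 2, 11, 5]))     # (2, 11)
```

Positions here count from 0, so position 2 is the third card.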
"Teach" a Robot to Sort Cards
A robot arm hovers above a table. It can move in four directions (NORTH, SOUTH, EAST, WEST), and it can go DOWN to touch the table or UP to hover above it.
The arm can also return to its "HOME" position.
It can also COMPARE the card it is hovering over with the card to its right (A>B?).
[Animation frames: starting from HOME, the arm asks "A>B?" several times and gets YES or NO answers.]
The action sequence, built up one step at a time (D = DOWN, N = NORTH, E = EAST, U = UP, S = SOUTH, W = WEST):
D, N, E, U, S, D, W, U, N, W, D, S, U
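The COMPARE test and a card-exchanging action sequence are enough to build a complete sorting recipe. The slides do not name an algorithm; the sketch below swaps in bubble sort as an assumption, because it needs nothing more than "is this card larger than the one to its right?" The helper names `compare_right` and `swap_right` are hypothetical, and `swap_right` stands in for whatever sequence of arm moves exchanges two neighbouring cards.

```python
def sort_cards(cards):
    """Sort a row of cards using only HOME, EAST, COMPARE, and a swap."""
    cards = list(cards)               # work on a copy of the row

    def compare_right(i):
        # COMPARE: is the card under the arm larger than the card to its right?
        return cards[i] > cards[i + 1]

    def swap_right(i):
        # Stand-in for the arm moves that exchange two neighbouring cards.
        cards[i], cards[i + 1] = cards[i + 1], cards[i]

    swapped = True
    while swapped:                    # repeat passes until one makes no swap
        swapped = False
        position = 0                  # HOME: return to the leftmost card
        while position < len(cards) - 1:
            if compare_right(position):   # A>B? YES
                swap_right(position)
                swapped = True
            position += 1             # EAST: step to the next card
    return cards

print(sort_cards([3, 1, 2]))          # [1, 2, 3]
```

The recipe works on rows of any length and stops after the first pass that makes no swaps, which covers two of the three properties asked of a good sorting recipe; bubble sort is not the most efficient choice for large decks.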
Thinking Slowly: Robot Card Sorter
By Dan Ryan
How learning to think slowly is the first step in computational reasoning.