Thinking Slowly

Suppose I wanted to teach a robot to sort a deck of cards. One approach would be to come up with an algorithm for sorting cards and then tell the robot to follow the recipe. A good sorting recipe should work on piles of any size, stop when it is done, and complete the task efficiently.

 

But would it be possible to train a robot to do this task just by rewarding it for succeeding, without ever telling it what the task is or HOW it should carry it out?

 

To keep things simple, let's imagine a tiny world consisting of two cards and four "cells": two where the cards are laid out and two where the cards can be moved in order to exchange places. Each card can sit in any of the four cells, so there are 4 x 4 = 16 arrangements: 12 with the cards in different cells and four where they end up in the same cell ("collisions").

[Figure: the 12 arrangements with cards A and B in different cells, shown with the arm up and the arm down, followed by the four "AB" collisions in which both cards share a cell.]

So we have 16 card states and 8 arm states (the arm over one of the four cells, either up or down). Combined, that gives 16 x 8 = 128 states.
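The counts above can be checked by enumerating the states directly. This is only a sketch: numbering the cells 0 to 3 and describing an arm state as a (cell, height) pair are assumptions made for illustration, not part of the original setup.

```python
from itertools import product

CELLS = range(4)               # four cells, numbered 0..3 (numbering is an assumption)
ARM_HEIGHTS = ("up", "down")   # the arm either hovers or touches the table

# A card state is a pair (cell of card A, cell of card B).
card_states = list(product(CELLS, repeat=2))
separate = [(a, b) for a, b in card_states if a != b]
collisions = [(a, b) for a, b in card_states if a == b]

# An arm state is a pair (cell the arm is over, arm height).
arm_states = list(product(CELLS, ARM_HEIGHTS))

# A full state combines one card state with one arm state.
all_states = list(product(card_states, arm_states))

print(len(separate), len(collisions), len(card_states))  # 12 4 16
print(len(arm_states), len(all_states))                  # 8 128
```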

One is the start state and one is the end state.

The system moves from state to state via one of 6 possible actions: UP, DOWN, EAST, WEST, NORTH, SOUTH.

Start: B A

End: A B

One sequence of actions that takes the system from start to end:

down, north, up, east, south, down, west, up, north, down, east, south, up
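The walk from "B A" to "A B" can be replayed with a small simulator. This is a sketch under several assumptions that the slides do not spell out: the four cells form a 2 x 2 grid with the spare cells in the northern row, the arm starts hovering over the left-hand card, a lowered arm drags whatever card it is touching, and a move that would leave the grid does nothing. The names `step` and `run` are hypothetical.

```python
# Grid coordinates are (row, col); row 0 is the northern (spare) row.
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}

def step(state, action):
    """Return the next state after one action (assumed semantics)."""
    cards, arm, is_down = state
    if action == "up":
        return cards, arm, False
    if action == "down":
        return cards, arm, True
    dr, dc = MOVES[action]
    target = (arm[0] + dr, arm[1] + dc)
    if not (0 <= target[0] < 2 and 0 <= target[1] < 2):
        return state  # assumption: moving off the 2 x 2 grid does nothing
    if is_down:
        # assumption: a lowered arm drags any card in the cell it is touching
        cards = tuple(sorted(
            (name, target if pos == arm else pos) for name, pos in cards
        ))
    return cards, target, is_down

def run(state, actions):
    """Apply a list of actions in order and return the final state."""
    for action in actions:
        state = step(state, action)
    return state

# Start: B on the left, A on the right, arm hovering over B.
start = ((("A", (1, 1)), ("B", (1, 0))), (1, 0), False)
actions = "down north up east south down west up north down east south up".split()

cards, arm, is_down = run(start, actions)
print(dict(cards))  # {'A': (1, 0), 'B': (1, 1)}
```

Under these assumptions the thirteen actions leave A on the left and B on the right with the arm raised, which matches the end state.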

Instruct a Robot to Find the Largest Card

The robot can only:

Point at the cards as card1, card2, etc.

Read the value of a card.

Remember things.

Compare things.
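With only those four abilities, one possible recipe is a single pass that remembers the best card seen so far. The sketch below assumes the pile is non-empty and represents the cards as a Python list; the function name `find_largest` is hypothetical.

```python
def find_largest(cards):
    """Return the 1-based position and the value of the largest card."""
    best_position = 1                      # point at card1
    best_value = cards[0]                  # read its value and remember it
    for position in range(2, len(cards) + 1):   # point at card2, card3, ...
        value = cards[position - 1]        # read the value of this card
        if value > best_value:             # compare with the remembered value
            best_position = position       # remember the new largest card
            best_value = value
    return best_position, best_value

print(find_largest([7, 3, 10, 4]))  # (3, 10)
```

Each card is read exactly once, so the recipe works on any pile size and stops when it reaches the last card.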

"Teach" a Robot to Sort Cards

A robot arm hovers above a table. It can move in four directions (north, south, east, west), and it can go down to touch the table or go up to hover above it. The six moves are labeled DOWN, UP, EAST, WEST, NORTH, and SOUTH.

The arm can go to its "HOME" position, and it can COMPARE the card it is hovering over with the card to its right ("A>B?").

[Animation: the arm starts from HOME and repeatedly asks "A>B?", getting a YES or NO answer each time.]
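One way HOME and COMPARE could be combined into a sorting recipe is a bubble sort: return to HOME, step east one card at a time, and exchange a pair whenever "A>B?" comes back YES. This is a sketch of that familiar technique, not necessarily the procedure the animation shows. It assumes HOME is the leftmost card and that a YES answer is followed by swapping the two cards; the function name `sort_cards` is hypothetical.

```python
def sort_cards(cards):
    """Return a sorted copy of the row using HOME, EAST, COMPARE, and swap."""
    cards = list(cards)
    swapped = True
    while swapped:                 # keep making passes until nothing moves
        swapped = False
        position = 0               # HOME (assumed to be the leftmost card)
        while position + 1 < len(cards):
            if cards[position] > cards[position + 1]:   # COMPARE: "A>B?"
                # YES: exchange the two cards (assumed to be done by arm moves)
                cards[position], cards[position + 1] = (
                    cards[position + 1], cards[position]
                )
                swapped = True
            position += 1          # EAST: hover over the next card
    return cards

print(sort_cards([4, 2, 9, 1]))  # [1, 2, 4, 9]
```

The recipe stops after the first pass in which every comparison answers NO, which is how it knows the row is sorted.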

Built up one move at a time, the full sequence of moves is:

D, N, E, U, S, D, W, U, N, W, D, S, U

Thinking Slowly: Robot Card Sorter

By Dan Ryan

How learning to think slowly is the first step in computational reasoning.
