I Believe I Can Flap

Raúl G. Roa Gómez

How to play Flappy Bird and never* lose

Disclaimer

Who's this guy?

Lead Software Architect

WE MAKE COMPUTERS DO AMAZING THINGS...

PUT STUFF WHERE IT BELONGS

MAKE THEM UNDERSTAND

I DO MORE

Business & Technical Consultant at ITSS GLOBAL
Banking technology solutions and services to banks and financial institutions

Technical Adviser & Business partner at Digital Reality
Virtual Reality and augmented reality for developing countries.

 

Technical Adviser at Yoyo

Payment gateway for unbanked individuals

 

Minor OSS contributor

DevIL, ResIL, fog, emscripten

 

A LITTLE BIT MORE...

Full Stack Developer (10+ years)
C/C++, Python, Rust, MCP, MCAD and MCSD

2D/3D Game Developer (5+ years)
C/C++, UnrealScript, Lua, GLSL, HLSL, Unreal Engine, Unity


Adjunct Lecturer PUCMM (RSTA, STI) (7 years)
Software Engineering, Programming, Data Structures
 

DCGames (2003 ― 2005)
Game performance metrics for CS 1.6, Unreal Tournament, Quake 3 and Warcraft III: The Frozen Throne. Biggest video-game-related site in the Caribbean. Acquired by Verizon in 2005.

 

What's this talk about?

INTELLIGENCE
/ɪnˈtɛlɪdʒ(ə)ns/

THE ABILITY TO ACQUIRE AND APPLY KNOWLEDGE AND SKILLS.

COMPUTERS

+

INTELLIGENCE

=
ARTIFICIAL INTELLIGENCE

WE ARE ADDING MORE COWBELL

MACHINE LEARNING

Machine learning (ML) is one implementation of artificial intelligence (AI). It is a statistical, data-driven approach to creating AI.

WE WILL DEMONSTRATE THE LATTER BY ANSWERING ONE IMPORTANT QUESTION

to flap
or
to not flap?

But how?

REINFORCEMENT LEARNING

is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.

van Otterlo, M.; Wiering, M. (2012)
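
To make the definition concrete, here is a minimal sketch of an agent acting in an environment and collecting a cumulative reward. Everything in it is an assumption made for illustration: the five-cell corridor, the `step`, `run_episode` and `discounted_return` helpers, and the discount factor are hypothetical and not taken from the talk.

```python
GOAL = 4  # hypothetical corridor with positions 0..4; the goal is the last cell


def step(position, action):
    """Environment: apply an action (-1 or +1) and return (new_position, reward)."""
    new_position = max(0, min(GOAL, position + action))
    reward = 1.0 if new_position == GOAL else 0.0
    return new_position, reward


def run_episode(policy, start=0, max_steps=20):
    """Agent-environment loop: observe the state, act, collect the reward."""
    position, rewards = start, []
    for _ in range(max_steps):
        action = policy(position)
        position, reward = step(position, action)
        rewards.append(reward)
        if position == GOAL:
            break
    return rewards


def discounted_return(rewards, gamma=0.8):
    """Cumulative reward, with later rewards weighted down by gamma per step."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def always_right(position):
    """A trivial policy: always move towards the goal."""
    return +1
```

Calling `discounted_return(run_episode(always_right))` gives the quantity the agent is trying to maximize; a learning algorithm would search for the policy that makes it as large as possible.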

Specifically...

Q-LEARNING

a technique that evaluates which action to take based on an action-value function that determines the value of being in a certain state and taking a certain action in that state.
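
As a rough sketch of what an action-value function looks like in code, the table below maps (state, action) pairs to values and picks the best action for a state. The state encoding (horizontal distance and vertical offset to the next gap), the action names and the numbers are assumptions for illustration only; the talk does not specify them.

```python
from collections import defaultdict

ACTIONS = ("flap", "no_flap")

# Q maps (state, action) -> estimated value; unseen pairs default to 0.0.
Q = defaultdict(float)


def best_action(q_table, state):
    """Return the action with the highest Q-value in this state.

    On a tie, the first action in ACTIONS wins.
    """
    return max(ACTIONS, key=lambda action: q_table[(state, action)])


# Hypothetical state: (horizontal distance, vertical offset) to the next gap.
state = (3, -1)
Q[(state, "flap")] = 2.5      # made-up value: flapping looks good here
Q[(state, "no_flap")] = -1.0  # made-up value: not flapping looks bad here
```

With these made-up values, `best_action(Q, state)` answers the "to flap or to not flap?" question for that one state; learning consists of filling in the table from experience.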

http://mnemstudio.org/path-finding-q-learning-tutorial.htm

Q(1, 5) = R(1, 5) + 0.8 * Max[Q(5, 1), Q(5, 4), Q(5, 5)] = 100 + 0.8 * 0 = 100
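
The worked update for Q(1, 5) can be reproduced in a few lines. This sketch uses only the numbers on the slide (reward 100, discount 0.8, and all Q-values of state 5 still at zero); the function name `q_update` is a hypothetical helper.

```python
GAMMA = 0.8  # discount factor used on the slide


def q_update(reward, next_q_values, gamma=GAMMA):
    """Q(state, action) = R(state, action) + gamma * max(Q(next state, all actions))."""
    return reward + gamma * max(next_q_values)


# R(1, 5) = 100, and Q(5, 1), Q(5, 4), Q(5, 5) are all 0 at the start.
q_1_5 = q_update(100, [0, 0, 0])
print(q_1_5)  # 100.0
```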

RINSE AND REPEAT

UPDATED Q-MATRIX
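
"Rinse and repeat" might look like the sketch below: apply the same update over many random episodes until the Q-matrix settles. The reward matrix `R` is an assumption, reconstructed from the six-room example in the linked tutorial and not shown in these slides; the episode count, the seed and the helper names are likewise hypothetical.

```python
import random

GAMMA = 0.8
GOAL = 5

# Assumed reward matrix R[state][action]:
# -1 = no door, 0 = door, 100 = door leading into the goal room.
R = [
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
]


def valid_actions(state):
    """Rooms reachable from `state` (entries of R that are not -1)."""
    return [a for a, r in enumerate(R[state]) if r >= 0]


def train(episodes=1000, seed=0):
    """Fill the Q-matrix by exploring randomly from random start rooms."""
    rng = random.Random(seed)
    Q = [[0.0] * 6 for _ in range(6)]
    for _ in range(episodes):
        state = rng.randrange(6)
        while True:
            action = rng.choice(valid_actions(state))
            # Taking a door moves the agent into room `action`.
            best_next = max(Q[action][a] for a in valid_actions(action))
            Q[state][action] = R[state][action] + GAMMA * best_next
            state = action
            if state == GOAL:
                break
    return Q


def greedy_path(Q, start, max_steps=10):
    """Follow the highest Q-value from `start` until the goal room."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        state = path[-1]
        path.append(max(valid_actions(state), key=lambda a: Q[state][a]))
    return path
```

After `Q = train()`, calling `greedy_path(Q, 2)` reads a route to the goal straight out of the learned matrix, which is all the agent needs once training is done.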

eeeeeeeeh?

Please show me something!

FURTHER READING

  • Asynchronous Methods for Deep Reinforcement Learning
    https://goo.gl/v6QWNY

  • What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?
    https://goo.gl/3H6vss
