Tampa Devs April 2023
David Khourshid · @davidkpiano
stately.ai
The path to generally intelligent software
State machines to AI




Command Palette
Find in UI
Click
Click
Click
Type
Type
Success
Success
ChatGPT & GPT-3
4
Input
Output
Input
Output
Goal
Start
Symbolic AI
Symbols
Solution
Large datasets
Guesswork
No semantics
Real world is complicated
State machine AI
Wander maze
Chase Pac-Man
Run away from Pac-Man
Return to base
Lose Pac-Man
See Pac-Man
Pac-Man eats power pill
Power pill wears off
Pac-Man eats power pill
Eaten by Pac-Man
Reach base
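The ghost behaviour above can be sketched as a small transition table. The state and event names come from the slide labels, but the wiring between them is an assumption based on how Pac-Man ghosts usually behave, since the extracted text does not say which transition connects which states. `ghostTransitions` and `ghostNext` are hypothetical names used only for illustration.

```typescript
// Ghost behaviour as a finite state machine.
// Names follow the slide; which event leads where is an ASSUMPTION.
type GhostState = "wanderMaze" | "chasePacMan" | "runAway" | "returnToBase";
type GhostEvent =
  | "SEE_PACMAN"
  | "LOSE_PACMAN"
  | "PACMAN_EATS_POWER_PILL"
  | "POWER_PILL_WEARS_OFF"
  | "EATEN_BY_PACMAN"
  | "REACH_BASE";

const ghostTransitions: Record<GhostState, Partial<Record<GhostEvent, GhostState>>> = {
  wanderMaze: { SEE_PACMAN: "chasePacMan", PACMAN_EATS_POWER_PILL: "runAway" },
  chasePacMan: { LOSE_PACMAN: "wanderMaze", PACMAN_EATS_POWER_PILL: "runAway" },
  runAway: { POWER_PILL_WEARS_OFF: "wanderMaze", EATEN_BY_PACMAN: "returnToBase" },
  returnToBase: { REACH_BASE: "wanderMaze" },
};

// Returns the next state; events not handled in the current state are ignored.
function ghostNext(current: GhostState, ev: GhostEvent): GhostState {
  return ghostTransitions[current][ev] ?? current;
}
```

Every behaviour is an explicit state and every change of behaviour is a named event, so the whole "AI" can be read directly off the table.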
Neural networks / Deep learning
Artificial General Intelligence (AGI)
Mental model
Perception of time
Imagination
AGI
Artificial Intelligence
Machine learning
Deep learning
LLMs
Now
Many years away
Agent
Environment
Reward
State
Action
Reinforcement Learning
Environment


Normal mode

Scatter mode
Observes state of environment
Agent

Takes action
Agent



Policy

Some state
Desired state
[Diagram: arrows showing candidate moves from some state toward the desired state]
Reward

How good was the action?
Reward

Value function
Model

Shortcomings
Learning
Planning
Needs lots of trials
Combinatorial state explosion (multidimensionality)
Simulation ≠ real life
There will be consequences
Needs lots of trials
Sparse rewards
Explore vs. exploit
Arbitrary value function
Exploitation
Exploration
Exploration vs. exploitation
Explore 100% = learns nothing!
Exploit 100% = learns nothing!
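One common way to balance the two extremes is epsilon-greedy selection: explore with a small probability and exploit otherwise. The slides do not name this technique, so treat it as an illustrative assumption; `epsilonGreedy` and its parameters are hypothetical names.

```typescript
// Epsilon-greedy action selection (a standard RL technique, NOT taken from the slides).
// `rng` is injectable so the behaviour can be made deterministic.
function epsilonGreedy(
  actionValues: number[],
  epsilon: number,
  rng: () => number = Math.random
): number {
  if (actionValues.length === 0) throw new Error("no actions to choose from");
  if (rng() < epsilon) {
    // Explore: pick any action index uniformly at random.
    return Math.floor(rng() * actionValues.length);
  }
  // Exploit: pick the index with the highest estimated value.
  let best = 0;
  let bestValue = -Infinity;
  actionValues.forEach((value, index) => {
    if (value > bestValue) {
      bestValue = value;
      best = index;
    }
  });
  return best;
}
```

With `epsilon = 0` the agent only exploits and with `epsilon = 1` it only explores. A value in between lets it try new actions while still using what it has learned.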
no reward
Do nothing
Owner has treat
Tells dog "down"
ENVIRONMENT
Nothing changes
ENVIRONMENT
Exploration
reward++
Go down
Get treat
Owner has treat
Tells dog "down"
ENVIRONMENT
Owner praises dog
ENVIRONMENT
Exploration
Go down
Owner has treat
Tells dog "down"
ENVIRONMENT
Owner praises dog
ENVIRONMENT
Exploitation
Treat?
Run away
Owner has treat
Tells dog "down"
ENVIRONMENT
Undesired outcome
ENVIRONMENT
🤬
Exploration
"Down"
Reward 🦴
Sparse rewards
Reward
Policy drives actions
Q-learning
Discount rate
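As a reminder of where the discount rate enters, here is a minimal sketch of the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', a') − Q(s, a)). The table layout and the names `QTable`, `maxQ` and `qLearningUpdate` are assumptions for illustration, as are the state and action keys in the usage note.

```typescript
// Tabular Q-learning update. Unknown states and actions are assumed to have value 0.
type QTable = Record<string, Record<string, number>>;

// Highest action value known for a state, or 0 if the state is unseen.
function maxQ(table: QTable, stateKey: string): number {
  const row = table[stateKey];
  if (row === undefined) return 0;
  let best = -Infinity;
  Object.keys(row).forEach((actionKey) => {
    const value = row[actionKey];
    if (value !== undefined && value > best) best = value;
  });
  return best === -Infinity ? 0 : best;
}

// Moves Q(s, a) toward reward + discountRate * max Q(s', a') and returns the new value.
function qLearningUpdate(
  table: QTable,
  stateKey: string,
  actionKey: string,
  reward: number,
  nextStateKey: string,
  learningRate: number,
  discountRate: number
): number {
  const row = table[stateKey] ?? (table[stateKey] = {});
  const oldValue = row[actionKey] ?? 0;
  const target = reward + discountRate * maxQ(table, nextStateKey);
  const newValue = oldValue + learningRate * (target - oldValue);
  row[actionKey] = newValue;
  return newValue;
}
```

For the dog example, calling `qLearningUpdate(table, "ownerHasTreat", "goDown", 1, "praised", 0.5, 0.9)` on an empty table raises the value of "go down" to 0.5. A discount rate close to 0 makes the agent care mostly about the immediate treat, and one close to 1 makes future rewards count almost as much.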
How can we improve RL?
By using one of the oldest AI techniques*
(state machines)
*in my opinion
Agent
Environment
Reward
State
Action
Environment = modeled as a state machine
State = finite states (grouped by common attributes)
Reward = how well does expected state (from state machine model) reflect actual state?
Action = shortest path to goal state
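One possible reading of the reward definition above is a score for how closely the state predicted by the model matches the state actually observed. The sketch below uses a simple fraction of matching fields. The scoring rule, the `AppSnapshot` shape and the name `modelAgreementReward` are all assumptions, not something the slides specify.

```typescript
// A snapshot of the app: the finite state value plus string-valued context.
type AppSnapshot = { value: string; context: Record<string, string> };

// Reward in [0, 1]: the share of expected fields that the actual state reproduces.
// This scoring rule is an ASSUMPTION made for illustration.
function modelAgreementReward(expected: AppSnapshot, actual: AppSnapshot): number {
  let checks = 1; // the finite state value always counts as one check
  let matches = expected.value === actual.value ? 1 : 0;
  Object.keys(expected.context).forEach((key) => {
    checks += 1;
    if (expected.context[key] === actual.context[key]) matches += 1;
  });
  return matches / checks;
}
```

A reward of 1 means the environment behaved as the state machine predicted. Anything lower signals that the model, or the app, has drifted.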
Making apps intelligent
Input
Output
LLM
Desired goal
Making apps intelligent
Input
Reinforcement Learning
Desired goal
LLM: goal → state

stately.ai/editor
1. Model the environment
{
  state: { ... },                     // Current state
  event: { type: 'someEvent', ... },  // Action (event) to execute
  nextState: { ... }                  // Next state after event
}
1. Model the environment
Given feedback prompt, when I click good, then it takes me to submitted
Given feedback prompt, when I click bad, then it takes me to form
Given form with feedback entered, when I click submit, then it takes me to submitted
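Sentences in this Given/When/Then shape map directly onto `{ state, event, nextState }` records. The sketch below uses a plain regular expression as a stand-in for whatever does the translation in the talk, and it only handles the exact "when I click" phrasing shown above. `ParsedTransition` and `parseGivenWhenThen` are hypothetical names.

```typescript
// One transition of the environment model: current state, event, next state.
interface ParsedTransition {
  state: string;
  event: { type: string };
  nextState: string;
}

// Parses "Given X, when I click Y, then it takes me to Z".
// Returns undefined for any sentence that does not match that pattern.
function parseGivenWhenThen(line: string): ParsedTransition | undefined {
  const match = /^Given (.+), when I click (.+), then it takes me to (.+)$/.exec(line.trim());
  if (match === null) return undefined;
  return { state: match[1], event: { type: match[2] }, nextState: match[3] };
}
```

Parsing the second sentence gives `{ state: 'feedback prompt', event: { type: 'bad' }, nextState: 'form' }`, the same record shape used to model the environment.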

2. Determine the goal state
"Send feedback that things could have been better"
{
  value: 'submitted',
  context: {
    feedback: 'things could have been better'
  }
}
LLM (GPT-3)
3. Generate event data
{
  type: 'feedback.update',
  value: 'things could have been better'
}
{ type: 'feedback.bad' }
{ type: 'feedback.submit' }
No payload required (generic events)
LLM (GPT-3)
4. Find shortest path(s)
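With a model, a goal state and candidate events in hand, the shortest path can be found with a breadth-first search. The sketch below hand-codes the feedback flow described earlier. The state name `prompt`, the event `feedback.good`, the guard on `feedback.submit` and all function names are assumptions for illustration, and this is not how the Stately tooling computes paths.

```typescript
interface FormSnapshot { value: string; feedback: string }
interface FormEvent { type: string; value?: string }

// Hand-written model of the feedback flow; returns undefined if the event is not allowed.
function formStep(s: FormSnapshot, ev: FormEvent): FormSnapshot | undefined {
  if (s.value === "prompt" && ev.type === "feedback.good") return { value: "submitted", feedback: s.feedback };
  if (s.value === "prompt" && ev.type === "feedback.bad") return { value: "form", feedback: s.feedback };
  if (s.value === "form" && ev.type === "feedback.update" && ev.value !== undefined) {
    return { value: "form", feedback: ev.value };
  }
  // Assumed guard: the form can only be submitted once feedback has been entered.
  if (s.value === "form" && ev.type === "feedback.submit" && s.feedback !== "") {
    return { value: "submitted", feedback: s.feedback };
  }
  return undefined;
}

// Breadth-first search: the first path that reaches a goal state is a shortest one.
function shortestEventPath(
  start: FormSnapshot,
  candidates: FormEvent[],
  isGoal: (s: FormSnapshot) => boolean
): FormEvent[] | undefined {
  const seen: Record<string, boolean> = {};
  const queue: { snapshot: FormSnapshot; path: FormEvent[] }[] = [{ snapshot: start, path: [] }];
  seen[JSON.stringify(start)] = true;
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) break;
    if (isGoal(current.snapshot)) return current.path;
    for (const ev of candidates) {
      const next = formStep(current.snapshot, ev);
      if (next === undefined) continue;
      const key = JSON.stringify(next);
      if (seen[key]) continue; // already reached by an equally short or shorter path
      seen[key] = true;
      queue.push({ snapshot: next, path: current.path.concat([ev]) });
    }
  }
  return undefined;
}

const candidateEvents: FormEvent[] = [
  { type: "feedback.update", value: "things could have been better" },
  { type: "feedback.bad" },
  { type: "feedback.submit" },
  { type: "feedback.good" },
];
const plannedPath = shortestEventPath(
  { value: "prompt", feedback: "" },
  candidateEvents,
  (s) => s.value === "submitted" && s.feedback === "things could have been better"
);
```

Here `plannedPath` holds the events `feedback.bad`, `feedback.update` and `feedback.submit`, in that order. Clicking "good" also reaches `submitted`, but without the feedback text, so it does not satisfy the goal.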



5. Execute!
How well can the agent adapt to a changing environment? (simulated)
Learnings
LLMs unpredictable
State machines are really useful
RL gives us insights for AGI
Graphs make many things possible
We can make our apps intelligent today.
Thank you Tampa Devs!

Resources
TDevs State Machines and AI
By David Khourshid