Martin Biehl (Cross Labs)
Nathaniel Virgo (Earth-Life Science Institute)
Made possible via funding by:
Then could
Consider designing an artificial agent where you know
Then you want to find
We show how
Then we can (often, but not always) formally represent this in a Bayesian network with
To find unknown kernels \(p_U:=\{p_u: u \in U\}\)
[*] Matthew Botvinick and Marc Toussaint. Planning as inference. Trends in Cognitive Sciences, 16(10):485–488, 2012.
Two options:
Then the result of planning as inference will
Two situations:
Explicitly reflect either by
Then
If we have found the agent that solves the problem
How to fix the memory dynamics \(p_M(m_t|s_t,m_{t-1})\) such that they have a consistent Bayesian interpretation w.r.t. the chosen model
In a 2-armed bandit:
Then we can make uncertainty explicit:
Then by construction
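As an illustration (our sketch, not code from the poster), memory with a consistent Bayesian interpretation in a 2-armed bandit can be a Beta posterior per arm, updated by the conjugate Bayes rule; `update_memory` and `expected_reward` are illustrative names:

```python
# Sketch: the agent's memory is a pair (alpha, beta) of Beta-distribution
# parameters per arm, so uncertainty about each arm's reward probability
# is explicit and updates follow Bayes' rule exactly.

def update_memory(memory, arm, reward):
    """Beta-Bernoulli posterior update: memory[arm] = (alpha, beta)."""
    alpha, beta = memory[arm]
    if reward == 1:
        alpha += 1
    else:
        beta += 1
    new = dict(memory)
    new[arm] = (alpha, beta)
    return new

def expected_reward(memory, arm):
    """Posterior mean of the arm's reward probability."""
    alpha, beta = memory[arm]
    return alpha / (alpha + beta)

m0 = {0: (1, 1), 1: (1, 1)}      # uniform priors over both arms
m1 = update_memory(m0, arm=0, reward=1)
print(expected_reward(m1, 0))    # 2/3 after one success on arm 0
```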
Note:
Underlying perspective:
\(\Rightarrow\) if we formulate agent design problems as planning problems they become inference problems
design as planning \(\to\) planning as inference
\(\Rightarrow\) design as inference?
Assume
Then
Formalize as POMDP:
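A minimal sketch of a POMDP and its Bayes-filter belief update (the states, observation model, and probabilities below are invented for illustration, not taken from the poster):

```python
# Two states, two actions, two observations, as nested dicts.
T = {  # T[s][a][s'] = p(s' | s, a)
    "s0": {"a0": {"s0": 0.9, "s1": 0.1}, "a1": {"s0": 0.2, "s1": 0.8}},
    "s1": {"a0": {"s0": 0.1, "s1": 0.9}, "a1": {"s0": 0.8, "s1": 0.2}},
}
O = {  # O[s'][o] = p(o | s'), observation depends on the new state
    "s0": {"o0": 0.85, "o1": 0.15},
    "s1": {"o0": 0.15, "o1": 0.85},
}

def belief_update(belief, action, obs):
    """Bayes filter: b'(s') ∝ O[s'][obs] * sum_s T[s][action][s'] * b(s)."""
    unnorm = {
        s2: O[s2][obs] * sum(T[s][action][s2] * belief[s] for s in belief)
        for s2 in O
    }
    z = sum(unnorm.values())
    return {s2: p / z for s2, p in unnorm.items()}

b = belief_update({"s0": 0.5, "s1": 0.5}, "a0", "o0")
print(b)  # posterior mass shifts toward s0
```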
Terminology:
What is it good for?
Automatically find a probabilistic policy to achieve a goal.
What do you need to use it?
Combination:
\(\Rightarrow\) can use max. likelihood to solve planning!
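A toy sketch of this idea (a one-step 2-armed bandit with made-up success probabilities): treat goal achievement as an observed binary variable \(g\) and maximize its likelihood over the policy parameter:

```python
# p(g=1 | a) = q[a]: probability that arm a achieves the goal (invented values).
q = {0: 0.3, 1: 0.7}

def likelihood(phi):
    """p_phi(g=1) under the policy p(a=0) = phi, p(a=1) = 1 - phi."""
    return phi * q[0] + (1 - phi) * q[1]

# Brute-force over a grid: the maximum-likelihood policy is deterministic,
# putting all probability mass on the better arm (phi* = 0 here).
phi_star = max((i / 100 for i in range(101)), key=likelihood)
print(phi_star)  # 0.0
```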
Given:
Find parameter \(\phi^*\) that maximizes likelihood of the observations:
\[\phi^*=\text{arg} \max_\phi p_\phi(\bar x)\]
Example: Maximum likelihood inference of coin bias
Then the maximum-likelihood estimate is:
\[\phi^*=\text{arg} \max_\phi p_\phi(\bar x)=\frac{c_{\text{heads}}(\bar x)}{c_{\text{heads}}(\bar x)+c_{\text{tails}}(\bar x)}\]
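A minimal numerical check of this closed form (the function name is ours):

```python
# Maximum-likelihood estimate of a coin's bias: the arg-max of p_phi(x̄)
# is the fraction of heads, c_heads(x̄) / (c_heads(x̄) + c_tails(x̄)).

def coin_bias_mle(flips):
    """Return the bias phi* maximizing the likelihood of `flips`,
    a sequence of 'H'/'T' outcomes."""
    c_heads = sum(1 for f in flips if f == "H")
    c_tails = len(flips) - c_heads
    return c_heads / (c_heads + c_tails)

print(coin_bias_mle("HHTH"))  # 3 heads, 1 tail -> 0.75
```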
Note that for maximum likelihood inference:
So we can use:
Bayesian network + goal \(\to\) policies
Practical side of original framework:
Bayesian network + goal \(\to\) policies
Multiple, possibly competing goals
Coordination and communication from an information theoretic perspective
Dynamic scalability of multi-agent systems
Dynamically changing goals that depend on knowledge acquired through observations
Example multi-agent setups:
Two agents interacting with same environment
Two agents with same goal
Two agents with different goals
Example non-cooperative game: matching pennies
joint probability distributions \(p(a_1,a_2)\)
disjoint goal manifolds
agent manifold
\(p(a_1,a_2)=p(a_1)p(a_2)\)
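A quick sketch of matching pennies under the factorized policy \(p(a_1,a_2)=p(a_1)p(a_2)\) above (payoffs from player 1's perspective; the function name is ours):

```python
# Matching pennies with independent policies: player 1 wins (+1) if the
# pennies match, player 2 wins (payoff -1 to player 1) if they differ.

def expected_payoff_p1(p1, p2):
    """p1, p2 = each player's probability of playing heads."""
    match = p1 * p2 + (1 - p1) * (1 - p2)
    return match - (1 - match)  # = (2*p1 - 1) * (2*p2 - 1)

print(expected_payoff_p1(0.5, 0.5))  # 0.0 at the mixed equilibrium
print(expected_payoff_p1(1.0, 1.0))  # +1: both always play heads
```

At \(p_1=p_2=0.5\) neither player can improve unilaterally, which is the game's unique (mixed) Nash equilibrium; no factorized deterministic policy pair is stable.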
EM
2. Planning to learn / uncertain MDP, bandit example.
if x=1
But it is probably needed for adding and removing agents
Thank you for your attention!