Martin Biehl (Cross Labs)
Nathaniel Virgo (Earth-Life Science Institute)
Made possible via funding by:
When we design artificial agents we usually know something about
Then we want to find
We show how
Formally represent what we know and what we don't know via a Bayesian network with known and unknown kernels
Find the unknown kernels \(p_U:=\{p_u: u \in U\}\)
[*] Matthew Botvinick and Marc Toussaint. Planning as inference. Trends in cognitive sciences, 16(10):485–488, 2012.
Two situations:
Explicitly reflect this by
Then
Can also make another kind of uncertainty explicit
Then by construction
Note:
Underlying perspective:
\(\Rightarrow\) if we formulate agent design problems as planning problems they become inference problems
design as planning \(\to\) planning as inference
\(\Rightarrow\) design as inference?
Assume
Then
Formalize as POMDP:
Terminology:
What is it good for?
Automatically find a probabilistic policy to achieve a goal.
What do you need to use it?
Combination:
\(\Rightarrow\) can use max. likelihood to solve planning!
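As a toy illustration of this reduction (the setup and all numbers here are assumed for illustration, not taken from the slides): treat a binary goal variable as if it had been observed, and fit the policy parameter by maximum likelihood.

```python
import numpy as np

# Hypothetical one-step example: theta is the probability of taking action 1,
# and each action achieves the goal with a known probability.  Treating
# "goal achieved" as an observed outcome turns planning into maximum-likelihood
# estimation of theta.
p_goal_given_action = np.array([0.3, 0.8])  # p(goal | a=0), p(goal | a=1)

def likelihood(theta):
    # p(goal; theta) = sum_a p(a; theta) p(goal | a)
    p_a = np.array([1.0 - theta, theta])
    return float(p_a @ p_goal_given_action)

thetas = np.linspace(0.0, 1.0, 101)
theta_star = thetas[np.argmax([likelihood(t) for t in thetas])]
print(theta_star)  # likelihood is linear in theta here, so the maximum is at 1.0
```

Because the likelihood is linear in the policy parameter, the maximizer is a deterministic policy that always takes the better action; richer goal variables and multi-step settings make the inference problem correspondingly richer.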
Given:
Find parameter \(\phi^*\) that maximizes the likelihood of the observations:
\[\phi^*=\arg\max_\phi p_\phi(\bar x)\]
Example: Maximum likelihood inference of coin bias
Find parameter \(\phi^*\) that maximizes the likelihood of the observations:
\[\phi^*=\arg\max_\phi p_\phi(\bar x)\]
Then:
\[\phi^*=\arg\max_\phi p_\phi(\bar x)=\frac{c_{\text{heads}}(\bar x)}{c_{\text{heads}}(\bar x)+c_{\text{tails}}(\bar x)}\]
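The closed form above can be checked numerically; a minimal sketch with made-up flip data (the sequence below is illustrative):

```python
import numpy as np

# Maximum-likelihood estimate of a coin's bias from observed flips.
# For i.i.d. Bernoulli flips, argmax_phi p_phi(x-bar) has the closed form
# c_heads / (c_heads + c_tails); we verify this against a grid search.
flips = np.array([1, 1, 0, 1, 0, 1, 1, 0])  # 1 = heads, 0 = tails (example data)

def log_likelihood(phi, x):
    return float(np.sum(x * np.log(phi) + (1 - x) * np.log(1 - phi)))

closed_form = flips.sum() / len(flips)  # 5 heads out of 8 flips -> 0.625
grid = np.linspace(0.005, 0.995, 199)
grid_estimate = grid[np.argmax([log_likelihood(p, flips) for p in grid])]
print(closed_form, grid_estimate)  # both are 0.625 (up to grid resolution)
```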
Note that for maximum likelihood
So we can use:
Bayesian network
goal
policies
Practical side of original framework:
Bayesian network
goal
policies
Multiple, possibly competing goals
Coordination and communication from an information theoretic perspective
Dynamic scalability of multi-agent systems
Dynamically changing goals that depend on knowledge acquired through observations
Example multi-agent setups:
Two agents interacting with same environment
Two agents with same goal
Two agents with different goals
Example non-cooperative game: matching pennies
joint probability distributions \(p(a_1,a_2)\)
disjoint goal manifolds
agent manifold
\(p(a_1,a_2)=p(a_1)p(a_2)\)
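Under the product constraint \(p(a_1,a_2)=p(a_1)p(a_2)\), matching pennies has only a mixed equilibrium at (1/2, 1/2); a small sketch verifying the indifference condition, assuming the standard zero-sum payoff matrix:

```python
import numpy as np

# Matching pennies with independent mixed strategies p(a1,a2) = p(a1) p(a2).
# Player 1 wins (+1) if the pennies match, player 2 wins if they differ.
U1 = np.array([[ 1, -1],
               [-1,  1]])  # rows: a1, cols: a2; player 2's payoff is -U1

def expected_payoff(p_heads_1, p_heads_2):
    p1 = np.array([p_heads_1, 1 - p_heads_1])
    p2 = np.array([p_heads_2, 1 - p_heads_2])
    return float(p1 @ U1 @ p2)  # expectation under the product distribution

# At the mixed equilibrium (0.5, 0.5) each player is indifferent:
print(expected_payoff(0.5, 0.5))  # 0.0
print(expected_payoff(1.0, 0.5))  # 0.0  (no profitable deviation for player 1)
# Away from 0.5, the opponent can exploit the bias:
print(expected_payoff(1.0, 1.0))  # 1.0  (so player 2 would deviate)
```

The product factorization is exactly the "agent manifold" restriction: joint distributions outside it (correlated play) are not reachable by two independent agents.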
EM
2. Planning to learn / uncertain MDP, bandit example.
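A minimal sketch of the bandit flavor of "planning to learn", assuming two Bernoulli arms with unknown success probabilities tracked by Beta posteriors and selected by Thompson sampling (the arm probabilities below are invented for illustration):

```python
import random

# Two-armed Bernoulli bandit with Thompson sampling: the agent acts to earn
# reward while its Beta(1,1) posteriors over the arm probabilities sharpen.
random.seed(0)
true_p = [0.3, 0.7]            # unknown to the agent
alpha, beta = [1, 1], [1, 1]   # Beta posterior parameters per arm

for _ in range(2000):
    # Sample a plausible success rate for each arm and pull the best sample.
    samples = [random.betavariate(alpha[a], beta[a]) for a in range(2)]
    arm = max(range(2), key=lambda a: samples[a])
    reward = 1 if random.random() < true_p[arm] else 0
    alpha[arm] += reward
    beta[arm] += 1 - reward

posterior_means = [alpha[a] / (alpha[a] + beta[a]) for a in range(2)]
print(posterior_means)  # the better arm's posterior mean should be near 0.7
```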
if x=1
But it is probably needed for adding and removing agents
Thank you for your attention!