Martin Biehl
Originally from psychology, e.g. (Ryan and Deci, 2000):
doing an activity for its inherent satisfaction rather than for some separable consequence
for the fun or challenge entailed rather than because of external prods, pressures, or rewards
Examples (Oudeyer, 2008):
But can always argue:
Motivation is something that generates behavior for an agent (robot, living organism)
Working definition compatible with Oudeyer (2008):
Motivation is intrinsic if it:
This includes the approach by Schmidhuber (2010):
Motivation is intrinsic if it
rewiring agnostic means
this implies
Embodiment independent means it should work (without changes) for any form of agent:
and produce "worthwhile" behavior
Semantic free, information theoretic:
Another important but not defining feature is usually known from evolution:
open-endedness
The motivation should not vanish until the capacities of the agent are exhausted.
Applications of intrinsic motivations:
Developmental robotics:
Sparse reward reinforcement learning:
AGI:
Advantages of intrinsic motivations
Disadvantage:
Examples:
dark room problem
Examples:
Solution for dark room problem
Remarks:
1. Generative model
Model split up into three parts:
2. Prediction
So at time \(t\) the agent can plug its experience \(sa_{\prec t}\) into the model,
which then predicts the consequences of future actions \(\hat{a}_{t:\hat{T}}\) for relations between:
Call \(\text{q}(\hat{s}_{t:\hat{T}},\hat{e}_{0:\hat{T}},\theta|\hat{a}_{t:\hat{T}},sa_{\prec t},\xi)\) the complete posterior.
3. Action selection
1. Free energy minimization
Actions should lead to environment states expected to have precise sensor values.
Get \(\text{q}(\hat{e}_{t:\hat{T}}|\hat{a}_{t:\hat{T}})\) from the complete posterior:
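One way to read this step, offered as a sketch rather than the talk's own derivation: the distribution over future environment states follows by marginalizing everything else out of the complete posterior. The sum and integral below assume discrete sensor and environment states and continuous parameters \(\theta\). The symbol \(\hat{e}_{\prec t}\) for the past environment states is notation introduced here by analogy with \(sa_{\prec t}\), and the dependence on \(sa_{\prec t}\) and \(\xi\) is left implicit on the left-hand side.

\[
\text{q}(\hat{e}_{t:\hat{T}}|\hat{a}_{t:\hat{T}}) = \sum_{\hat{s}_{t:\hat{T}}} \sum_{\hat{e}_{\prec t}} \int \text{q}(\hat{s}_{t:\hat{T}},\hat{e}_{0:\hat{T}},\theta|\hat{a}_{t:\hat{T}},sa_{\prec t},\xi) \, d\theta
\]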
2. Predictive information maximization
Actions should lead to the most complex sensor stream:
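A minimal sketch of how this could be scored, assuming the one-step form of predictive information, i.e. the mutual information between the current and the next sensor value. The three joint distributions and their names are hypothetical stand-ins for what the complete posterior would predict under three candidate action sequences.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Mutual information in bits of a joint distribution {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Hypothetical predicted joints over (current sensor value, next sensor value).
candidates = {
    "constant": {(0, 0): 1.0},                      # nothing ever changes
    "noise": {(0, 0): 0.25, (0, 1): 0.25,
              (1, 0): 0.25, (1, 1): 0.25},          # varied but unpredictable
    "alternating": {(0, 1): 0.5, (1, 0): 0.5},      # varied and predictable
}

predictive_information = {
    name: mutual_information(joint) for name, joint in candidates.items()
}
best = max(predictive_information, key=predictive_information.get)
```

In this toy setting both the constant and the purely random stream score zero bits, while the alternating stream scores one bit, so "complex" here means varied and still predictable from the past.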
Georg Martius, Ralf Der
3. Knowledge seeking
Actions should lead to sensor values that tell the most about model parameters \(\Theta\):
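As an illustration only, the expected information gain about the parameters can be computed for a small discrete case as the mutual information between the predicted sensor value and the parameter. The two hypotheses `"A"` and `"B"`, the action names `"look"` and `"wait"`, and all probabilities are assumptions made up for this sketch.

```python
import math


def expected_information_gain(prior, likelihood):
    """Mutual information in bits between sensor value and parameter.

    prior:      {theta: p(theta)}
    likelihood: {theta: {sensor_value: p(sensor_value | theta, action)}}
    """
    # Predictive distribution over sensor values, averaged over parameters.
    predictive = {}
    for theta, p_theta in prior.items():
        for s, p_s in likelihood[theta].items():
            predictive[s] = predictive.get(s, 0.0) + p_theta * p_s
    # Prior-weighted KL divergence from each hypothesis to the predictive.
    return sum(
        p_theta * p_s * math.log2(p_s / predictive[s])
        for theta, p_theta in prior.items()
        for s, p_s in likelihood[theta].items()
        if p_theta > 0 and p_s > 0
    )


prior = {"A": 0.5, "B": 0.5}

# Hypothetical sensor predictions per action.
likelihoods = {
    "look": {"A": {0: 1.0}, "B": {1: 1.0}},                   # tells A from B
    "wait": {"A": {0: 0.5, 1: 0.5}, "B": {0: 0.5, 1: 0.5}},   # same either way
}

gains = {
    action: expected_information_gain(prior, lik)
    for action, lik in likelihoods.items()
}
best_action = max(gains, key=gains.get)
```

Here `"look"` yields one bit about the parameter and `"wait"` yields none, so a knowledge-seeking agent of this kind would prefer `"look"`.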
Bellemare et al. (2016)
4. Empowerment maximization
Actions should lead to control over as many future experiences as possible:
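Empowerment is defined by Klyubin et al. (2005) as the channel capacity from actions to future sensor values. The sketch below estimates that capacity with the Blahut-Arimoto iteration, which is a standard choice and not necessarily the method used in the talk. The three situations and their action-to-outcome tables are hypothetical.

```python
import math


def channel_capacity(channel, iterations=200):
    """Capacity in bits of channel[a][s] = p(s | a), via Blahut-Arimoto."""
    n_actions = len(channel)
    n_outcomes = len(channel[0])
    p_a = [1.0 / n_actions] * n_actions  # start from uniform action choice

    def outcome_marginal(p_a):
        return [
            sum(p_a[a] * channel[a][s] for a in range(n_actions))
            for s in range(n_outcomes)
        ]

    for _ in range(iterations):
        p_s = outcome_marginal(p_a)
        # KL divergence (in nats) from each action's outcomes to the marginal.
        divergence = [
            sum(
                channel[a][s] * math.log(channel[a][s] / p_s[s])
                for s in range(n_outcomes)
                if channel[a][s] > 0 and p_s[s] > 0
            )
            for a in range(n_actions)
        ]
        weights = [p_a[a] * math.exp(divergence[a]) for a in range(n_actions)]
        total = sum(weights)
        p_a = [w / total for w in weights]

    p_s = outcome_marginal(p_a)
    return sum(
        p_a[a] * channel[a][s] * math.log2(channel[a][s] / p_s[s])
        for a in range(n_actions)
        for s in range(n_outcomes)
        if p_a[a] > 0 and channel[a][s] > 0
    )


# Hypothetical situations: four actions, four possible sensor outcomes.
situations = {
    "open_field": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    "corridor":   [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    "stuck":      [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
}

empowerment = {name: channel_capacity(ch) for name, ch in situations.items()}
most_empowered = max(empowerment, key=empowerment.get)
```

An empowerment-maximizing agent would choose actions that lead to situations like `"open_field"`, where each action produces a distinguishable outcome, and avoid ones like `"stuck"`, where no action makes a difference.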
Guckelsberger et al. (2016)
5. Curiosity
Actions should lead to surprising environment states (sensor embeddings).
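A toy sketch of one common way to turn "surprising" into a number: the squared error between a forward model's predicted sensor embedding and the observed one, in the spirit of Burda et al. (2018). The tabular `ForwardModel` class, its learning rate, and the state and action labels are assumptions for illustration and not the architecture of that paper.

```python
class ForwardModel:
    """Hypothetical tabular forward model over sensor embeddings."""

    def __init__(self, dim, learning_rate=0.5):
        self.dim = dim
        self.learning_rate = learning_rate
        self.predictions = {}  # (state, action) -> predicted next embedding

    def predict(self, state, action):
        # Unseen transitions are predicted as the zero embedding.
        return self.predictions.get((state, action), [0.0] * self.dim)

    def update(self, state, action, observed):
        # Move the stored prediction a fraction of the way to the observation.
        old = self.predict(state, action)
        self.predictions[(state, action)] = [
            o + self.learning_rate * (x - o) for o, x in zip(old, observed)
        ]


def curiosity_reward(model, state, action, observed):
    """Intrinsic reward: squared prediction error in embedding space."""
    predicted = model.predict(state, action)
    return sum((x - p) ** 2 for x, p in zip(observed, predicted))


model = ForwardModel(dim=2)
observed = [1.0, 0.0]  # embedding seen after taking "right" in state "s0"

rewards = []
for _ in range(3):
    rewards.append(curiosity_reward(model, "s0", "right", observed))
    model.update("s0", "right", observed)
```

Repeating the same transition makes it less surprising, so the reward shrinks with each visit and the agent is pushed toward transitions its model cannot yet predict.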
Burda et al. (2018)
References:
Aslanides, J., Leike, J., and Hutter, M. (2017). Universal Reinforcement Learning Algorithms: Survey and Experiments. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pages 1403–1410.
Ay, N., Bertschinger, N., Der, R., Güttler, F., and Olbrich, E. (2008). Predictive Information and Explorative Behavior of Autonomous Robots. The European Physical Journal B-Condensed Matter and Complex Systems, 63(3):329–339.
Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., and Efros, A. A. (2018). Large-Scale Study of Curiosity-Driven Learning. arXiv:1808.04355 [cs, stat].
Friston, K. J., Parr, T., and de Vries, B. (2017). The Graphical Brain: Belief Propagation and Active Inference. Network Neuroscience, 1(4):381–414.
Klyubin, A., Polani, D., and Nehaniv, C. (2005). Empowerment: A Universal Agent-Centric Measure of Control. In The 2005 IEEE Congress on Evolutionary Computation, 2005, volume 1, pages 128–135.
Orseau, L., Lattimore, T., and Hutter, M. (2013). Universal Knowledge-Seeking Agents for Stochastic Environments. In Jain, S., Munos, R., Stephan, F., and Zeugmann, T., editors, Algorithmic Learning Theory, number 8139 in Lecture Notes in Computer Science, pages 158–172. Springer Berlin Heidelberg.
Oudeyer, P.-Y. and Kaplan, F. (2008). How can we define intrinsic motivation? In Proceedings of the 8th International Conference on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems, Lund University Cognitive Studies, Lund: LUCS, Brighton.
Schmidhuber, J. (2010). Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247.
Storck, J., Hochreiter, S., and Schmidhuber, J. (1995). Reinforcement Driven Information Acquisition in Non-Deterministic Environments. In Proceedings of the International Conference on Artificial Neural Networks, volume 2, pages 159–164.
Guckelsberger, C., Salge, C., and Colton, S. (2016). Intrinsically Motivated General Companion NPCs via Coupled Empowerment Maximisation. In 2016 IEEE Conference on Computational Intelligence in Games (CIG'16), pages 150–157.