"Empowerment" as an Intrinsic Motivation to Explore Sparse Environments

  • Empowerment is defined as a measure of the amount of control an agent has over its future environment.


  • i.e. how much the agent's actions are able to influence the states it can potentially reach in the future
  • Formally, empowerment is defined as the channel capacity between an agent's set of possible action sequences and the state of its environment at some time in the future (sketched after this list).
  • Basically, this quantifies how much influence the agent's actions have over the possible future states that it is able to reach.


  • Can also think of it as measuring how much information an agent can inject into its environment (through its actions) and then detect itself at a later time.
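
A sketch of the formal definition (notation assumed here: a^n is an n-step sequence of actions taken from state s, s' is the state reached after executing it, and w ranges over distributions of action sequences):

    \mathcal{E}(s) \;=\; \max_{w(a^{n} \mid s)} I(a^{n} ;\, s' \mid s)
                  \;=\; \max_{w(a^{n} \mid s)} \big[ H(a^{n} \mid s) \;-\; H(a^{n} \mid s',\, s) \big]

i.e. the channel capacity of the "channel" p(s' | s, a^n) leading from action sequences to future states.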

How to calculate empowerment

  • Not so easy - mutual information is a difficult quantity to calculate


  • Have to find the probability distribution over all possible sequences of actions that maximises this mutual information!


  • Basically - very difficult to do this exactly for anything but the simplest toy systems, especially for large n (long action sequences); a brute-force sketch follows below.
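
For a small, fully known discrete system you can in fact do it exactly: enumerate every n-step action sequence, tabulate p(s' | s, a^n), and run the Blahut-Arimoto algorithm for channel capacity. The sketch below is only an illustration of that brute-force route (the function name and the toy channel are made up), and it also shows why it breaks down: the table needs one row per action sequence, i.e. |A|^n rows.

    import numpy as np

    def empowerment_blahut_arimoto(p_sa, n_iters=200, tol=1e-10):
        """Exact empowerment of one start state via Blahut-Arimoto.

        p_sa[a, s'] is the probability of ending in state s' after executing
        action sequence a from the fixed start state (rows sum to 1).
        Returns the channel capacity in nats and the maximising w(a).
        """
        A, S = p_sa.shape
        w = np.full(A, 1.0 / A)                      # start from a uniform source distribution
        for _ in range(n_iters):
            # Posterior over action sequences given the final state: q(a | s') ∝ w(a) p(s'|a)
            joint = w[:, None] * p_sa
            marg = joint.sum(axis=0, keepdims=True)  # p(s') under the current w
            q = joint / np.maximum(marg, 1e-300)
            # Blahut-Arimoto update: w(a) ∝ exp( sum_s' p(s'|a) log q(a|s') )
            log_r = (p_sa * np.log(np.maximum(q, 1e-300))).sum(axis=1)
            w_new = np.exp(log_r - log_r.max())
            w_new /= w_new.sum()
            done = np.max(np.abs(w_new - w)) < tol
            w = w_new
            if done:
                break
        # Mutual information I(a; s') at the converged w = channel capacity = empowerment
        joint = w[:, None] * p_sa
        marg = joint.sum(axis=0, keepdims=True)
        mi = np.sum(joint * (np.log(np.maximum(p_sa, 1e-300)) - np.log(np.maximum(marg, 1e-300))))
        return mi, w

    # Toy example: 3 action sequences, 3 final states, deterministic transitions.
    # All three states are reachable and distinguishable, so empowerment = log 3 nats.
    cap, w = empowerment_blahut_arimoto(np.eye(3))
    print(cap, np.log(3))

With n-step sequences the number of rows grows as |A|^n, and p(s' | s, a^n) is rarely known in closed form - which is exactly the problem the variational approach below is meant to sidestep.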

Variational Bound on Mutual Information (Mohamed and Rezende, 2015 - the bound itself goes back at least to Barber and Agakov, 2003)

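A sketch of the bound being referred to (in the Barber-Agakov form, with q(a^n | s, s') a variational distribution over action sequences given the start and final states; the exact notation is assumed):

    I^{w}(s) \;=\; I(a^{n} ;\, s' \mid s)
             \;\geq\; \mathbb{E}_{w(a^{n} \mid s)\, p(s' \mid s,\, a^{n})}\!\left[ \log q(a^{n} \mid s,\, s') \right] \;+\; H\big(w(\cdot \mid s)\big)

The bound holds for any q, is tight when q equals the true posterior p(a^n | s, s'), and its right-hand side can be estimated from sampled transitions - which is what makes it tractable to optimise.
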
Can represent q (the variational distribution) and w (the distribution over action sequences) by neural networks and maximise this bound directly. Mohamed and Rezende do this by alternating between maximising w.r.t. q (a maximum-likelihood fit) and w.r.t. w (they derive an expression for the functional derivative of the bound above) - using SGD in both cases. A toy tabular analogue of the alternation is sketched after the notes below.

  • Note: the distribution w is not (necessarily) the policy you want to follow. It is just the distribution over action sequences needed to calculate the empowerment; a policy that moves towards states of maximum empowerment is a completely different object.
  • But w has some potentially useful properties - (roughly) it should uniformly explore all states that are accessible n steps into the future.
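
As a rough illustration of that alternation, here is a toy tabular analogue (everything here - the channel, the step size, the closed-form q-step - is illustrative; the paper instead parameterises w and q with neural networks and takes SGD steps on sampled transitions for both):

    import numpy as np

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    # Toy channel: p_sa[a, s'] = probability of landing in state s' after action
    # sequence a, from one fixed start state. Rows sum to 1.
    p_sa = np.array([[0.90, 0.05, 0.05],
                     [0.05, 0.90, 0.05],
                     [0.05, 0.05, 0.90],
                     [1/3,  1/3,  1/3 ]])
    A, S = p_sa.shape
    theta = np.zeros(A)          # logits of the source distribution w(a)

    for step in range(500):
        w = softmax(theta)

        # (1) q-step: fit the variational planner q(a | s') to the current behaviour.
        #     In this tabular toy the maximum-likelihood optimum is the exact posterior
        #     q(a | s') ∝ w(a) p(s'|a); the paper takes SGD steps on a neural q instead.
        post = w[None, :] * p_sa.T                      # shape (S, A)
        post /= post.sum(axis=1, keepdims=True)
        log_q = np.log(np.maximum(post, 1e-300))

        # (2) w-step: one gradient-ascent step on the variational bound
        #     L(w, q) = H(w) + sum_a w(a) sum_s' p(s'|a) log q(a | s')
        g_w = -np.log(w) - 1.0 + (p_sa * log_q.T).sum(axis=1)   # dL/dw(a)
        g_theta = w * (g_w - np.dot(w, g_w))                    # chain rule through the softmax
        theta += 0.5 * g_theta

    w = softmax(theta)
    bound = -(w * np.log(w)).sum() + (w * (p_sa * log_q.T).sum(axis=1)).sum()
    print("variational lower bound on empowerment (nats):", bound)

If both steps were carried to their optimum this alternation would reduce to Blahut-Arimoto above; taking partial gradient steps instead is what makes it scale beyond tabular toys.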