Hardness of Carrots
State Collapse in Reward Symmetry
Will it learn a radially symmetric z?
Hypothesis:
1. The requirement to predict the image differentiates states in radially symmetric regions.
2. But if we only predict the reward, AIS won't differentiate between states in the same level set of the reward.
3. If the reward and dynamics are the same for two states, there is no reason to distinguish them (they are bisimilar).
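A minimal sketch of the collapse described in the hypothesis above, under assumed toy definitions that are not from the deck: a radially symmetric reward `reward(x) = -||x||^2`, a radial dynamics `step(x, u) = (1 + u) x`, and a candidate latent `encode(x) = ||x||`. All function names are hypothetical. It checks that states on the same circle share one latent value, and that this latent alone is enough to predict both the reward and the next latent.

```python
import numpy as np

def reward(x):
    # Assumed radially symmetric reward: depends only on distance from the origin.
    return -np.linalg.norm(x, axis=-1) ** 2

def step(x, u):
    # Assumed dynamics that respect the symmetry: scalar control u scales the state radially.
    return (1.0 + u) * x

def encode(x):
    # Candidate latent z: the radius alone, so every state on a circle collapses to one value.
    return np.linalg.norm(x, axis=-1)

def latent_reward(z):
    # Reward predicted from the latent only.
    return -z ** 2

def latent_step(z, u):
    # Latent dynamics induced by step(): the radius is scaled by |1 + u|.
    return abs(1.0 + u) * z

rng = np.random.default_rng(0)
radius = 2.0
angles = rng.uniform(0.0, 2.0 * np.pi, size=5)
# Five distinct states on the same level set of the reward.
states = radius * np.stack([np.cos(angles), np.sin(angles)], axis=-1)

u = -0.3
z = encode(states)
same_latent = np.allclose(z, radius)
reward_ok = np.allclose(reward(states), latent_reward(z))
dynamics_ok = np.allclose(encode(step(states, u)), latent_step(z, u))
print(same_latent, reward_ok, dynamics_ok)
```

In this toy setting the angle carries no information about reward or latent dynamics, so a reward-only objective gives the encoder no incentive to keep it; an image-reconstruction objective would.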
Sensitivity Analysis
Take two images, where the second is the first perturbed by a very small disturbance. How sensitive is the reward prediction to this disturbance?
Having a small w is necessary for predicting errors in the long term. Whether AIS learns this representation is a separate question.
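One way to make the sensitivity question concrete is a finite-difference probe. The sketch below assumes a hypothetical reward predictor (a small random `tanh` encoder with a linear head), not the model from the deck; `W`, `v`, and `predict_reward` are illustrative names. It measures the change in predicted reward per unit of image perturbation and compares it with the gradient norm, which bounds that ratio to first order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictor: image (16 pixels) -> latent (4 dims) -> scalar reward.
W = rng.normal(size=(4, 16)) / 4.0   # encoder weights, scaled to avoid tanh saturation
v = rng.normal(size=4)               # linear reward head

def predict_reward(image):
    z = np.tanh(W @ image)
    return v @ z

image = rng.normal(size=16)
delta = 1e-4 * rng.normal(size=16)   # very small disturbance of the image

# Finite-difference sensitivity: change in predicted reward per unit perturbation norm.
sens = abs(predict_reward(image + delta) - predict_reward(image)) / np.linalg.norm(delta)

# Analytic gradient of the predicted reward with respect to the image.
z = np.tanh(W @ image)
grad = W.T @ (v * (1.0 - z ** 2))
bound = np.linalg.norm(grad)         # first-order upper bound on sens over all directions

print(f"sensitivity along delta: {sens:.4f}, gradient norm: {bound:.4f}")
```

The gradient norm is the worst case over perturbation directions, so a random `delta` will usually give a noticeably smaller value than `bound`.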
Metric Comparison?
Numerical Issues with Scale
We know that an AIS is not unique, and this non-uniqueness leads to issues with numerical scaling.
Consider two networks whose latent states differ by a scale factor lambda. In exact arithmetic they have exactly the same capability to predict rewards, but numerical issues kick in for arbitrarily large values of lambda.
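The scaling argument can be sketched with a linear reward head in float32. This is an illustrative assumption, not the deck's setup: the latent `z` is scaled by `lam` and the head `w` by `1 / lam`, so the predicted reward is unchanged, while a quantity that is quadratic in the latent (here the squared norm, standing in for any squared-error term on the latent) overflows once `lam` is large enough.

```python
import numpy as np

# Hypothetical latent state and linear reward head, stored in float32.
z = np.array([0.5, -1.0, 2.0], dtype=np.float32)
w = np.array([1.0, 0.5, -0.25], dtype=np.float32)

def scaled_pair(lam):
    # Second "network": latent scaled by lam, reward head scaled by 1 / lam.
    lam = np.float32(lam)
    return lam * z, w / lam

base = float(w @ z)  # reward predicted by the unscaled network

results = {}
for lam in (1.0, 1e3, 1e20):
    z_s, w_s = scaled_pair(lam)
    pred = float(w_s @ z_s)
    # Squared latent norm; overflows float32 once the entries exceed about 1.8e19.
    with np.errstate(over="ignore"):
        sq_norm = float(np.sum(z_s ** 2))
    results[lam] = (pred, sq_norm)
    print(f"lambda={lam:g}  reward={pred:.6f}  ||z||^2={sq_norm}")
```

The reward prediction stays close to `base` for every `lam` in the loop, but at `lam = 1e20` the squared norm is already `inf` in float32.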
Big Problems
Distribution Shift
State-Space Coverage
(Robust Learning)
Distributionally Robust Optimization
Data Augmentation
Regularization
Robust Control
Uncertainty Quantification
MLE
Variance Penalized MPC
Uncertainty Aware Control
Bootstrapping
Online RL
Chance-Constrained MPC
By Terry Suh