- Training neural networks is an art and a science.
- Recurrent neural networks learn temporal patterns.
- Reinforcement learning learns control through interactions with the environment.
- Companies have open-sourced software and hardware to train more deep learning experts.

*(Press DOWN to explore or RIGHT to skip)*

Regularization techniques prevent overfitting (excelling on the training examples but failing to generalize to new examples).

- Adversarial (training on examples crafted to fool the model)
- Noise-based (dropout, gradient noise)
- Miscellaneous (batch normalization, gradient clipping)
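Of the noise-based techniques above, dropout is the simplest to sketch. A minimal "inverted dropout" layer (names and the fixed seed are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (illustrative)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: at train time, zero each unit with probability p
    and scale the survivors by 1/(1-p) so the expected activation is unchanged.
    At test time, return the input untouched."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p   # True for units that survive
    return x * mask / (1.0 - p)
```

Scaling at train time (rather than at test time) keeps inference a plain forward pass with no dropout-specific logic.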

Leon Gatys et al. Neural Algorithm of Artistic Style. 2015

- RNNs generalize and surpass HMMs at learning sequences (e.g. speech recognition)
- LSTM and GRU architectures incorporate a concept of memory (deciding what to remember and forget)
- Might be useful for early warning systems
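The "deciding what to remember and forget" in the list above is implemented with gates. A minimal single GRU step in NumPy (the parameter layout and function name are illustrative assumptions, not a reference implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step. The update gate z decides how much of the old state to
    overwrite; the reset gate r decides how much of the past feeds the
    candidate state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)              # update gate
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1 - z) * h + z * h_tilde          # blend memory and new content
```

Because the new state is a convex combination of the old state and a bounded candidate, gradients can flow through the `(1 - z) * h` path, which is what mitigates the vanishing-gradient problem of plain RNNs.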

Control is learning how to interact with the environment to reach goals (e.g. robotics).
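The idea of learning control by interaction can be shown with tabular Q-learning on a hypothetical toy environment: a 5-state chain where the agent moves LEFT or RIGHT and is rewarded for reaching the right end (the environment and hyperparameters are illustrative, not from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2        # actions: 0 = LEFT, 1 = RIGHT
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def step(s, a):
    """Toy chain dynamics: reward 1 for reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward, s2 == n_states - 1

def greedy(s):
    """Greedy action with random tie-breaking."""
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))

for _ in range(500):               # episodes of interaction
    s = 0
    for _ in range(20):
        a = int(rng.integers(n_actions)) if rng.random() < eps else greedy(s)
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])   # temporal-difference update
        s = s2
        if done:
            break
```

Deep Q-networks (the Mnih et al. papers below) replace the table `Q` with a neural network so the same update rule scales to raw pixel inputs.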

Volodymyr Mnih et al. Human-level Control through Deep Reinforcement Learning. *Nature* 518, 2015.

Volodymyr Mnih et al. Playing Atari with Deep Reinforcement Learning. NIPS Workshop, 2013.

Lukasz Kaiser, Ilya Sutskever. Neural GPUs Learn Algorithms. 2015.

- Sometimes it is important to be able to interpret and understand the rules learned by our model.
- Probabilistic programming expresses a model as an executable program (akin to an executable graphical model) and performs inference by running simulations.
- Bayesian nonparametrics are an expressive alternative to deep learning.

*(Press DOWN to explore or RIGHT to skip)*

Rich Caruana. Accuracy on the test set is not enough: The risk of deploying unintelligible models in healthcare. NIPS Workshop, 2015.

- In healthcare, a model can learn dangerous rules, so an expert must screen what it has learned. Purely predictive black-box approaches (e.g. deep learning) are unsuitable here, because being able to understand what the model learned is essential.

Robert Tibshirani. Some Recent Advances in Post-selection inference. Breiman Invited lecture, NIPS 2015.

Non-parametric Bayesian methods are an alternative to deep learning that generates models amenable to interpretation, which could prove useful in science.

MCMC approximates expectations (integrals over a distribution) by drawing samples from it. Metropolis-Hastings is a popular sampler for high-dimensional distributions because it only requires evaluating the target density up to a normalizing constant.
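A minimal random-walk Metropolis-Hastings sampler, targeting a standard normal via its unnormalized log-density (the target, step size, and seed are illustrative choices):

```python
import math
import random

random.seed(0)

def log_target(x):
    """Unnormalized log-density of a standard normal: -x^2/2.
    The normalizing constant is never needed."""
    return -0.5 * x * x

def metropolis_hastings(n_samples, step=1.0):
    """Random-walk MH: propose x' = x + N(0, step^2), accept with
    probability min(1, p(x')/p(x)); on rejection, keep the current state."""
    x, samples = 0.0, []
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        if math.log(random.random()) < log_target(proposal) - log_target(x):
            x = proposal              # accept
        samples.append(x)             # rejected steps repeat the current state
    return samples

samples = metropolis_hastings(50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With a symmetric Gaussian proposal, the Hastings correction cancels, so only the ratio of target densities appears in the acceptance test.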

Zoubin Ghahramani. "Probabilistic machine learning and artificial intelligence." *Nature* 521, 2015.