russtedrake
Roboticist at MIT and TRI
(Part 2)
MIT 6.421:
Robotic Manipulation
Fall 2023, Lecture 20
Follow live at
(or later at
Do Differentiable Simulators Give Better Policy Gradients?
H. J. Terry Suh and Max Simchowitz and Kaiqing Zhang and Russ Tedrake
ICML 2022
Available at:
The answer is subtle; the Heaviside example might shed some light.
Differentiable simulators give $\frac{\partial f}{\partial \theta}$, but we want $\frac{\partial}{\partial \theta} \mathbb{E}_w[f(\theta, w)]$.
J. Burke, F. E. Curtis, A. Lewis, M. Overton, and L. Simoes, Gradient Sampling Methods for Nonsmooth Optimization, 02 2020, pp. 201–225.
But the regularity conditions aren't met in contact discontinuities, leading to a biased first-order estimator.
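For reference, the two estimators being compared can be written out explicitly. This is a sketch under assumptions the slide does not state: f(θ, w) is specialized to additive noise f(θ + w), the noise is zero-mean Gaussian with covariance σ²I, and N is the number of samples. The symbols F, N, σ and the hat notation are introduced here for illustration.

```latex
\begin{align*}
F(\theta) &= \mathbb{E}_{w}\left[ f(\theta + w) \right],
  \qquad w \sim \mathcal{N}(0, \sigma^{2} I) \\
\text{first-order:} \quad
\hat{\nabla}^{1} F(\theta) &= \frac{1}{N} \sum_{i=1}^{N} \nabla_{\theta} f(\theta + w_i) \\
\text{zero-order:} \quad
\hat{\nabla}^{0} F(\theta) &= \frac{1}{N \sigma^{2}} \sum_{i=1}^{N}
  \left[ f(\theta + w_i) - f(\theta) \right] w_i
\end{align*}
```

The first-order form is unbiased only when the derivative and the expectation can be exchanged, which can fail when f jumps. The zero-order form uses only function values, so it needs no such exchange.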
Often, but not always.
$\frac{\partial f(x)}{\partial x} = 0$ almost everywhere!
First-order estimator is biased
≈ $\frac{\partial}{\partial \mu} \mathbb{E}_\mu[f(x)]$
Zero-order estimator is (still) unbiased
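The Heaviside example can be checked numerically. The sketch below assumes x = μ + σw with w drawn from a standard normal, so that E[f(x)] = Φ(μ/σ) and the true gradient with respect to μ is the Gaussian density φ(μ/σ)/σ. The function names, sample count, and the choice σ = 0.5 are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def heaviside(x):
    """Step function: 1 for x > 0, else 0."""
    return (np.asarray(x) > 0.0).astype(float)

def heaviside_grad(x):
    """Pointwise derivative of the step: zero almost everywhere."""
    return np.zeros_like(np.asarray(x, dtype=float))

def estimators(mu, sigma, n, rng):
    """Return (first-order, zero-order) Monte Carlo estimates of d/dmu E[f(x)]."""
    w = rng.standard_normal(n)
    x = mu + sigma * w
    # First-order: average the pointwise derivative at the samples.
    first_order = heaviside_grad(x).mean()
    # Zero-order: score-function form, E[f(x) * w] / sigma (no baseline).
    zero_order = np.mean(heaviside(x) * w) / sigma
    return first_order, zero_order

def true_gradient(mu, sigma):
    """d/dmu Phi(mu / sigma) = Gaussian density at mu / sigma, divided by sigma."""
    return np.exp(-0.5 * (mu / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fo, zo = estimators(mu=0.0, sigma=0.5, n=200_000, rng=rng)
    print("first-order:", fo)
    print("zero-order: ", zo)
    print("true:       ", true_gradient(0.0, 0.5))
```

The first-order estimate is exactly zero for any number of samples, because no sample lands on the jump, while the zero-order estimate converges to the true gradient as the sample count grows.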
e.g. with stiff contact models (large gradient ⇒ high variance)
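The variance point can be illustrated with a toy stand-in for a stiff contact model. The sketch below assumes a clipped ramp f(x) = clip(kx, 0, 1) with a hypothetical stiffness k; it is continuous, so the first-order estimator is unbiased here, but its per-sample gradient takes the value k on an interval of width 1/k. The ramp, the parameter values, and the function names are assumptions chosen for illustration, not a model from the lecture.

```python
import numpy as np

def stiff_ramp(x, k):
    """Continuous approximation of a step: rises from 0 to 1 over width 1/k."""
    return np.clip(k * x, 0.0, 1.0)

def stiff_ramp_grad(x, k):
    """Derivative of the ramp: k inside (0, 1/k), zero elsewhere."""
    return k * ((x > 0.0) & (x < 1.0 / k)).astype(float)

def per_sample_variances(k, mu=0.0, sigma=0.5, n=200_000, seed=0):
    """Empirical per-sample variance of the first- and zero-order gradient samples."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    x = mu + sigma * w
    first_order_samples = stiff_ramp_grad(x, k)
    zero_order_samples = stiff_ramp(x, k) * w / sigma
    return first_order_samples.var(), zero_order_samples.var()

if __name__ == "__main__":
    for k in (1.0, 10.0, 100.0, 1000.0):
        fo_var, zo_var = per_sample_variances(k)
        print(f"k={k:7.1f}  first-order var={fo_var:10.3f}  zero-order var={zo_var:6.3f}")
```

In this toy setting the first-order variance grows roughly linearly with k, while the zero-order variance stays bounded because f takes values in [0, 1].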
Global Planning for Contact-Rich Manipulation via Local Smoothing of Quasi-dynamic Contact Models
Tao Pang, H. J. Terry Suh, Lujie Yang, and Russ Tedrake
Available at:
Establish equivalence between randomized smoothing and a (deterministic/differentiable) force-at-a-distance contact model.
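A one-dimensional toy case gives some intuition for this equivalence. The sketch below is not the paper's quasi-dynamic formulation: it assumes a penalty-style normal force k·max(0, -d) on a signed distance d, perturbs d with Gaussian noise of standard deviation σ, and compares the sampled average against the closed form σ(yΦ(y) + φ(y)) with y = -d/σ. All names and parameter values are hypothetical.

```python
import math
import numpy as np

def contact_force(d, k=1.0):
    """Penalty-style normal force: zero when separated (d > 0), k * penetration otherwise."""
    return k * np.maximum(0.0, -np.asarray(d, dtype=float))

def smoothed_force_mc(d, sigma, k=1.0, n=200_000, seed=0):
    """Randomized smoothing: average the force over Gaussian perturbations of d."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    return contact_force(d + sigma * w, k).mean()

def smoothed_force_closed_form(d, sigma, k=1.0):
    """Deterministic, differentiable equivalent: E[k * max(0, -(d + sigma * w))]."""
    y = -d / sigma
    cdf = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    return k * sigma * (y * cdf + pdf)

if __name__ == "__main__":
    sigma = 0.1
    for d in (-0.1, 0.0, 0.1, 0.2):
        print(
            f"d={d:+.2f}  hard={float(contact_force(d)):.4f}  "
            f"sampled={smoothed_force_mc(d, sigma):.4f}  "
            f"closed-form={smoothed_force_closed_form(d, sigma):.4f}"
        )
```

The hard model gives zero force for any d > 0, whereas the smoothed model gives a small positive force that decays with distance, and the sampled and closed-form values agree up to Monte Carlo error.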