October 1, 2025
Dexterity Now: Low-Data Learning and Contact-rich MPC
Michael Posa
University of Pennsylvania


Decades of model-based manipulation

Lozano-Pérez, 1976

Chavan-Dafle et al., 2014

Viña et al., 2015

Trinkle and Paul, 1990

Lynch and Mason, 1996
And yet, we felt so far from real dexterity
A sea change

OpenAI, 2019
Sim-to-real RL

Chi et al., 2023
Imitation learning

Dyna Robotics, 2025












- Observe scene, passively or actively
- Take actions toward some goal
- ...use action-conditioned predictions
- with structured representations
A generic recipe for novelty?
What's so different about manipulation?

Apollo GNC architecture
Contact-rich dynamics are a big problem
- Hybrid models are hard to learn/identify
- Multi-modality challenges control and planning
- Policies are sensitive to model and state estimation error
Some steps toward contact-rich learning and control
Contact-rich model learning

High-performance hybrid MPC

How much data should we need?
Vision only [Wen, BundleSDF]
Vision + Physics = Vysics
Vision-based: masked RGBD video → tracking and reconstruction (BundleSDF) → visible geometry, object poses
Bianchini*, Zhu*, et al. "Vysics: Object Reconstruction Under Occlusion by Fusing Vision and Contact-Rich Physics." RSS. 2025.
Bibit Bianchini

Minghan Zhu

Physics-based: robot proprioception → model learning → “physible” geometry, inertia
Integrated geometry → object URDF

Physics-based model learning
Inputs: robot proprioception, object poses \(x_1, x_2, ..., x_n\)
Physics-based model learning: \(\min_\theta \mathcal L(\theta, x_t, x_{t+1})\)
Outputs \(\theta\): “physible” geometry, inertia
Physics-based model learning
- A common way: differentiable simulation
- \(\mathcal{L} (\theta, x_t, x_{t+1}) = \|x_{t+1}-f_\theta(x_t)\|^2 \)
- \( \theta \leftarrow \theta - \nabla_\theta \mathcal L (\theta, x_t, x_{t+1}) \)
- Contact dynamics: $$\begin{align*}\mathcal{L} (\theta, x_t, x_{t+1}) =& \|x_{t+1}-f_\theta(x_t, \lambda_t)\|^2 \\ \text{s.t.} \quad &g_\theta(x_t, \lambda_t)=0\end{align*}$$
- \(\lambda\): contact impulse
- \(f_\theta(x, \lambda) \): forward dynamics
- \( g_\theta(x, \lambda) \): contact-related constraints
- e.g., contact complementarity: \(\lambda\phi_{\theta}(x)=0\) (\(\phi\) is signed distance)
- Problem: the rigidity makes \(\lambda\) sensitive to \(\theta\) and makes \(\mathcal L\) steep.
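To make the last bullet concrete, here is a minimal sketch on a toy problem that is not from the talk: a unit point mass falling onto rigid ground of unknown height \(\theta\), stepped with an inelastic, implicit contact model. The function names (`rigid_step`, `prediction_loss`) and all numbers are illustrative assumptions. In this toy, the prediction loss is exactly flat in \(\theta\) while the model predicts no contact, and its curvature scales like \(1/\Delta t^2\) once it does.

```python
DT = 0.01   # time step [s] (assumed)
G = 9.81    # gravity [m/s^2]

def rigid_step(z, v, theta):
    """One implicit step of a unit point mass above rigid ground at height theta.

    Enforces 0 <= lam  ⊥  z_next - theta >= 0, where lam is the contact impulse.
    """
    v_free = v - G * DT
    lam = max(0.0, (theta - z) / DT - v_free)
    v_next = v_free + lam
    z_next = z + DT * v_next
    return z_next, v_next, lam

def prediction_loss(theta, z, v, z_obs, v_obs):
    """Squared error between the observed next state and the simulated one."""
    z_pred, v_pred, _ = rigid_step(z, v, theta)
    return (z_obs - z_pred) ** 2 + (v_obs - v_pred) ** 2

# One observed impact, generated with the true ground height theta* = 0.
z0, v0 = 0.005, -1.0
z1, v1, _ = rigid_step(z0, v0, 0.0)

for theta in (-0.02, -0.01, 0.0, 0.001, 0.002):
    print(f"theta={theta:+.3f}  loss={prediction_loss(theta, z0, v0, z1, v1):.6f}")
```

A 1 mm error in \(\theta\) above the true height already costs about as much as a 0.1 m/s velocity error, while any guess far enough below the true height yields zero gradient.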
Problem with differentiable simulation in contact-rich dynamics



- Differentiable simulation: \( \theta \leftarrow \theta - \nabla_\theta \mathcal L (\theta, x_t, x_{t+1}) \)
- Contact dynamics:
$$\begin{align*}\mathcal{L} (\theta, x_t, x_{t+1}) =& \|x_{t+1}-f_\theta(x_t, \lambda_t)\|^2 \\ \text{s.t.} \quad &g_\theta(x_t, \lambda_t)=0\end{align*}$$
- Problem: the rigidity makes \(\lambda\) sensitive to \(\theta\) and makes \(\mathcal L\) steep.
- Softening the constraint leads to inaccurate dynamics.
Desiderata
- Learn predictive models from seconds of data,
- Do not require labeling of contact events or modes,
- Reliable training,
- Generalizable (as much as possible).
Avoiding simulation
Mathew Halm

Sam Pfrommer

[Pfrommer*, Halm*, and P. ContactNets: Learning Discontinuous Contact Dynamics with Smooth, Implicit Representations. CoRL, 2020.]
Baseline: prediction/diff. sim
- Inputs: learned parameters \(\theta\) (geometry, friction, inertia, etc.) and current state \(x\)
- Simulator (optimization): simulated contact force \(\lambda\), next-state prediction \(f_\theta(x)\)
- Loss (comparison): prediction \(f_\theta(x)\) vs. next state \(x'\)
ContactNets
- Inputs: learned parameters \(\theta\) (geometry, friction, inertia, etc.), current state \(x\), and next state \(x'\)
- Realism measures (optimization): explaining contact force \(\lambda\)
- Loss (realism)
ContactNets: Contact-Implicit Optimization
- Turn the contact-related constraints \(g\) into a penalty.


- Try to explain the observed next state \(x_{t+1}\) when solving for \( \lambda_t \).
- Smooth the loss function without changing the optimal solution.
[Pfrommer*, Halm*, and P. ContactNets: Learning Discontinuous Contact Dynamics with Smooth, Implicit Representations. CoRL, 2020.]
[Bianchini, Halm, Matni, and P. Generalization bounded implicit learning of nearly discontinuous functions. L4DC, 2022.]
[Bianchini, Halm, and P. Simultaneous learning of contact and continuous dynamics. CoRL, 2023.]

Mathew Halm

Bibit Bianchini

Sam Pfrommer

\(\mathcal{L} (\theta, x_t, x_{t+1}) = \min_\lambda ( \|x_{t+1}-f_\theta(x_t, \lambda)\|^2 + \frac{1}{\epsilon} g_\theta(x_t, \lambda) ) \)

Contact dynamics:
$$\begin{align*}\mathcal{L} (\theta, x_t, x_{t+1}) =& \|x_{t+1}-f_\theta(x_t, \lambda_t)\|^2 \\ \text{s.t.} \quad &g_\theta(x_t, \lambda_t)=0\end{align*}$$

The process of model learning
- Sample the shape \(\{(\mathbf n_i, s_i)\}_i\)
- Inner loop: solve for \(\lambda_i\)'s
- Outer loop: gradient descent on \(\theta\)
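The inner/outer structure can be sketched on a toy problem that is not from the talk: estimating the height \(\theta\) of flat ground from a few observed impacts of a unit point mass. The penalty used here (\(\lambda\phi^2\) plus a penetration term) is an assumed stand-in for \(g_\theta\), chosen so the inner problem has a closed form; it is not the loss from the cited papers. All names and constants are hypothetical.

```python
DT, G, EPS = 0.01, 9.81, 1e-3   # time step, gravity, penalty weight (all assumed)

def simulate_rigid_step(z, v, ground):
    """Generate data: one implicit step of a unit point mass over rigid ground."""
    v_free = v - G * DT
    lam = max(0.0, (ground - z) / DT - v_free)
    v_next = v_free + lam
    return z + DT * v_next, v_next

def inner_solve(theta, z, v, z_next, v_next):
    """Inner loop: best explaining impulse lam >= 0 for one observed transition.

    Minimizes (needed - lam)^2 + (lam * phi^2 + min(0, phi)^2) / EPS over lam,
    which has the closed form below. The penalty is an assumed stand-in for g.
    """
    phi = z_next - theta                 # signed distance at the observed next state
    needed = v_next - (v - G * DT)       # impulse that would explain the observed velocity
    lam = max(0.0, needed - phi ** 2 / (2.0 * EPS))
    loss = (needed - lam) ** 2 + (lam * phi ** 2 + min(0.0, phi) ** 2) / EPS
    return loss, lam, phi

def loss_gradient(theta, transition):
    """d(loss)/d(theta) with lam held at its inner optimum (envelope theorem)."""
    _, lam, phi = inner_solve(theta, *transition)
    return -(2.0 * lam * phi + 2.0 * min(0.0, phi)) / EPS   # d(phi)/d(theta) = -1

def fit_ground_height(transitions, theta0, step_size=2e-4, iters=300):
    """Outer loop: plain gradient descent on theta."""
    theta = theta0
    for _ in range(iters):
        grad = sum(loss_gradient(theta, t) for t in transitions) / len(transitions)
        theta -= step_size * grad
    return theta

# Three impacts on ground at true height 0; no contact labels are provided.
starts = [(0.005, -1.0), (0.002, -0.5), (0.008, -1.5)]
data = [(z, v, *simulate_rigid_step(z, v, 0.0)) for z, v in starts]
theta_hat = fit_ground_height(data, theta0=-0.02)
print(f"estimated ground height: {theta_hat:.2e}")
```

No simulator is differentiated here: the observed next state is an input to the loss, and the only optimization inside the loss is the small problem over \(\lambda\).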





\(\mathcal{L} (\theta, x, x') = \min_\lambda ( \|x'-f_\theta(x, \lambda)\|^2 + \frac{1}{\epsilon} g_\theta(x, \lambda) ) \)
- \(x'\): observed motion
- \(g_\theta(x, \lambda)\): \(\lambda\phi_{\theta}(x) + ...\)




Experiments
- Robot interacts with objects using a ball-shaped end-effector.
- A single RealSense camera is fixed to the side.
- Input: RGBD video (~10s) and robot proprioception. No tactile sensors.
- The object is partially occluded throughout the video.
Pushing
Toppling
Pivoting


Bibit Bianchini

Minghan Zhu

| Method | bakingbox | bottle | egg | milk | oatly | styro. | toble. | all |
|---|---|---|---|---|---|---|---|---|
| BundleSDF | 3.84 | 2.65 | 3.70 | 3.17 | 2.45 | 2.55 | 2.44 | 2.98 |
| 3DSGrasp | 3.83 | 2.80 | 3.78 | 3.15 | 2.51 | 2.66 | 2.77 | 3.06 |
| IPoD | 3.25 | 1.80 | 2.16 | 2.37 | 2.73 | 1.93 | 1.97 | 2.47 |
| V-PRISM | 3.52 | 2.47 | 2.31 | 3.33 | 2.30 | 2.54 | 2.48 | 2.80 |
| OctMAE | 3.11 | 2.22 | 1.52 | 2.93 | 2.13 | 2.00 | 2.36 | 2.45 |
| Vysics (ours) | 1.83 | 1.36 | 1.05 | 1.53 | 1.25 | 1.45 | 1.02 | 1.45 |
Vision, Physics, and Generative AI
[Zhu, Wang, Sun, Ghaffari, and P. Object Reconstruction under Occlusion with Generative Prior and Contact-induced Constraints. Under review.]
Minghan Zhu

Active Tactile Exploration
[Gordon, Baraki, Bui, and P. Active Tactile Exploration for Rigid Body Pose and Shape Estimation. Under review.]
Ethan Gordon


Choose:
- Robot Trajectory \(r[t]\)
Measure:
- Contact Boolean \(c_t\)
- Contact Normal \(\hat{n}_t\)
- Proprioception
Find:
- Object Geometry \(\theta^*\)
- Object Pose \(x^*_T\)
Information maximization
[Gordon, Baraki, Bui, and P. Active Tactile Exploration for Rigid Body Pose and Shape Estimation. Under review.]
Ethan Gordon

Observed/Expected Information
- Learn, then compute: observed info \(\mathcal{I}\)
- Sample + simulate: expected Fisher info \(\mathcal{F}\)
\(\max \ \mathrm{EIG} := \log\det\left(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}\right)\)
Choose actions whose simulated, expected Fisher information is most distinct from the observed information.
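As a minimal sketch of this selection rule, the snippet below evaluates \(\log\det(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I})\) for a set of candidate actions and picks the largest. The candidate names, the 2x2 matrices, and the helper functions are hypothetical; how \(\mathcal{F}\) and \(\mathcal{I}\) are actually obtained (sampling, simulation, learning) is outside the scope of the sketch.

```python
import numpy as np

def expected_information_gain(fisher_expected, info_observed):
    """EIG = log det(F I^{-1} + Id), for symmetric positive-definite I."""
    n = info_observed.shape[0]
    # det(F I^{-1} + Id) = det(I^{-1} F + Id), so a linear solve avoids an explicit inverse.
    ratio = np.linalg.solve(info_observed, fisher_expected)
    _, logdet = np.linalg.slogdet(ratio + np.eye(n))
    return logdet

def choose_action(candidates, info_observed):
    """Pick the candidate whose expected Fisher information gives the largest EIG."""
    return max(
        candidates,
        key=lambda name: expected_information_gain(candidates[name], info_observed),
    )

info_observed = np.diag([4.0, 0.5])           # well informed about parameter 0 only
candidates = {
    "probe_known_side": np.diag([3.0, 0.0]),  # more information about parameter 0
    "probe_new_side": np.diag([0.0, 3.0]),    # information about the poorly known parameter 1
}
best = choose_action(candidates, info_observed)
print(best)
```

With these made-up numbers the action that informs the poorly known parameter wins, since its expected information is large relative to what has already been observed.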




High-performance hybrid MPC

Contact-rich MPC
Desiderata
- Online decision making for novel tasks,
- Autonomous, non-trivial mode selection and timing,
- Natural expression of task objectives
Real-time control to simultaneously plan continuous motions and contact schedules

Contact-implicit MPC
Goal: "LQR" but for multi-contact systems
- A linear complementarity system (LCS) is piecewise-affine $$\begin{align*} &x_{k+1} = Ax_k + Bu_k + D \lambda_k + d\\ &0 \leq \lambda_k \perp Ex_k + H u_k + F \lambda_k + c\geq 0 \end{align*}$$ and locally approximates simulator behavior
- Linearizes non-contact dynamics, geometry, and kinematic Jacobian
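A minimal sketch of stepping an LCS forward, using the same symbols \(A, B, D, d, E, H, F, c\) as above. The example system (a unit point mass approaching a wall at position 0, with the complementarity condition written on the next-step gap) and the projected Gauss-Seidel solver are assumptions chosen for brevity, not the models or solvers used in the cited work.

```python
import numpy as np

def solve_lcp_pgs(M, q, iters=100):
    """Projected Gauss-Seidel for 0 <= lam ⊥ M lam + q >= 0 (assumes M has a positive diagonal)."""
    lam = np.zeros(len(q))
    for _ in range(iters):
        for i in range(len(q)):
            residual = M[i] @ lam + q[i]
            lam[i] = max(0.0, lam[i] - residual / M[i, i])
    return lam

def lcs_step(x, u, A, B, D, d, E, H, F, c):
    """Solve the complementarity condition for lam, then apply the affine update."""
    lam = solve_lcp_pgs(F, E @ x + H @ u + c)
    return A @ x + B @ u + D @ lam + d, lam

# Toy LCS: state x = (position, velocity), input u = force, lam = contact impulse.
dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[dt * dt], [dt]])
D = np.array([[dt], [1.0]])
d = np.zeros(2)
E = np.array([[1.0, dt]])   # next-step gap = position + dt * (velocity + dt * u + lam)
H = np.array([[dt * dt]])
F = np.array([[dt]])
c = np.zeros(1)

x_next, lam = lcs_step(np.array([0.01, -2.0]), np.array([0.0]), A, B, D, d, E, H, F, c)
print(x_next, lam)   # the mass stops at the wall; the impulse is nonzero
```

Away from the wall the same call returns `lam = 0` and the update reduces to the affine dynamics, which is the piecewise-affine behavior the bullet refers to.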
[Aydinoglu and P. Real-time multi-contact model predictive control via ADMM. ICRA, 2022. Award finalist.]
[Aydinoglu, Wei, Huang, and P. Consensus Complementarity Control for Multi-Contact MPC. TRO, 2024.]
[Bui*, Gao*, Yang*, et al. Push Anything: Single- and Multi-Object Pushing From First Sight with Contact-Implicit MPC. Under review.]
MPC problem
$$\begin{align*} \min_{[x,u,\lambda]_i} \quad & \left[\sum_{i=0}^{N-1} x_i^T Q x_i + u_i^T R u_i\right] + x_N^T Q_f x_N \\ \text{s.t.}\quad & x_{i+1} = Ax_i + Bu_i {\color{highlight}+ D \lambda_i + d} \\ &{\color{highlight} 0 \leq \lambda_i \perp Ex_i + Hu_i + F \lambda_i + c\geq 0} \end{align*}$$
Quadratically-constrained QP (or MIQP)
Equivalent problem
$$\begin{align*} \min_{[x,u,\lambda, \eta]_i} \quad & \left[\sum_{i=0}^{N-1} x_i^T Q x_i + u_i^T R u_i\right] + x_N^T Q_f x_N \\ \text{s.t.}\quad & x_{i+1} = Ax_i + Bu_i {\color{highlight}+ D \lambda_i + d} \\ & {\color{highlight}\eta_i = Ex_i + Hu_i + F \lambda_i + c} \\ & {\color{highlight}0 \leq \lambda_i \perp \eta_i \geq 0} \end{align*}$$
Consensus Complementarity Control Plus (C3+) splits constraints w/ ADMM
- Dynamics constraints (QP): hypothesizes beneficial, but non-physical, forces \(\lambda\)
- Complementarity constraints (nonconvex): constrains forces to be physical, but violates step-to-step dynamics
- By virtue of reformulation, all complementarity constraints are decoupled
- ADMM iterations push toward agreement
Step 1 Constraints
$$\begin{align*} & x_{i+1} = Ax_i + Bu_i {\color{highlight}+ D \lambda_i + d} \\ & {\color{highlight}\eta_i = Ex_i + Hu_i + F \lambda_i + c} \end{align*}$$
Step 2 Constraints
$$\begin{align*} {\color{highlight}0 \leq \lambda_i \perp \eta_i \geq 0} \end{align*}$$
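Because the Step 2 constraints involve each pair \((\lambda, \eta)\) separately, that step can be handled one scalar pair at a time. The sketch below uses a plain, unweighted Euclidean projection as an assumption; the projection in the cited C3 papers may be weighted or solved differently. The function name and the sample values are hypothetical.

```python
import numpy as np

def project_complementarity(lam, eta):
    """Euclidean projection of each pair (lam_j, eta_j) onto {0 <= lam_j ⊥ eta_j >= 0}."""
    lam = np.asarray(lam, dtype=float)
    eta = np.asarray(eta, dtype=float)
    keep_force = lam >= eta  # the larger entry survives; the other is set to zero
    lam_proj = np.where(keep_force, np.maximum(lam, 0.0), 0.0)
    eta_proj = np.where(keep_force, 0.0, np.maximum(eta, 0.0))
    return lam_proj, eta_proj

# Values as they might come out of the QP step: not yet complementary.
lam_qp = np.array([2.0, 0.3, -1.0, 0.5])
eta_qp = np.array([0.5, 1.2, -2.0, -0.1])
lam_p, eta_p = project_complementarity(lam_qp, eta_qp)
print(lam_p, eta_p)
```

Each pair is resolved independently, so the cost of this step grows linearly with the number of contacts and horizon steps rather than with the number of mode combinations.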
Every problem specifies only a natural objective (distance to goal), but MPC determines contact
[Aydinoglu and P. Real-time multi-contact model predictive control via ADMM. ICRA, 2022. Award finalist.]
[Aydinoglu, Wei, Huang, and P. Consensus Complementarity Control for Multi-Contact MPC. TRO, 2024.]
[Yang and P. Dynamic On-Palm Manipulation via Controlled Sliding. RSS, 2024. Outstanding Student Paper Award.]
Dynamic sliding and forceful dexterity


William Yang

"Hybrid" control
Sharanya Venkatesh

Bibit Bianchini


Linear complementarity is hybrid, but still local

Local, gradient-based reasoning can make some, but not all, hybrid decisions

Thousands of modes
Few modes, but unavoidably discrete
[Venkatesh*, Bianchini*, Aydinoglu, Yang, and P. Approximately Global Contact-Implicit MPC via Sampling and Local Complementarity. Under review.]
Bi-level hybrid reasoning
Sharanya Venkatesh

Bibit Bianchini

$$\begin{align*} \min_{[x, u]_i} \quad & \sum_{i=0}^{N_1-1} \text{cost before contact}(x_i, u_i) + \sum_{i=N_1}^{N} \text{cost after contact}(x_i, u_i) \end{align*}$$
- Approximate first stage as kinematic end-effector repositioning
- Solve second stage via local, contact-rich MPC (C3)
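The two-stage split can be sketched as an outer loop over sampled end-effector positions, each scored by a repositioning cost plus the cost reported by a local contact-rich MPC started from that position. Everything below is a hypothetical illustration: `toy_local_mpc_cost` is a made-up stand-in for the local MPC solve, and straight-line distance is an assumed proxy for the kinematic repositioning cost.

```python
import math

def choose_contact_location(current_ee, candidates, local_mpc_cost, travel_weight=1.0):
    """Outer, sampling-based level: score each candidate end-effector position.

    Score = kinematic repositioning cost (straight-line distance, an assumed proxy)
            + cost reported by a local contact-rich MPC started from that position.
    """
    def total_cost(candidate):
        reposition = travel_weight * math.dist(current_ee, candidate)
        return reposition + local_mpc_cost(candidate)
    return min(candidates, key=total_cost)

# Hypothetical stand-in for the local MPC: pushing toward +x works best from behind the object.
def toy_local_mpc_cost(ee_position, object_position=(0.0, 0.0), goal=(1.0, 0.0)):
    ideal = (object_position[0] - 0.1, object_position[1])  # just behind the object
    return 5.0 * math.dist(ee_position, ideal) + math.dist(object_position, goal)

samples = [(-0.1, 0.0), (0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
best = choose_contact_location(
    current_ee=(0.1, 0.05), candidates=samples, local_mpc_cost=toy_local_mpc_cost
)
print(best)
```

In this toy the sample behind the object wins even though it is the farthest to reach, which is the kind of discrete decision a purely local, gradient-based controller would not make on its own.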


Reliable and precise real-time control that repeatedly achieves arbitrary pose targets given only a 3D object model
40x
[Bui*, Gao*, Yang*, et al. Push Anything: Single- and Multi-Object Pushing From First Sight with Contact-Implicit MPC. Under review.]
Hien Bui

Yufeiyang Gao

Haoran Yang

Simultaneously plan 19 possible frictional contacts, \(5^{19} \approx 19\) trillion modes
- 12 object-ground
- 6 object-object
- 1 robot-object
- 7-step, ~0.5s horizon
- 3 ADMM iterations
- 9 Hz control rate
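The counts above can be checked with a few lines of arithmetic. The variable names are illustrative; the figure of 5 modes per contact is taken from the slide as given, and the per-step duration is only approximate because the horizon is quoted as roughly 0.5 s.

```python
contacts = {"object-ground": 12, "object-object": 6, "robot-object": 1}
num_contacts = sum(contacts.values())           # 19
modes_per_contact = 5                           # as stated on the slide
num_modes = modes_per_contact ** num_contacts   # 19_073_486_328_125, about 19 trillion

horizon_steps, horizon_seconds = 7, 0.5
step_seconds = horizon_seconds / horizon_steps  # about 0.071 s per step

print(f"{num_contacts} contacts, {num_modes:.2e} modes, {step_seconds * 1e3:.0f} ms per step")
```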

Push Anything
Planned forces: object-ground, object-object, end effector-object
1x
The controller plans to pivot the Letter S using contact with the book.
- 25 objects
- 700/701 success
- Avg. time to goal: 31 sec
10x
2 objects at a time
- 100/102 success
- Avg. time to goal: 1.6 min
15x
3 objects at a time
20x
4 objects at a time



Closing thoughts
Towards pretty good, quickly
- Contact-rich control has come a long way in the last few years
  - Fast, approximate reasoning
  - Models can be imperfect/adapted online
  - Multi-fingered demos coming soon
- Online reasoning shouldn't be tabula rasa.
  - Vision and language
  - Physics
  - Planning and control architectures
  - Pre-trained policies, value functions, etc.
Challenge: Seek out challenges where you have unique insight to contribute
Utah Seminar 2025
By Michael Posa