Active Tactile Pose and Shape Estimation of Highly Dynamic Objects

Ethan K. Gordon, Bruke Baraki, Hien Bui, Michael Posa

Supported By: NSF CAREER (FRR-2238480) and the RAI Institute

ethan@ethankgordon.com

COMPUTING INFORMATION THROUGH DYNAMICS

  • Learning Difficulty: gradients of dynamics \(f\) are near-0 or near-\(\infty\), which is not amenable to learning.
  • Solution: add inner optimization over contact forces \(\lambda\). Trade-off: solving a (fast) QP each gradient step for better gradients.

\(-\log p(m_t \mid \theta, x_{<T}) = \sum_t -\log p(m_t \mid \theta, x_t) + \|x_t - f_\theta(x_{t-1}, \lambda)\|^2\)

where \(\lambda = \arg\min_\lambda g_\theta(x, \lambda)\)

\(\mathcal{L} = \sum_t \min_\lambda \left[-\log p(m_t \mid \theta, x_t) + \|x_t - f_\theta(x_{t-1}, \lambda)\|^2 + g_\theta(x, \lambda)\right]\)

Physics Constrained MLE with Trajectory Optimization

Violation-Implicit (VIMP) Loss
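To illustrate why moving the minimization over \(\lambda\) inside the loss helps, here is a minimal toy sketch, not the poster's implementation. It assumes a 1D point above a floor at unknown height `theta`, a single normal force, and a hand-picked violation term (complementarity plus penetration). All names (`inner_force`, `toy_vimp_loss`) and the specific penalty form are hypothetical. With one contact the inner QP has a closed-form solution.

```python
def inner_force(residual, gain, comp):
    """Minimize (residual - gain*lam)**2 + comp*lam over lam >= 0.

    This is a 1D QP, so the minimizer is the unconstrained optimum
    clipped at zero.
    """
    lam = (2.0 * gain * residual - comp) / (2.0 * gain ** 2)
    return max(0.0, lam)


def toy_vimp_loss(theta, z, v_prev, v_next, dt=0.01, mass=1.0, grav=9.81):
    """Toy violation-implicit loss for one time step (assumed form).

    theta: candidate floor height (the parameter being learned).
    z, v_prev, v_next: observed height and velocities of the point.
    """
    gap = z - theta                      # signed distance to the floor
    comp = max(gap, 0.0)                 # force at a distance is penalized
    penetration = min(gap, 0.0) ** 2     # being below the floor is penalized
    # Velocity change not explained by gravity alone.
    residual = v_next - (v_prev - grav * dt)
    gain = dt / mass                     # velocity change per unit force
    lam = inner_force(residual, gain, comp)
    return (residual - gain * lam) ** 2 + comp * lam + penetration
```

For a point resting at `z = 0`, the loss is near zero at `theta = 0` and grows smoothly as `theta` moves away, whereas a simulated rollout would jump discontinuously between "in contact" and "free fall".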

  • Observed Info: empirical variance of log-likelihood grad

\(\mathcal{I} = \sum_{m_t} \left(\nabla_{\theta, x_T}\log p(m_t|\theta, x_T)\right)^2\)

  • Fisher Info: expected variance given future measurements

\(\mathcal{F} = \mathbb{E}_{m_t}\left[\mathcal{I}(m_t, \theta, x_{H>T})\right]\)

  • Sample future actions, Simulate to estimate \(\mathcal{F}\), then maximize Expected Information Gain (EIG)

\(EIG := \log\det\left(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}\right)\)

\(\mathcal{I}\) is taken w.r.t. the current time \(T\); \(\mathcal{F}\) is taken w.r.t. a future time \(H > T\).
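The EIG expression can be evaluated numerically as in the sketch below. It assumes the squared gradient in \(\mathcal{I}\) denotes an outer product, and it adds a small ridge term so that \(\mathcal{I}\) is invertible; both are assumptions rather than details stated on the poster, and the function names are hypothetical.

```python
import numpy as np


def information_matrix(grads):
    """Sum of outer products of per-measurement score vectors.

    grads: array of shape (num_measurements, dim), one gradient of the
    log-likelihood per row.
    """
    grads = np.atleast_2d(np.asarray(grads, dtype=float))
    return grads.T @ grads


def expected_information_gain(fisher, observed, ridge=1e-6):
    """Compute log det(F I^{-1} + Id) without forming an explicit inverse."""
    fisher = np.asarray(fisher, dtype=float)
    observed = np.asarray(observed, dtype=float)
    dim = observed.shape[0]
    regularized = observed + ridge * np.eye(dim)  # assumed regularization
    # det(F I^{-1} + Id) equals det(I^{-1} F + Id) by Sylvester's identity,
    # so a linear solve can replace the inverse.
    ratio = np.linalg.solve(regularized, fisher)
    _, logdet = np.linalg.slogdet(ratio + np.eye(dim))
    return logdet
```

A candidate action whose simulated Fisher information is zero yields an EIG of zero, while directions in which \(\mathcal{F}\) is large relative to \(\mathcal{I}\) dominate the score.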

Challenge: How to compute \(\nabla \log p(m_t | x_T)\), i.e. sensitivity of past measurements to the future state?

Computing \(g=\nabla \log p(m_t | x_T)\)

Rejected: Backwards Simulation

\(g = \nabla \log p(m_t|x_t(x_T))\)

Ill-defined for frictional contact.

Baseline 1: Identity Jacobian

\(\nabla_{x_T}x_t = \nabla_{x_t}x_T = \mathbb{I}\)

Treats object as quasi-static

Baseline 2: Diffsim

Compute \(g=\nabla \log p (m_t|x_0)\) instead.

Poorly conditioned numerically

Proposed: Marginalize + Sample

\(g \approx \nabla \log \sum_{x_t}p(m_t|x_t)p(x_t|x_T)\)

\(\approx \mathrm{softmax}_{x_t}(\log p(m_t|x_t)) \cdot \nabla\mathcal{L}\)

Sample \(x_t\) w/ MCMC, use VIMP loss \(\mathcal{L}\)
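The softmax weighting follows from differentiating a log of a sum: each sample's gradient is weighted by its normalized likelihood. The sketch below shows only that weighting step, assuming the samples \(x_t\) and their per-sample gradients are already available (for example from an MCMC sampler, which is not shown). The function name and array layout are hypothetical.

```python
import numpy as np


def softmax_weighted_gradient(log_liks, sample_grads):
    """Combine per-sample gradients with softmax(log-likelihood) weights.

    log_liks: shape (num_samples,), log p(m_t | x_t) for each sampled x_t.
    sample_grads: shape (num_samples, dim), one gradient per sample.
    """
    log_liks = np.asarray(log_liks, dtype=float)
    sample_grads = np.asarray(sample_grads, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(log_liks - log_liks.max())
    weights /= weights.sum()
    return weights @ sample_grads
```

Samples that explain the measurement poorly receive nearly zero weight, so the estimate is dominated by trajectories consistent with the observed contact.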

EXPERIMENT SETUP

Hardware Setup

  • Modified Trifinger Robot
  • Densetact 2.0 (Do, 2023)
  • Contact Boolean \(c_t\) and Normal \(\hat{n}_t\) computed with Optical Flow and a Helmholtz-Hodge Decomposition
  • FoundationPose for Ground-Truth only

 

Experiment Design

  • Action: 1s motion towards estimated object centroid
  • Random Baseline: Randomize approach direction
  • EIG (Ours): Choose the approach direction that maximizes EIG, using sampling-based optimization with the Gaussian Cross-Entropy Method
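A Gaussian cross-entropy method of the kind named above can be sketched as follows. This is a generic version under assumed settings: the population size, elite count, iteration count, and the `objective` callable (standing in for a simulated EIG evaluation) are all hypothetical, not values from the experiments.

```python
import numpy as np


def cross_entropy_maximize(objective, mean, std, iters=20, pop=64, elite=8, seed=0):
    """Maximize objective(x) by repeatedly refitting a diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    for _ in range(iters):
        # Sample candidate actions from the current Gaussian.
        samples = rng.normal(mean, std, size=(pop, mean.size))
        scores = np.array([objective(s) for s in samples])
        # Keep the highest-scoring candidates and refit the Gaussian to them.
        elites = samples[np.argsort(scores)[-elite:]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6  # floor keeps sampling non-degenerate
    return mean


# Usage with a stand-in objective that peaks at an approach angle of 1.0 rad.
best_angle = cross_entropy_maximize(lambda a: -(a[0] - 1.0) ** 2, [0.0], [2.0])
```

In the active setting, each call to `objective` would simulate the candidate approach and score it by EIG, so the number of evaluations per iteration is the main cost.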

 

Project Website

BASELINE RESULTS AND CONCLUSION

  • Simulated results for cuboid and polytope parameterizations.
  • Real robot results on cuboid
  • Example learning curves. Observed information increases with more data.
  • Bidirectional Chamfer Distance (bCH) Evaluation Metric
  • Identity Jacobian EIG showed significant improvement over object-directed actions from random approach directions.
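For reference, one common form of the bidirectional Chamfer distance between two point sets is sketched below. The exact normalization used for the reported bCH values is not stated here, so the choice of unsquared distances and per-direction means is an assumption, as is the function name.

```python
import numpy as np


def bidirectional_chamfer(points_a, points_b):
    """Mean nearest-neighbor distance from A to B plus the same from B to A.

    points_a: shape (n, d); points_b: shape (m, d).
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return dists.min(axis=1).mean() + dists.min(axis=0).mean()
```

Summing both directions penalizes an estimated shape that misses part of the true surface as well as one that extends beyond it.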

 

Next Steps: expand experiment to proposed (marginalize + sample) EIG formulation.

 

INTRODUCTION

  • Learning a physical model online can improve data efficiency, predictability, and reuse between tasks.
  • Previous Tactile System Identification Work: Static Objects and/or Strong Shape Priors and/or 2D
  • Learning: apply ContactNets-style violation-implicit learning (Pfrommer, CoRL 2020) to pose estimation, avoiding the numerical stiffness inherent in rigid-body contact dynamics.
  • We learn cuboids and convex polyhedra from less than 10 s of randomly collected data.
  • Exploration: maximizing Expected Info Gain (EIG) leads to significantly faster learning.

 

Challenge: Use only tactile data to find the pose and geometry of an arbitrary dynamic convex object.

Fast sampling with learned trajectory


METHOD OVERVIEW

  • Learning Difficulty: gradients of dynamics \(f\) are near-0 or near-\(\infty\), which is not amenable to learning.
  • Solution: add inner optimization over contact forces \(\lambda\). Trade-off: solving a (fast) QP each gradient step for better gradients.

\(-\log p(m_t \mid \theta, x_{<T}) = \sum_t -\log p(m_t \mid \theta, x_t) + \|x_t - f_\theta(x_{t-1}, \lambda)\|^2\) s.t. \(\lambda = \arg\min_\lambda g_\theta(x, \lambda)\)

\(\mathcal{L} = \sum_t \min_\lambda \left[-\log p(m_t \mid \theta, x_t) + \|x_t - f_\theta(x_{t-1}, \lambda)\|^2 + g_\theta(x, \lambda)\right]\)

Physics Constrained MLE with Trajectory Optimization

Violation-Implicit (VIMP) Loss

  • Observed Info: empirical variance of log-likelihood grad

\(\mathcal{I} = \sum_{m_t} \left(\nabla_{\theta, x_T}\log p(m_t|\theta, x_T)\right)^2\)

  • Fisher Info: expected variance given future measurements

\(\mathcal{F} = \mathbb{E}_{m_t}\left[\mathcal{I}(m_t, \theta, x_{H>T})\right]\)

  • Sample future actions, Simulate to estimate \(\mathcal{F}\), then maximize Expected Information Gain (EIG)

\(EIG := \log\det\left(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}\right)\)

\(\mathcal{I}\) is taken w.r.t. the current time \(T\); \(\mathcal{F}\) is taken w.r.t. a future time \(H > T\).

Challenge: How to compute \(g=\nabla \log p(m_t | x_T)\), i.e. sensitivity of past measurements to the future state?

Rejected: Backwards Simulation

\(g=\nabla \log p(m_t|x_t(x_T))\)

Ill-defined for frictional contact.

Baseline 1: Identity Jacobian \(\nabla_{x_T}x_t = \nabla_{x_t}x_T = \mathbb{I}\)

Treats object as quasi-static

Fails for the example at left: \(\mathcal{F}\) is nonzero for only one face.

Baseline 2: Differentiable Simulation

Compute \(g=\nabla \log p (m_t|x_0)\) instead.

Poorly conditioned numerically

Same as learning difficulty (above).

Proposed: Marginalize + Sample

\(g \approx \nabla \log \sum_{x_t}p(m_t|x_t)p(x_t|x_T)\)

\(\approx \mathrm{softmax}_{x_t}(\log p(m_t|x_t)) \cdot \nabla\mathcal{L}\)

Sample \(x_t\) w/ MCMC, use VIMP loss \(\mathcal{L}\)

COMPUTING INFORMATION THROUGH DYNAMICS

  • Trifinger with Densetact 2.0 (Do, 2023)
  • Contact Boolean \(c_t\) and Normal \(\hat{n}_t\) computed with Optical Flow and a Helmholtz-Hodge Decomposition
  • FoundationPose for Ground-Truth only

 

Project Website

BASELINE EXPERIMENT AND CONCLUSION

  • (Left): Simulated (top) and real (bottom) results for cuboid and polytope parameterizations.
  • Identity Jacobian EIG showed significant improvement over object-directed actions from random approach directions.

 

Next Steps: expand experiment to proposed (marginalize + sample) EIG formulation.

 


Example (right): We have touched only one face, but the dynamics provide information about three faces.

[Figure: graphical models with node labels \(m_1, x_1, \dots, m_T, x_T\) and \(m_1, x_1, m_2, x_2\)]