Active Tactile Exploration for Rigid Body Pose and Shape Estimation

Ethan K. Gordon, Bruke Baraki, Hien Bui, Michael Posa

Supported By: NSF CAREER (FRR-2238480) and the RAI Institute

ethan@ethankgordon.com

INTRODUCTION

Only tactile data is used to find the pose and geometry of an arbitrary dynamic convex object.

Project Website:

dairlab.github.io/activetactile

COMPUTING INFORMATION THROUGH DYNAMICS

RESULTS AND CONCLUSION

  • Learning a physical model online can improve data efficiency, predictability, and reuse between tasks.
  • Previous Tactile System Identification Work: limited to static objects, strong shape priors, and/or 2D.
  • Learning: apply ContactNets-style violation-implicit learning (Pfrommer, CoRL 2020) to pose estimation, avoiding the numerical stiffness inherent in rigid-body contact dynamics.
  • We learn cuboids and convex polyhedra with less than 10 s of randomly collected data.
  • Exploration: maximizing Expected Info Gain (EIG) leads to significantly faster learning.

 

Challenge: Use only tactile data to find the pose and geometry of an arbitrary dynamic convex object.

EXPERIMENT SETUP

  • Modified TriFinger Robot
  • DenseTact 2.0 (Do, 2023)
  • Contact Boolean \(c_t\) and Normal \(\hat{n}_t\) are computed with Optical Flow and a Helmholtz-Hodge Decomposition
  • FoundationPose for Ground-Truth only

 

  • Each action: 1s motion towards estimated object state
  • Random Baseline: Randomize approach direction
  • EIG (Ours): Choose the approach direction that maximizes EIG, using sampling-based optimization with the Gaussian Cross-Entropy Method
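
A minimal sketch of a Gaussian cross-entropy method of the kind named above, written in plain Python. The function `cross_entropy_maximize`, its hyperparameters (`iters`, `pop`, `n_elite`), and the quadratic `toy_score` are illustrative assumptions; in particular, `toy_score` is only a stand-in for an EIG estimate and is not the objective used on the robot.

```python
import math
import random


def cross_entropy_maximize(score, dim, iters=20, pop=64, n_elite=8, seed=0):
    """Maximize score(x) over R^dim with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        # Sample a population of candidates from the current Gaussian.
        samples = [[rng.gauss(mean[d], std[d]) for d in range(dim)]
                   for _ in range(pop)]
        # Keep the highest-scoring candidates (the "elites").
        elites = sorted(samples, key=score, reverse=True)[:n_elite]
        # Refit the Gaussian mean and standard deviation to the elites.
        for d in range(dim):
            vals = [e[d] for e in elites]
            mean[d] = sum(vals) / n_elite
            var = sum((v - mean[d]) ** 2 for v in vals) / n_elite
            std[d] = math.sqrt(var) + 1e-6  # small floor avoids a zero-width Gaussian
    return mean


def toy_score(x):
    # Hypothetical stand-in for EIG, peaked at the approach parameters (0.5, -0.25).
    return -((x[0] - 0.5) ** 2 + (x[1] + 0.25) ** 2)


best = cross_entropy_maximize(toy_score, dim=2)
print(best)
```

Each iteration only needs score evaluations, not gradients, which is convenient when the score comes from simulating candidate actions.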

 

  • Bidirectional Chamfer Distance (bCH) Evaluation Metric
  • Simulated results for cuboid and polytope parameterizations.
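
For reference, one common way to compute a bidirectional Chamfer distance between two point sets is sketched below. The poster does not state its exact convention (mean vs. sum, squared vs. unsquared distances, which points are sampled from the geometry), so the choices here are assumptions, and the function names are hypothetical.

```python
import math


def one_sided_chamfer(src, dst):
    """Mean distance from each point in src to its nearest neighbour in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def bidirectional_chamfer(a, b):
    """Sum of the two one-sided terms, so the result is symmetric in a and b."""
    return one_sided_chamfer(a, b) + one_sided_chamfer(b, a)


# Toy example: the corners of a unit square and a copy shifted by 0.1 along x.
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(x + 0.1, y) for x, y in unit_square]
bch = bidirectional_chamfer(unit_square, shifted)
print(bch)
```

In this toy case every nearest neighbour is 0.1 away in both directions, so the two one-sided terms are equal.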

 

  • (Above) Example learning curves. Observed information increases with more data.
  • (Below) Real robot results

 

Conclusion

  • Identity Jacobian EIG showed significant improvement over object-directed actions from random approach directions.
  • Next Steps: extend the experiments to the marginalize + sample EIG formulation (future work).

 

  • Difficulty: gradients of dynamics \(f\) are near-0 or near-\(\infty\), which is not amenable to learning.
  • Solution: add inner optimization over contact forces \(\lambda\). Trade-off: solving a (fast) QP each gradient step for better gradients.

Physics Constrained MLE with Trajectory Optimization:

\(\mathcal{L} = -\log p(m_t | \theta, x_{0\ldots T}) = \sum_t -\log p(m_t | \theta, x_t) + \|x_t - f_\theta(x_{t-1}, \lambda)\|^2\)

Where \(\lambda = \arg\min_\lambda g_\theta(x, \lambda)\)

Violation-Implicit (VIMP) Loss:

\(\mathcal{L} = \sum_t \min_\lambda \left[-\log p(m_t | \theta, x_t) + \|x_t - f_\theta(x_{t-1})\|^2 + g_\theta(x, \lambda)\right]\)
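
To illustrate why an inner minimization over \(\lambda\) can give better-behaved gradients, the sketch below works through a one-dimensional toy problem. Everything in it is an assumption made for illustration and not the ContactNets formulation: a point moves vertically, the unknown parameter \(\theta\) is the ground height, the contact impulse \(\lambda \ge 0\) adds directly to the position update, the penalty is \(g_\theta = w\,\lambda\,(x_t - \theta)^2\), and the measurement likelihood term is omitted. With these choices the inner problem is a scalar QP with a closed-form solution.

```python
def vimp_toy_loss(theta, x_prev, x_next, free_step, weight=10.0):
    """Return (loss, lam) for a toy 1-D inner minimization over lam >= 0.

    Minimizes (residual - lam)**2 + cost * lam, where cost penalizes applying
    contact force while the hypothesised ground at theta is far from x_next.
    """
    residual = x_next - (x_prev + free_step)  # motion unexplained without contact
    gap = x_next - theta                      # signed distance to hypothesised ground
    cost = weight * gap ** 2                  # price per unit of contact force
    lam = max(0.0, residual - 0.5 * cost)     # closed-form minimizer over lam >= 0
    return (residual - lam) ** 2 + cost * lam, lam


# The point would have fallen from 0.2 to -0.1, but was observed to stop at 0.0.
loss_true, lam_true = vimp_toy_loss(0.0, x_prev=0.2, x_next=0.0, free_step=-0.3)
loss_near, _ = vimp_toy_loss(-0.05, x_prev=0.2, x_next=0.0, free_step=-0.3)
loss_far, _ = vimp_toy_loss(-0.2, x_prev=0.2, x_next=0.0, free_step=-0.3)
print(loss_true, loss_near, loss_far)
```

In this toy setting the loss grows gradually as the hypothesised ground height moves away from the observed stopping point, instead of switching abruptly between contact and no contact.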

  • Observed Info: empirical variance of log-likelihood grad

\(\mathcal{I} = \sum_{m_t} \left(\nabla_{\theta, x_T}\log p(m_t|\theta, x_T)\right)^2\)

  • Fisher Info: expected variance given future measurements

\(\mathcal{F} = \mathbb{E}_{m_t}\left[\mathcal{I}(m_t, \theta, x_{H>T})\right]\)

  • Sample future actions, Simulate to estimate \(\mathcal{F}\), then maximize Expected Information Gain (EIG)

\(\mathrm{EIG} := \log\det\left(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}\right)\)

\(\mathcal{I}\) is taken w.r.t. the current time \(T\); \(\mathcal{F}\) is taken w.r.t. a future time \(H > T\).
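
The EIG expression can be evaluated without an explicit matrix inverse: when \(\mathcal{I}\) is invertible, \(\det(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}) = \det(\mathcal{F} + \mathcal{I}) / \det(\mathcal{I})\). The sketch below uses that identity in plain Python. It assumes the squared gradient in \(\mathcal{I}\) denotes an outer product, and the gradient vectors and helper names are hypothetical values chosen only to show the behaviour.

```python
import math


def outer_sum(grads):
    """Sum of outer products g g^T over a list of gradient vectors."""
    n = len(grads[0])
    return [[sum(g[i] * g[j] for g in grads) for j in range(n)] for i in range(n)]


def logdet_spd(mat):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(mat)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def expected_info_gain(fisher, observed):
    """log det(F I^{-1} + Id), computed as log det(F + I) - log det(I)."""
    total = [[f + o for f, o in zip(f_row, o_row)]
             for f_row, o_row in zip(fisher, observed)]
    return logdet_spd(total) - logdet_spd(observed)


# Hypothetical past gradients: parameter 0 is well observed, parameter 1 is not.
observed = outer_sum([[1.0, 0.0], [1.0, 0.1], [0.0, 0.2]])
probe_known = outer_sum([[1.0, 0.0]])    # candidate action probing parameter 0
probe_unknown = outer_sum([[0.0, 1.0]])  # candidate action probing parameter 1
eig_known = expected_info_gain(probe_known, observed)
eig_unknown = expected_info_gain(probe_unknown, observed)
print(eig_known, eig_unknown)
```

With these numbers the candidate that probes the poorly observed parameter scores higher, which is the behaviour the EIG criterion is meant to reward.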

Challenge: Compute \(\nabla \log p(m_t | x_T)\), i.e. sensitivity of past measurements to future states

Rejected: Backwards Simulation

\(\ldots =\nabla \log p(m_t|x_t(x_T))\)

Ill-defined for frictional contact.

Ours: Identity Jacobian

\(\nabla_{x_T}x_t = \nabla_{x_t}x_T = \mathbf{I}\)

Note: Treats object as quasi-static

Future Work: Marginalize + Sample

\(\ldots \approx \nabla \log \sum_{x_t}p(m_t|x_t)p(x_t|x_T)\)

Sample \(x_t\) with MCMC

Use VIMP loss \(\mathcal{L}\)

\(\approx \mathrm{softmax}_{x_t}(\log p(m_t|x_t)) \cdot \nabla\mathcal{L}\)

Fast sampling with learned trajectory