Tractable Adaptability

Ethan K. Gordon

Postdoc, University of Pennsylvania

PhD 2023, University of Washington

Active Learning for Contact-Rich Assistive Manipulation

The Promise of Physical HRI

“For a long time, I would only let my mom feed me. I wondered, why am I so uncomfortable with others feeding me that I’ll just not eat? I realized that eating is so individualized, with so many intricacies. If I can have a robot do it, I can learn to adapt to it, but it would be me feeding me, and that would be huge.”

Tyler Schrenk

1985-2023

What is needed for pHRI?

Contact-Rich Manipulation

  • Sliding to clean the spoon and bowl
  • Shaking to smoothen
  • In-Mouth Hand-Off (vision-denied)

Online Adaptation

  • Bite Size Adjustment

What is needed for pHRI?

Online Adaptation

  • Totally Different Food
  • Multi-bite: different shapes for each bite

 

There is no time for re-training!

Tractable Adaptability

How can robots efficiently learn, during deployment, how to manipulate previously-unseen objects?

The Technology/Application Cycle

Support

Inform

(Gordon, Under Review)

Active Learning in Contact

Assistive Robotics

(Gordon, CoRL 2023)
(Feng, ISRR 2019)
(Gordon, IROS 2020)
(Gordon, ICRA 2021)
(Nanavati, HRI 2025)
(Bhattacharjee, HRI 2020)
(Nanavati, HRI 2024)
(Gordon, HRI Companion 2024)

The Technology/Application Cycle

Support

Inform

Active Learning in Contact

Assistive Robotics

Policy Space Simplification

Leveraging Haptics

(Nanavati, HRI 2025)
(Bhattacharjee, HRI 2020)
(Nanavati, HRI 2024)
(Gordon, HRI Companion 2024)

Problem: Bite Acquisition

Online Bite Acquisition Challenges

  • Large Action Space (whole trajectory) OR Sparse Reward
  • Unknown Dynamics / State Transition: food simulation is hard!

Solution: Data-Driven Policy-Space Simplification

(Gordon, CoRL 2023)

Learn This Online!

Data-Driven Discretization: Emergent Behavior

Wiggling

Tilting

High Pressure

Scooping

Online Bite Acquisition Challenges

  • High-Dimension State Space: Foods are really diverse!

Solution: Haptic Policy Regularization

(Bhattacharjee, RA-L 2019); (Gordon, IROS 2020); (Gordon, ICRA 2021)

Optimize Jointly

The Technology/Application Cycle

Support

Inform

(Nanavati, HRI 2025)

Active Learning in Contact

Assistive Robotics

(Bhattacharjee, HRI 2020)
(Nanavati, HRI 2024)
(Gordon, HRI Companion 2024)

Modeling Information Gain

Tactile System Identification

Dynamic Object System Identification

Choose:

  • Robot Trajectory \(r[t]\)

Measure:

  • Contact Boolean \(c_t\)

  • Contact Normal \(\hat{n}_t\)

  • Proprioception

Find:

  • Object Geometry \(\theta^*\)

  • Object Pose \(x^*_T\)

Exploration with Expected Information Gain (EIG)

Learn; Compute

Observed Info \(\mathcal{I}\)

Sample + Simulate

Expected Fisher Info \(\mathcal{F}\)

\(\max_{r[t]} \mathrm{EIG}, \qquad \mathrm{EIG} := \log\det\left(\mathcal{F}\mathcal{I}^{-1} + \mathbf{I}\right)\)

Choose actions whose simulated, expected Fisher info is large in directions where the observed info is still small.
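A minimal sketch of how the EIG score above could be evaluated for a handful of candidate actions. It assumes the observed information matrix is positive definite and that one expected Fisher matrix per candidate has already been obtained by sampling and simulating; the function names and the toy matrices are hypothetical and are not taken from the cited work.

```python
import numpy as np

def expected_information_gain(fisher_expected, info_observed):
    """Return log det(F I^{-1} + Id) for expected Fisher info F and observed info I."""
    dim = info_observed.shape[0]
    # det(F I^{-1} + Id) equals det(I^{-1} F + Id), so a linear solve
    # replaces the explicit inverse of the observed information.
    ratio = np.linalg.solve(info_observed, fisher_expected)
    _, logdet = np.linalg.slogdet(ratio + np.eye(dim))
    return logdet

def choose_action(candidate_fishers, info_observed):
    """Score every candidate and return the index of the highest EIG."""
    scores = [expected_information_gain(F, info_observed) for F in candidate_fishers]
    return int(np.argmax(scores)), scores

# Toy case: parameter 0 is already well observed, parameter 1 is not.
info_observed = np.diag([10.0, 0.1])
candidate_fishers = [
    np.diag([1.0, 0.0]),  # hypothetical action informing parameter 0
    np.diag([0.0, 1.0]),  # hypothetical action informing parameter 1
]
best, scores = choose_action(candidate_fishers, info_observed)
```

In this toy case the second candidate scores higher because it informs the parameter that the data so far constrain the least.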

Learning with a Violation-Implicit Loss

Information Maximization In Action

The Technology/Application Cycle

Support

Inform

Active Learning in Contact

Assistive Robotics

(Gordon, Under Review)
(Gordon, CoRL 2023)
(Feng, ISRR 2019)
(Gordon, IROS 2020)
(Gordon, ICRA 2021)

User-Informed Metrics

Community-Based Design

User Studies Capture Metrics 

(Bhattacharjee, HRI 2020)

Trade-off between autonomy (with chance of error) and high-effort manual control.

What errors are tolerable?

User Studies Capture Diversity

Community-Based Participatory Design

(Gordon, HRI Companion 2024)

Community-Based Participatory Design

(Nanavati, HRI 2025)

The Technology/Application Cycle

Support

Inform

Active Learning in Contact

Assistive Robotics

Thank you!

DAIR Lab

Amal Nanavati

Structure Through Expert-Defined Heuristics

  • Qualitative Taxonomy of Single-Utensil Bite Acquisition
  • Convert to Action Schema:
    • \(\mathbb{R}^{14} \times SO(3) \times S^2\)
    • Force and Torque Thresholds

Benefits: Interpretable, Continuity (Similar Numbers \(\rightarrow\) Similar Action)

(Gordon, CoRL 2023); (Bhattacharjee, RA-L 2019)

Data-Driven Discretization

(Gordon, CoRL 2023)

Exploration vs. Exploitation: Contextual Bandits

(Gordon, IROS 2020); SPANet from (Feng, ISRR 2019)
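To make the exploration/exploitation trade-off concrete, here is a sketch of LinUCB, a standard contextual-bandit algorithm picked purely for illustration; the slides do not state which algorithm is used. The class, the two-context toy problem, and the reward function are assumptions: contexts stand in for food-item features and actions for acquisition strategies.

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB: one ridge-regression reward model per action."""

    def __init__(self, n_actions, n_features, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # weight on the exploration bonus
        self.A = [ridge * np.eye(n_features) for _ in range(n_actions)]
        self.b = [np.zeros(n_features) for _ in range(n_actions)]

    def select(self, context):
        """Pick the action with the highest upper confidence bound."""
        ucb = []
        for A, b in zip(self.A, self.b):
            theta = np.linalg.solve(A, b)  # ridge estimate of the reward weights
            bonus = self.alpha * np.sqrt(context @ np.linalg.solve(A, context))
            ucb.append(theta @ context + bonus)
        return int(np.argmax(ucb))

    def update(self, action, context, reward):
        """Fold one observed (context, action, reward) triple into the model."""
        self.A[action] += np.outer(context, context)
        self.b[action] += reward * context

# Hypothetical toy problem: action k succeeds only on context k.
contexts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

def reward(action, context_index):
    return 1.0 if action == context_index else 0.0

bandit = LinUCB(n_actions=2, n_features=2)
successes = 0.0
for step in range(50):
    k = step % 2
    action = bandit.select(contexts[k])
    r = reward(action, k)
    bandit.update(action, contexts[k], r)
    successes += r
```

Each failed attempt shrinks the bonus for that action in that context, so the learner shifts to the alternative without any offline re-training.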

Incorporating Haptic Information

(Gordon, ICRA 2021); (Bhattacharjee, RA-L 2019)

Classification with 50 ms of 6-DOF F/T Data

\(l_t = c_t^T\theta^* + \epsilon_\theta = p_t^T\phi^* + \epsilon_\phi\)

Optimize both simultaneously, regularizing them against each other.
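One way to read "regularizing them against each other" is a least-squares fit of both linear models with an added penalty on disagreement between their predictions. The sketch below assumes that objective, namely \(\sum_t (c_t^T\theta - l_t)^2 + (p_t^T\phi - l_t)^2 + \lambda (c_t^T\theta - p_t^T\phi)^2\) plus a small ridge term; the objective, the function name, and the synthetic data are illustrative assumptions, not the formulation from the cited papers.

```python
import numpy as np

def fit_jointly(C, P, l, coupling=1.0, ridge=1e-6):
    """Fit theta and phi together, penalizing disagreement between the two models."""
    n, dc = C.shape
    dp = P.shape[1]
    # Stack every squared term as rows of one least-squares problem in w = [theta, phi].
    rows = np.vstack([
        np.hstack([C, np.zeros((n, dp))]),       # c_t^T theta should match l_t
        np.hstack([np.zeros((n, dc)), P]),       # p_t^T phi should match l_t
        np.sqrt(coupling) * np.hstack([C, -P]),  # the two predictions should agree
        np.sqrt(ridge) * np.eye(dc + dp),        # small ridge keeps the system well posed
    ])
    targets = np.concatenate([l, l, np.zeros(n), np.zeros(dc + dp)])
    w, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return w[:dc], w[dc:]

# Synthetic, noise-free data in which both models explain the same losses.
rng = np.random.default_rng(0)
P = rng.normal(size=(40, 2))
mixing = np.array([[1.0, 0.5], [0.0, 1.0]])
C = P @ mixing
phi_true = np.array([0.8, -0.3])
theta_true = np.linalg.solve(mixing, phi_true)  # guarantees C @ theta_true == P @ phi_true
l = P @ phi_true

theta_hat, phi_hat = fit_jointly(C, P, l)
```

Raising the hypothetical `coupling` weight forces the two predictors toward agreement, which is how one feature set can regularize the other.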

Exploration vs. Exploitation: Contextual Bandits

(Gordon, ICRA 2021)

Active Learning for pHRI: Spinning the Flywheel

User-Informed:

Metrics

Priorities

Limitations

Contact-Rich Active Learning:

Model-Based

Policy Simplification

Multimodal Sensing

Community-Based:

System Design

Implementation

Pain Point Identification