Amazon CoRo June Update

Jun 19, 2025

Adam Wei

Agenda

  1. Algorithm Overview + Implementations
  2. Sim + Real Experiments
  3. GCS + RRT Experiments
  4. Next Directions

Part 1

Ambient Diffusion Recap

and Implementations

Ambient Diffusion

Learning from "clean" (\(p_0\)) and "corrupt" data (\(q_0\))

Learning in the high-noise regime

\(\exists \sigma_{min}\) s.t. \(d_\mathrm{TV}(p_{\sigma_{min}}, q_{\sigma_{min}}) < \epsilon\)

  • Only use samples from \(q_0\) to train denoisers with \(\sigma > \sigma_{min}\)
  • Corruption level of each datapoint determines its utility
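The gating rule above can be sketched in a few lines. This is a minimal illustration, not the actual training code; the function name `usable_sigmas` and the clean/corrupt labeling are hypothetical.

```python
import numpy as np

def usable_sigmas(source, sigmas, sigma_min):
    """Return the noise levels at which a datapoint may supervise the denoiser.

    Clean data (p_0) trains denoisers at every noise level; corrupt data (q_0)
    only trains denoisers with sigma > sigma_min, where the noised clean and
    corrupt distributions are within epsilon in TV distance.
    """
    sigmas = np.asarray(sigmas)
    if source == "clean":
        return sigmas
    return sigmas[sigmas > sigma_min]  # corrupt data is gated to high noise
```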

[Figure: noise axis with \(\sigma=0\), \(\sigma=\sigma_{min}\), and \(\sigma=1\) marked; clean vs. corrupt data]

Learning in the high-noise regime

\(\exists \sigma_{min}\) s.t. \(d_\mathrm{TV}(p_{\sigma_{min}}, q_{\sigma_{min}}) < \epsilon\)

(in this example, \(\epsilon = 0.05\))
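One way to pick \(\sigma_{min}\) in practice is to noise both sample sets and estimate the TV distance empirically. The sketch below does this in 1-D with histograms under a variance-preserving noising \(x_t = \sqrt{1-\sigma^2}\,x_0 + \sigma\epsilon\) (matching the loss formulas later in the deck); the helper names `empirical_tv` and `smallest_sigma` are hypothetical, and histogram TV estimates are noisy for small sample sizes.

```python
import numpy as np

def empirical_tv(x, y, bins=50):
    """Rough histogram estimate of the TV distance between two 1-D sample sets."""
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    px, _ = np.histogram(x, bins=bins, range=(lo, hi))
    py, _ = np.histogram(y, bins=bins, range=(lo, hi))
    px = px / px.sum()
    py = py / py.sum()
    return 0.5 * np.abs(px - py).sum()

def smallest_sigma(p0, q0, sigmas, eps=0.05, rng=np.random.default_rng(0)):
    """Smallest candidate sigma at which the noised clean and corrupt samples
    fall within eps in estimated TV distance; None if no candidate qualifies."""
    for s in np.sort(sigmas):
        pt = np.sqrt(1 - s**2) * p0 + s * rng.normal(size=p0.shape)
        qt = np.sqrt(1 - s**2) * q0 + s * rng.normal(size=q0.shape)
        if empirical_tv(pt, qt) < eps:
            return s
    return None
```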

[Figure: densities of \(p_0\) and \(q_0\), whose noised versions overlap at \(\sigma_{min}\)]

Learning in the high-noise regime

Factors that affect \(\sigma_{min}\):

  • Similarity of \(p_0\), \(q_0\)
  • Nature of the corruption from \(p_0\) to \(q_0\)
  • Contracting \(q_{\sigma_{min}}\) towards \(p_{\sigma_{min}}\) "masks" corruption between \(p_0\) and \(q_0\)

... but also destroys useful signal

Contraction as masking

  • Contracting \(q_{\sigma_{min}}\) towards \(p_{\sigma_{min}}\) "masks" corruption between \(p_0\) and \(q_0\)

... but also destroys useful signal

[Figure: contraction at increasing \(\sigma_{min}\); more info lost]

Learning in the high-noise regime

\(\exists \sigma_{min}\) s.t. \(d_\mathrm{TV}(p_{\sigma_{min}}, q_{\sigma_{min}}) < \epsilon\)

  • Only use samples from \(q_0\) to train denoisers with \(\sigma > \sigma_{min}\)
  • Corruption level of each datapoint determines its utility

[Figure: noise axis from \(\sigma=0\) to \(\sigma=1\); below \(\sigma_{min}\), use the denoising loss; above \(\sigma_{min}\), use the denoising OR ambient loss]

Loss Function (for \(x_0\sim q_0\))

Denoising loss (assumes access to \(x_0\)):

  • \(x_0\)-prediction: \(\mathbb E[\lVert h_\theta(x_t, t) - x_0 \rVert_2^2]\)
  • \(\epsilon\)-prediction: \(\mathbb E[\lVert h_\theta(x_t, t) - \epsilon \rVert_2^2]\)

Ambient loss (assumes access to \(x_{t_{min}}\)):

  • \(x_0\)-prediction: \(\mathbb E[\lVert h_\theta(x_t, t) + \frac{\sigma_{min}^2\sqrt{1-\sigma_{t}^2}}{\sigma_t^2-\sigma_{min}^2}x_{t} - \frac{\sigma_{t}^2\sqrt{1-\sigma_{min}^2}}{\sigma_t^2-\sigma_{min}^2} x_{t_{min}} \rVert_2^2]\)
  • \(\epsilon\)-prediction: \(\mathbb E[\lVert h_\theta(x_t, t) - \frac{\sigma_t^2 (1-\sigma_{min}^2)}{(\sigma_t^2 - \sigma_{min}^2)\sqrt{1-\sigma_t^2}}x_t + \frac{\sigma_t \sqrt{1-\sigma_t^2}\sqrt{1-\sigma_{min}^2}}{\sigma_t^2 - \sigma_{min}^2}x_{t_{min}}\rVert_2^2]\)

Sanity check: ambient loss \(\rightarrow\) denoising loss as \(\sigma_{min} \rightarrow 0\)
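The \(x_0\)-prediction ambient loss can be written out directly, which also makes the sanity check easy to verify numerically: at \(\sigma_{min}=0\) (where \(x_{t_{min}}=x_0\)) it reduces to the plain denoising loss. A minimal NumPy sketch, with hypothetical argument names:

```python
import numpy as np

def ambient_x0_loss(h, x_t, x_tmin, sigma_t, sigma_min):
    """Ambient x0-prediction loss for a corrupt sample (per the slide formula).

    h      : network output h_theta(x_t, t)
    x_t    : sample noised to level sigma_t (requires sigma_t > sigma_min)
    x_tmin : the same sample noised to level sigma_min
    """
    denom = sigma_t**2 - sigma_min**2
    a = sigma_min**2 * np.sqrt(1 - sigma_t**2) / denom
    b = sigma_t**2 * np.sqrt(1 - sigma_min**2) / denom
    return float(np.sum((h + a * x_t - b * x_tmin) ** 2))
```

At `sigma_min = 0`, the coefficient `a` vanishes and `b` becomes 1, recovering \(\lVert h_\theta(x_t,t) - x_0\rVert_2^2\).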

Experiments: Test all 4

Implementation Details

  1. Sample noise level first, then data points
  2. Ambient loss "buffer" for training stability
  3. Contract each datapoint once
  4. Other small details...

Implementation details: included for completeness...
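Implementation detail 1 ("sample noise level first, then data points") can be sketched as below. This is a schematic of the sampling order only, with hypothetical names; the actual codebase likely differs (e.g. in the noise-level distribution and buffering).

```python
import numpy as np

def sample_batch(clean, corrupt, sigma_min, batch_size, rng):
    """Draw a noise level first, then datapoints usable at that level:
    below sigma_min only clean data qualifies; above it, both pools do."""
    sigma = rng.uniform(1e-3, 1.0)           # noise level for this batch
    if sigma > sigma_min:
        pool = np.concatenate([clean, corrupt], axis=0)
    else:
        pool = clean                          # corrupt data is off-limits here
    idx = rng.integers(0, len(pool), size=batch_size)
    x0 = pool[idx]
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(1 - sigma**2) * x0 + sigma * eps   # VP-style noising
    return x_t, x0, sigma
```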

Learning in the low-noise regime

Can use OOD data to learn in the low-noise regime...

... more on this next time

Part 2

Sim + Real Experiments

"Clean" Data

"Corrupt" Data

\(|\mathcal{D}_T|=50\)

\(|\mathcal{D}_S|=2000\)

Eval criteria: Success rate for planar pushing across 200 randomized trials

Datasets and Task

Results

Experiments

Choosing \(\sigma_{min}\): Swept several values on the dataset level

Loss function: Tried all 4 combinations of

{\(x_0\)-prediction, \(\epsilon\)-prediction} × {denoising, ambient}

Preliminary Observations

(will present best results on next slide)

  • \(\epsilon\)-prediction > \(x_0\)-prediction, and denoising loss > ambient loss
  • Small (but non-zero) \(\sigma_{min}\) performed best
  • Ambient diffusion scales slightly better with more sim data than cotraining

Results

\(|\mathcal{D}_T| = 50\), \(|\mathcal{D}_S| = 2000\), \(\epsilon\)-prediction with denoising loss

Results

\(|\mathcal{D}_T| = 50\), \(|\mathcal{D}_S| = 2000\), \(\epsilon\)-prediction with denoising loss

Part 3

Maze Experiments

Experiment: RRT vs GCS

GCS

(clean)

RRT

(clean)

Task: Cotrain on GCS and RRT data

Goal: Sample clean and smooth GCS plans

  • Example of high frequency corruption
  • Common in robotics
    • Low-quality teleoperation or data generation

Baselines

Success Rate: 50%

Average Jerk Squared: 7.5k

100 GCS Demos

Success Rate: 99%

Average Jerk Squared: 2.5k

4950 GCS Demos

Success Rate: 100%

Average Jerk Squared: 17k

4950 RRT Demos

Cotraining vs Ambient Diffusion

Success Rate: 91%

Average Jerk Squared: 12.5k

Cotraining: 100 GCS Demos, 5000 RRT Demos, \(x_0\)-prediction

Success Rate: 98%

Average Jerk Squared: 5.5k

Ambient: 100 GCS Demos, 5000 RRT Demos, \(x_0\)-prediction ambient loss

See plots for qualitative results

Part 4

Next Experiments

North Star Goal

Main goal: Cotrain with ambient diffusion using internet data (Open-X, AgiBot, etc) and/or simulation data

... but first some "stepping stone" experiments

Stepping stone experiments are designed to

  • Sanity check and debug implementations along the way
  • Highlight and isolate specific components of the algorithm
  • Improve exposition in the final paper

Experiment: Extend Maze to MP

Corrupt data: RRT / low-quality data from 20-30 different environments

Clean data: GCS data from 2-3 environments

Performance metric: Success rate + smoothness metrics in seen/unseen environments

Purpose: Test learning in the high-noise regime

Experiment: Lu's Data

  • Same task + plan structure, different robot embodiment
    • Plans are high quality
    • Trajectories (EE-space) have embodiment gap (lower quality)

Purpose: Test learning in the high-noise regime

Experiment: Bin Sorting

Task: Pick-and-place objects into specific bins

Clean Data: Demos with the correct logic

Corrupt Data: incorrect logic, Open-X, pick-and-place datasets, datasets with bins, etc

  • Motions may be reasonable and informative
  • But the decision making is incorrect

Purpose: Test learning in the low-noise regime

Experiment Document

Details are outlined in a Google Doc; I can share the link privately if interested.

Amazon CoRo July

By weiadam


Slides for my talk at the Amazon CoRo Symposium 2025. For more details, please see the paper: https://arxiv.org/abs/2503.22634
