Manipulation in Clutter

Part 3:                                                     

Programming Tasks

MIT 6.4210/2:

Robotic Manipulation

Fall 2022, Lecture 10

Follow live at

(or later at


  • Discuss final projects (proposals first draft due next week)
  • Contact wrench cones visualization
  • Authoring complex tasks
    • State machines, behavior trees, PDDL
    • Mention advanced testing strategies
  • Addition: $$F^{B_p}_{\text{total},C}= \sum_i F^{B_p}_{i,C}.$$
  • Shifting: $$\qquad \qquad \qquad f^{B_q}_C = f^{B_p}_C, \qquad \tau^{B_q}_C = \tau^{B_p}_C + {}^{B_q}p^{B_p}_C \times f^{B_p}_C.$$

  • Changing the expressed-in frame: $$\qquad \qquad  f^{B_p}_D = {}^DR^C f^{B_p}_C, \qquad \tau^{B_p}_D = {}^DR^C \tau^{B_p}_C.$$

F^{B_p}_{{\text{name}},C} = \begin{bmatrix} \tau^{B_p}_{\text{name},C} \\ f^{B_p}_{\text{name},C} \end{bmatrix}

Spatial forces

note the cross product

Friction cones and contact wrenches

Check yourself:

Why is it a cone?  Not just one force vector?


How does the spatial algebra apply to the cones?

\mathcal{K}^{C}_C = \begin{bmatrix} 0 \\ 0 \\ 0 \\ f^C_{C_x} \\ f^{C}_{C_y} \\ f^C_{C_z} \end{bmatrix}\text{, s.t. }\sqrt{[f^C_{C_x}]^2 + [f^C_{C_y}]^2} \le \mu f^C_{C_z}

Wrench Cones Algebra

  • Shifting:

\mathcal{K}^{B_q}_C = \begin{bmatrix} I & [{}^{B_q}p^{B_p}_C]_\times \\ 0 & I \end{bmatrix} \mathcal{K}^{B_p}_C
\mathcal{K}^{B_p}_{\text{total},C}= \mathcal{K}^{B_p}_{0,C} \oplus \mathcal{K}^{B_p}_{1,C} \oplus \cdots
  • Addition:

Minkowski sum

Wrench Cones Algebra

I made an interactive visualization...

"Clutter clearing" (an early project at TRI)

class PlannerState(Enum):
    GO_HOME = 4
class Planner(LeafSystem):
    def __init__(self, plant):
        self.DeclarePeriodicUnrestrictedUpdateEvent(0.1, 0.0, self.Update)
        self._mode_index = self.DeclareAbstractState(

    def Update(self, context, state):
        mode = context.get_abstract_state(int(self._mode_index)).get_value()
class Planner(LeafSystem):
    def __init__(self, plant):
        self._traj_X_G_index = self.DeclareAbstractState(
        self._traj_wsg_index = self.DeclareAbstractState(

    def Update(self, context, state):
        X_G, times = MakeGripperFrames(X_G, t0=context.get_time())
        traj_X_G = MakeGripperPoseTrajectory(X_G, times)
        traj_wsg_command = MakeGripperCommandTrajectory(times)


Failure modes

I would group them into:

  1. MultibodyPlant can crash on startup
  2. Motion planning. Collisions and DiffIK limitations.
  3. Perception.
    • grasp far from object center of mass,
    • double-picks,
    • phantoms...
  4. More than pick and place.
    • objects stuck in the corner
    • objects too big for the hand

(Advanced) Full-stack integration testing

Finding subtle bugs

Programming behaviors in the simulation loop

The MIT Leg Lab Hopping Robots

1. approach, s.t. mug is in view.
2. visual servo
3. insert to grasp
4. retract
5. move to pre-place
6. place (force termination)

Picking up a mug (original, hand-crafted version)

work by Siyuan Feng et al. at TRI

1. general approach the plate, s.t. it's roughly in view for the wrist camera

2. visual servo for alignment

3. assuming known geometry and well alignment wrt the plates, close the fingers to the right amount, insert one finger tip between the plates.

4. push the top plate towards one side of the sink with the finger tip. (terminate with force)

5. kind of scoop the gripper into a solid grasp, while pushing onto the plate (terminate with force)

6. retract and go to pre-place configuration. 

7. insert to place. (terminate with force)

Picking up a plate (original, hand-crafted version)

work by Siyuan Feng et al. at TRI

Behavior Trees

Relative to state machines:

  • Easier to compose
  • Factored representation

"Skill" Frameworks (aka "Motion Primitives")

STRIPS problem definitions

  • An initial state;
  • The specification of the goal states – situations which the planner is trying to reach;
  • A set of actions. For each action, the following are included:
    • preconditions (what must be established before the action is performed);
    • postconditions (what is established after the action is performed).

PDDL = Planning Domain Definition Language


Example: Gripper task with four balls

There is a robot that can move between two rooms and pick up or drop balls with either of his two arms. Initially, all balls and the robot are in the first room. We want the balls to be in the second room.

  • Objects: The two rooms, four balls and two robot arms.
  • Predicates: Is x a room? Is x a ball? Is ball x inside room y? Is robot arm x empty? [...]
  • Initial state: All balls and the robot are in the first room. All robot arms are empty. [...]
  • Goal specification: All balls must be in the second room.
  • Actions/Operators: The robot can move between rooms, pick up a ball or drop a ball.

PDDL = Planning Domain Definition Language

(define (domain gripper-strips)
   (:predicates (room ?r)
		(ball ?b)
		(gripper ?g)
		(at-robby ?r)
		(at ?b ?r)
		(free ?g)
		(carry ?o ?g))

   (:action move
       :parameters  (?from ?to)
       :precondition (and  (room ?from) (room ?to) (at-robby ?from))
       :effect (and  (at-robby ?to)
		     (not (at-robby ?from))))

   (:action pick
       :parameters (?obj ?room ?gripper)
       :precondition  (and  (ball ?obj) (room ?room) (gripper ?gripper)
			    (at ?obj ?room) (at-robby ?room) (free ?gripper))
       :effect (and (carry ?obj ?gripper)
		    (not (at ?obj ?room)) 
		    (not (free ?gripper))))

   (:action drop
       :parameters  (?obj  ?room ?gripper)
       :precondition  (and  (ball ?obj) (room ?room) (gripper ?gripper)
			    (carry ?obj ?gripper) (at-robby ?room))
       :effect (and (at ?obj ?room)
		    (free ?gripper)
		    (not (carry ?obj ?gripper)))))

PDDL = Planning Domain Definition Language

(define (problem strips-gripper4)
   (:domain gripper-strips)
   (:objects rooma roomb ball1 ball2 ball3 ball4 left right)
   (:init (room rooma)
          (room roomb)
          (ball ball1)
          (ball ball2)
          (ball ball3)
          (ball ball4)
          (gripper left)
          (gripper right)
          (at-robby rooma)
          (free left)
          (free right)
          (at ball1 rooma)
          (at ball2 rooma)
          (at ball3 rooma)
          (at ball4 rooma))
   (:goal (and (at ball1 roomb)
               (at ball2 roomb)
               (at ball3 roomb))))

/// Base type for all action primitives
class ActionPrimitiveInterface
  virtual bool IsCandidate(const State& state) const = 0;
  virtual Container GetOutcomes(const State& state) const = 0;
  virtual double EstimateActionCost(
      const State& current, const State& target) const = 0;
  virtual Container Execute(const State& state) = 0;
  virtual double Ranking() const = 0;

work by Calder Phillips-Grafflin et al. at TRI

DishTaskState(const DishState& active_dish_state,
              int32_t clean_items_put_away, int32_t clean_items_in_dishwasher,
              int32_t dirty_items_in_dishwasher,
              int32_t dirty_items_available,
              const DishwasherState& current_dishwasher_state);

DishState(const RigidTransformd& dish_pose, DishType dish_type,
          DishProgress dish_progress, RackRequirement rack_required);

DishwasherState(bool known, bool door_open, bool lower_rack_out,
                bool upper_rack_out, bool silverware_rack_out,
                bool lower_rack_full, bool upper_rack_full,
                bool silverware_rack_full, bool started, bool soap_loaded);
enum class DishType : uint8_t {
  Unknown = 0x00,
  Mug = 0x01,
  Plate = 0x02,
  Silverware = 0x03

enum class RackRequirement : uint8_t {
  Unknown = 0x00,
  LowerRack = 0x01,
  UpperRack = 0x02,
  SilverwareRack = 0x03
enum class DishProgress : uint8_t {
  Unknown = 0x00,
  Unmanipulable = 0x01,
  LikelyManipulable = 0x02,
  Sensed = 0x03,
  OptimisticOriented = 0x04,
  Oriented = 0x05,
  Placed = 0x06,
  Released = 0x07

work by Calder Phillips-Grafflin et al. at TRI

We replan before each action to handle unexpected outcomes...

Discrete planning (requires "symbol grounding")

Perception system summarizes environment state

Actions change the symbolic state

Now we often want to use learned skills in a task-level framework

Lecture 10: Programming Tasks

By russtedrake

Lecture 10: Programming Tasks

MIT Robotic Manipulation Fall 2022

  • 969