Software that runs our robot arms

Pang, Terry and Mark

Communicating with the robot

[Diagram: on the computer, "Interesting research code" sends "Plans" to RobotPlanRunner/Manager, which exchanges IIWA_STATUS and IIWA_COMMAND LCM messages at 200Hz with drake-iiwa-driver. The driver talks to the robot through FRI (Fast Research Interface) via ethernet.]

We'll talk about RobotPlanRunner/Manager today!

  • What it does.
  • How we implemented it. 
    • A multi-threaded version.
    • A drake systems version.

Provided by Sammy@TRI

Examples of plans:

  • Move joints to angles [0, 0.6, 0, -1.75, 0, 1, 0]
  • Move EE to the left by 10cm while keeping orientation constant.
  • Move EE downwards by 20cm and keep contact forces below 15N. 
  • Move as my mouse and keyboard command. 

Past attempts

  • 2018 summer: spartan::drake_robot_control
    • By me and Lucas.
    • Still using RigidBodyPlant. 
  • 2019 spring: PR into drake by me.
    • Based on the drake systems framework.
  • 2020 spring: PR into drake by SiyuanFeng@TRI
    • Also based on the drake systems framework.
    • Should be similar to what TRI has been using in their internal repo (Anzu?).
  • Wei has his own version for kPAM 2.0.
  • 2021 spring: new repo in RobotLocomotion:
    • Terry, Mark and I are actively working on it.
    • Aiming to be a standalone package just for receiving plans and sending robot commands, without ROS as a dependency.
    • "Interesting research code" should use find_package(...) to locate and link against it. 

PlanManager is similar to ROS action.

[Diagram: server and client.]

  • Workflow of ROS actions:
    1. Launch an action server and an action client.
    2. Client sends a Goal to server.
    3. Client continues to do some work.
    4. Client calls WaitForActionResult(), which blocks until the server says the Goal has been reached. 
    5. Client can also request to abort the Goal while it's running.
  • Workflow of PlanManager is similar.
    1. Launch PlanManager and PlanManagerClient.
    2. Client sends a Plan to server.
    3. Client does some work.
    4. Client calls wait_for_result().
    5. Client can also call abort() to cancel the current plan.
  • Communication between client and server is through ZMQ, which cannot be logged by lcm-spy or ROS. Is this easy to fix?
  • PlanManagerClient API:

 

Send plan and wait for result:

import time

client = PlanManagerZmqClient()

client.make_and_send_plan(...)  # sends plan to server.
time.sleep(1.0)  # do some work.
client.wait_for_result()  # wait for current plan to finish.

Send plan and abort:

import time

client = PlanManagerZmqClient()

client.make_and_send_plan(...)  # sends plan to server.
time.sleep(1.0)  # do some work.
client.abort()  # abort current plan.

Plans are objects that know how to compute commands.

class PlanBase {
public:
  /* other stuff */

  virtual void Step(const State &state, double control_period, double t,
                    Command *cmd) const = 0;
};


class JointSpaceTrajectoryPlan : public PlanBase {
public:
  /* other stuff */

  void Step(const State &state, double control_period, double t,
            Command *cmd) const override;
};

class TaskSpaceTrajectoryPlan : public PlanBase {
public:
  /* other stuff */

  void Step(const State &state, double control_period, double t,
            Command *cmd) const override;
};

  • Every Plan has a Step function that:
    • Takes in current robot state and time.
    • Computes command to the robot.
  • Users can implement concrete Plan types by inheriting from PlanBase.
  • The client sends a message of type lcmt_robot_plan to the server, which constructs the plan object from the message. 
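
A minimal Python sketch of the same idea, for illustration only. The real interface is the C++ PlanBase above; the State and Command fields, the class name JointSpaceLinearPlan and the linear interpolation are assumptions made up for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class State:
    """Hypothetical robot state: measured joint angles in radians."""
    q: List[float]


@dataclass
class Command:
    """Hypothetical robot command: commanded joint angles in radians."""
    q_cmd: List[float]


class PlanBase(ABC):
    """Python analogue of the C++ PlanBase interface."""

    @abstractmethod
    def step(self, state: State, control_period: float, t: float) -> Command:
        """Compute the command for time t (seconds since the plan started)."""


class JointSpaceLinearPlan(PlanBase):
    """Hypothetical plan: move linearly from q_start to q_end in `duration` s."""

    def __init__(self, q_start, q_end, duration):
        self.q_start = list(q_start)
        self.q_end = list(q_end)
        self.duration = duration

    def step(self, state, control_period, t):
        # This simple plan is open loop: it ignores the measured state.
        # Clamp the interpolation parameter so the plan holds q_end afterwards.
        s = min(max(t / self.duration, 0.0), 1.0)
        q_cmd = [a + s * (b - a) for a, b in zip(self.q_start, self.q_end)]
        return Command(q_cmd=q_cmd)


plan = JointSpaceLinearPlan(q_start=[0.0, 0.0], q_end=[0.0, 0.6], duration=2.0)
# control_period = 0.005 s corresponds to the 200Hz message rate.
cmd = plan.step(State(q=[0.0, 0.0]), control_period=0.005, t=1.0)
print(cmd.q_cmd)
```

A plan that reacts to contact forces would read the measured state inside step() instead of ignoring it.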

[Demo windows: DrakeVisualizer, LCM spy, MockStationSimulation, PlanManager Server, PlanManager Client.]

PlanManager server is a finite state machine.

  • The state machine consists of functions that need to behave differently in different states.
    • QueueNewPlan()
      • [INIT] discard received plan.
      • [IDLE] queue received plan, transition to [RUNNING].
      • [RUNNING] discard received plan.
      • [ERROR] discard received plan.
    • GetCurrentPlan()
      • [INIT] return nullptr.
      • [IDLE] return nullptr.
      • [RUNNING] return the currently active plan. If the currently active plan has been running for longer than its duration, pop the plan from the queue and transition back to [IDLE].
      • [ERROR] return nullptr.
    • etc.
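
The two functions above can be sketched in a few lines of Python. This is an illustration, not the server code: the class and method names are made up, returning None after an expired plan is popped is an assumption, and the trigger for [INIT] to [IDLE] is not described here, so the usage sets the state by hand.

```python
from collections import deque
from enum import Enum


class ServerState(Enum):
    INIT = "INIT"
    IDLE = "IDLE"
    RUNNING = "RUNNING"
    ERROR = "ERROR"


class Plan:
    """Hypothetical stand-in for a plan: only its duration matters here."""

    def __init__(self, duration):
        self.duration = duration


class PlanManagerStateMachine:
    def __init__(self):
        self.state = ServerState.INIT
        self.plans = deque()
        self.plan_start_time = None

    def queue_new_plan(self, plan, t_now):
        # Only [IDLE] accepts a plan; every other state discards it.
        if self.state is not ServerState.IDLE:
            return False
        self.plans.append(plan)
        self.plan_start_time = t_now
        self.state = ServerState.RUNNING
        return True

    def get_current_plan(self, t_now):
        # [INIT], [IDLE] and [ERROR] have no active plan.
        if self.state is not ServerState.RUNNING:
            return None
        plan = self.plans[0]
        if t_now - self.plan_start_time > plan.duration:
            # The plan has run past its duration: pop it, go back to [IDLE].
            # Returning None here is an assumption of this sketch.
            self.plans.popleft()
            self.state = ServerState.IDLE
            return None
        return plan


sm = PlanManagerStateMachine()
discarded = sm.queue_new_plan(Plan(duration=2.0), t_now=0.0)  # still in [INIT]
sm.state = ServerState.IDLE  # assumption: set by hand for the example
accepted = sm.queue_new_plan(Plan(duration=2.0), t_now=0.0)
print(discarded, accepted, sm.state)
```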

State machines are tables of functions! (State design pattern)

       | NoMoney     | HasMoney                                | Vending
Buy    | Do nothing. | Start vending, transition to [Vending]. | Wait for vending to finish, then transition to [NoMoney].
Cancel | Do nothing. | Return money.                           | Do nothing.

Example: vending machine with only one kind of item and two buttons (Buy and Cancel)

  • Can hold either 1 or 0 coins.

Coding a state machine.

// vending_machine.h (version 1: branch on an enum in every method)

enum State { kNoMoney, kHasMoney, kVending };

class VendingMachine {
public:
  void Buy() {
    if (state_ == kNoMoney) {
      /* do nothing */
      return;
    }
    if (state_ == kHasMoney) {
      state_ = kVending;
      VendOneBottle();
      return;
    }
    if (state_ == kVending) {
      WaitForVendingToFinish();
      state_ = kNoMoney;
      return;
    }
  }

  void Cancel() {
    /* implementation */
  }

private:
  void VendOneBottle();           // defined elsewhere
  void WaitForVendingToFinish();  // defined elsewhere

  State state_{kNoMoney};
};

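// state_base.h

class VendingMachine;  // forward declaration, defined in vending_machine.h

class StateBase {
public:
  virtual ~StateBase() = default;
  virtual void Buy(VendingMachine *vm) = 0;
  virtual void Cancel(VendingMachine *vm) = 0;
};
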
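// state_no_money.h

#include "state_base.h"

class StateNoMoney : public StateBase {
public:
  // One shared instance per state.
  static StateNoMoney *Instance() {
    static StateNoMoney instance;
    return &instance;
  }
  void Buy(VendingMachine *vm) override {
    /* do nothing */
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
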
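// state_has_money.h

#include "state_base.h"
#include "state_vending.h"
#include "vending_machine.h"

class StateHasMoney : public StateBase {
public:
  // One shared instance per state.
  static StateHasMoney *Instance() {
    static StateHasMoney instance;
    return &instance;
  }
  void Buy(VendingMachine *vm) override {
    vm->ChangeState(StateVending::Instance());
    vm->VendOneBottle();  // the machine, not the state, owns the hardware
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
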
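// state_vending.h

#include "state_base.h"
#include "state_no_money.h"
#include "vending_machine.h"

class StateVending : public StateBase {
public:
  // One shared instance per state.
  static StateVending *Instance() {
    static StateVending instance;
    return &instance;
  }
  void Buy(VendingMachine *vm) override {
    vm->WaitForVendingToFinish();
    vm->ChangeState(StateNoMoney::Instance());
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
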
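// vending_machine.h (version 2: delegate to the current state object)

#include "state_base.h"

class VendingMachine {
public:
  void Buy() {
    state_->Buy(this);
  }

  void Cancel() {
    state_->Cancel(this);
  }

  void ChangeState(StateBase *new_state) {
    state_ = new_state;
  }

  // Called by the state objects; defined elsewhere.
  void VendOneBottle();
  void WaitForVendingToFinish();

private:
  StateBase *state_;  // points at the current state's shared instance
};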

Advantage:

Adding new states or changing existing states does not require re-compiling the entire state machine.
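
The "table of functions" view of the vending machine can also be written down literally, as a lookup from (button, state) to a handler. A small Python sketch follows. The insert_coin() method and the transition to [NoMoney] after returning money are assumptions, since the table does not spell them out, and the log list only exists to make the behavior visible.

```python
from enum import Enum


class VmState(Enum):
    NO_MONEY = 0
    HAS_MONEY = 1
    VENDING = 2


class VendingMachine:
    def __init__(self):
        self.state = VmState.NO_MONEY
        self.log = []  # records side effects so the behavior is easy to inspect
        # The table: one handler per (button, state) cell.
        self.table = {
            ("buy", VmState.NO_MONEY): self._do_nothing,
            ("buy", VmState.HAS_MONEY): self._start_vending,
            ("buy", VmState.VENDING): self._finish_vending,
            ("cancel", VmState.NO_MONEY): self._do_nothing,
            ("cancel", VmState.HAS_MONEY): self._return_money,
            ("cancel", VmState.VENDING): self._do_nothing,
        }

    def press(self, button):
        # Look up the cell for the current state and run it.
        self.table[(button, self.state)]()

    def insert_coin(self):
        # Assumption: not in the table, but needed to reach [HasMoney].
        if self.state is VmState.NO_MONEY:
            self.state = VmState.HAS_MONEY

    def _do_nothing(self):
        pass

    def _start_vending(self):
        self.log.append("start vending")
        self.state = VmState.VENDING

    def _finish_vending(self):
        self.log.append("wait for vending to finish")
        self.state = VmState.NO_MONEY

    def _return_money(self):
        self.log.append("return money")
        self.state = VmState.NO_MONEY  # assumption: the coin is gone afterwards


vm = VendingMachine()
vm.press("buy")      # [NoMoney]: nothing happens
vm.insert_coin()
vm.press("buy")      # [HasMoney] -> [Vending]
vm.press("cancel")   # [Vending]: nothing happens
vm.press("buy")      # [Vending] -> [NoMoney]
print(vm.state, vm.log)
```

The State design pattern in the C++ code stores one column of this table in each state class.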

PlanManager: multi-threaded version.

[Diagram: the server runs four threads (IIWA command thread, Plan thread, Abort thread, Print thread) around one shared state machine. Messages shown: IIWA_STATUS, IIWA_COMMAND, PLAN_STATUS. State machine calls shown: GetCurrentPlan(), GetCurrentPlanUpTime(), CommandHasError(), QueueNewPlan(), GetCurrentState(), PrintState(), AbortPlans().]

The state machine is shared among the four threads of the server, and is locked by a mutex whenever it is used.
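
A minimal Python sketch of the locking scheme, assuming one lock that every public method of the state machine takes. The class is a made-up stand-in, not the server's state machine. Eight threads race to queue a plan, and the lock guarantees that exactly one of them sees [IDLE].

```python
import threading


class SharedStateMachine:
    """Hypothetical stand-in: every public method takes the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "IDLE"

    def queue_new_plan(self):
        with self._lock:
            # The check and the transition happen atomically under the lock.
            if self._state != "IDLE":
                return False
            self._state = "RUNNING"
            return True

    def get_current_state(self):
        with self._lock:
            return self._state


sm = SharedStateMachine()
results = []


def try_to_queue():
    results.append(sm.queue_new_plan())


threads = [threading.Thread(target=try_to_queue) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results.count(True), sm.get_current_state())
```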

[Diagram: the client has a main thread and a Plan Status subscription thread. Channels shown between Server and Client: ROBOT_PLAN and plan_status. Legend: LCM, ZMQ Request/Reply, ZMQ Publish/Subscribe.]

lcm-typed messages are sent through ZMQ instead of LCM.

  • Spartan has one thread for subscribing to and saving IIWA_STATUS, and another one for computing commands. The command thread needs to be woken up by a condition variable tied to the status thread (convoluted!).
  • Decoding IIWA_STATUS should be a fairly lightweight operation(?), so I put command computation into the callback of the IIWA_STATUS subscription.
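
A sketch of the second design: the command is computed and published inside the status callback, so no second thread or condition variable is needed. FakeBus is a made-up in-process stand-in for LCM, and the dict field names are illustrative, not the actual message definitions.

```python
class FakeBus:
    """Hypothetical in-process stand-in for LCM publish/subscribe."""

    def __init__(self):
        self.handlers = {}
        self.published = []  # (channel, message) pairs, in publish order

    def subscribe(self, channel, handler):
        self.handlers.setdefault(channel, []).append(handler)

    def publish(self, channel, msg):
        self.published.append((channel, msg))
        for handler in self.handlers.get(channel, []):
            handler(msg)


def hold_position(status):
    """Hypothetical plan step: command the currently measured joint angles."""
    return {
        "utime": status["utime"],
        "joint_position": list(status["joint_position_measured"]),
    }


def make_status_handler(bus, compute_command):
    def on_status(status):
        # Decode (already a dict here), compute, publish: all in one callback.
        bus.publish("IIWA_COMMAND", compute_command(status))

    return on_status


bus = FakeBus()
bus.subscribe("IIWA_STATUS", make_status_handler(bus, hold_position))
bus.publish("IIWA_STATUS", {"utime": 1000, "joint_position_measured": [0.0] * 7})
print([channel for channel, _ in bus.published])
```

The cost of this design is that a slow command computation delays the handling of the next status message.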

PlanManager: Drake Systems version

  • Implemented by wrapping a LeafSystem around the state machine.
    • This may not be the recommended way to use drake systems...
  • Doesn't support all features of ROS actions.
    • Client is blocked when server is executing a Plan.
    • Client gets Plan Status update after the plan is finished.

May not be the drake-perfect way to write down a system, but we don't really know which way drake prefers.

Performance comparison: multi-threaded and drake systems

  • An IIWA_COMMAND message is published in the callback function of IIWA_STATUS subscription. 
  • Therefore, every IIWA_COMMAND message has a corresponding IIWA_STATUS message.
  • The delay between an IIWA_COMMAND message and its corresponding IIWA_STATUS message is the time spent in PlanManager. 
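
One way to compute these delays from a log, sketched in Python. The pairing rule (each command belongs to the most recent status at or before it) and the timestamps below are assumptions made up for the example, not measured data.

```python
from bisect import bisect_right


def command_delays(status_times, command_times):
    """For each command time, return the time since the most recent status."""
    status_times = sorted(status_times)
    delays = []
    for t_cmd in command_times:
        # Index of the last status received at or before this command.
        i = bisect_right(status_times, t_cmd) - 1
        if i < 0:
            continue  # command logged before any status: nothing to pair with
        delays.append(t_cmd - status_times[i])
    return delays


# Made-up timestamps in microseconds; statuses arrive every 5000 us (200Hz).
status_times = [0, 5000, 10000]
command_times = [300, 5450, 10200]
delays = command_delays(status_times, command_times)
print(delays)
```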

Delay between IIWA_COMMAND messages and their corresponding IIWA_STATUS messages.

  • The drake systems version seems to add quite a bit of overhead (am I doing this wrong?). 
  • The biggest advantage of drake systems is that simulation results are deterministic. 

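[Timeline along t: receive IIWA_STATUS, then the callback runs (decode message, compute command), then publish IIWA_COMMAND. The delay is the interval between receiving the status and publishing the command.]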

Some drake systems profiling results

Top time consumers

Time spent in output port evaluation (which should call Plan::Step) is only a small fraction of the time in Simulator.


[Plot series: drake systems, multi-threaded.]

TODOs and discussions

  • Best way to log ZMQ messages? 
    • Publish an LCM message along with every ZMQ message sent, so that they can be captured by lcm-spy?
  • Best way to catch bugs in multi-threaded code?
    • Design a sequence that covers all state transitions, and run it for a long time?
    • Avoid multi-threading altogether?

robot_plan_runner

By Pang
