Software that runs our robot arms
Pang, Terry and Mark
Communicating with the robot
Computer
drake-iiwa-driver
FRI
(Fast Research Interface)
via ethernet
IIWA_STATUS
RobotPlanRunner/Manager
IIWA_COMMAND
LCM messages at 200Hz
Interesting research code
"Plans"
We'll talk about RobotPlanRunner/Manager today!
- What it does.
- How we implemented it.
- A multi-threaded version.
- A drake systems version.
Provided by Sammy@TRI
Examples of plans:
- Move joints to angles [0, 0.6, 0, -1.75, 0, 1, 0]
- Move EE to the left by 10cm while keeping orientation constant.
- Move EE downwards by 20cm and keep contact forces below 15N.
- Move as my mouse and keyboard command.
Past attempts
- 2018 summer: spartan::drake_robot_control
- By me and Lucas.
- Still using RigidBodyPlant.
- 2019 spring: PR into drake by me.
- Based on the drake systems framework.
- 2020 spring: PR into drake by SiyuanFeng@TRI
- Also based on the drake systems framework.
- Should be similar to what TRI has been using in their internal repo (Anzu?).
- Wei has his own version for kPAM 2.0.
- 2021 spring: new repo in RobotLocomotion:
- Terry, Mark and I are actively working on it.
- Aiming to be a standalone package just for receiving plans and sending robot commands, without ROS as a dependency.
- "Interesting research code" should use find_package(...) to locate and link against it.
PlanManager is similar to ROS action.
server
client
- Workflow of ROS actions:
- Launch an action server and an action client.
- Client sends a Goal to server.
- Client continues to do some work.
- Client calls WaitForActionResult(), which blocks until the server says the Goal has been reached.
- Client can also request to abort the Goal while it's running.
- The workflow of PlanManager is similar.
- Launch PlanManager and PlanManagerClient.
- Client sends a Plan to server.
- Client does some work.
- Client calls wait_for_result().
- Client can also call abort() to cancel the current plan.
- Communication between client and server is through ZMQ, which cannot be logged by lcm-spy or ROS. Is this easy to fix?
- PlanManagerClient API:
Send plan and wait for result:
import time
client = PlanManagerZmqClient()
client.make_and_send_plan(...)  # sends plan to server.
time.sleep(1.0)  # do some work.
client.wait_for_result()  # wait for current plan to finish.
Send plan and abort:
import time
client = PlanManagerZmqClient()
client.make_and_send_plan(...)  # sends plan to server.
time.sleep(1.0)  # do some work.
client.abort()  # abort current plan.
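The two client patterns above can be tried without ZMQ or a robot. The following is a minimal sketch, not the actual PlanManagerZmqClient: `MockPlanServer` and its methods are hypothetical names, and a background thread stands in for the server executing a plan.

```python
import threading
import time


class MockPlanServer:
    """Hypothetical in-process stand-in for the PlanManager server."""

    def __init__(self):
        self._done = threading.Event()
        self._abort = threading.Event()
        self.result = None

    def execute(self, duration):
        """Start a fake plan of the given duration; returns immediately."""
        self._done.clear()
        self._abort.clear()
        self.result = None

        def run():
            # Event.wait returns True if abort() was called before the timeout.
            aborted = self._abort.wait(timeout=duration)
            self.result = "aborted" if aborted else "finished"
            self._done.set()

        threading.Thread(target=run, daemon=True).start()

    def abort(self):
        """Ask the running plan to stop early."""
        self._abort.set()

    def wait_for_result(self):
        """Block until the current plan has finished or was aborted."""
        self._done.wait()
        return self.result


server = MockPlanServer()

# Pattern 1: send plan, do some work, then wait for the result.
server.execute(duration=0.2)
time.sleep(0.05)  # client does some work
result_wait = server.wait_for_result()

# Pattern 2: send plan, do some work, then abort.
server.execute(duration=5.0)
time.sleep(0.05)  # client does some work
server.abort()
result_abort = server.wait_for_result()
```

The point of the sketch is that sending a plan does not block the client; only `wait_for_result()` does.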
Plans are objects that know how to compute commands.
class PlanBase {
 public:
  /* other stuff */
  virtual void Step(const State &state, double control_period, double t,
                    Command *cmd) const = 0;
};
class JointSpaceTrajectoryPlan : public PlanBase {
 public:
  /* other stuff */
  void Step(const State &state, double control_period, double t,
            Command *cmd) const override;
};
class TaskSpaceTrajectoryPlan : public PlanBase {
 public:
  /* other stuff */
  void Step(const State &state, double control_period, double t,
            Command *cmd) const override;
};
- Every Plan has a Step function that:
- Takes in current robot state and time.
- Computes command to the robot.
- Users can implement concrete Plan types by inheriting from PlanBase.
- The client sends a message of type lcmt_robot_plan to the server, which constructs the plan object from the message.
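To make the Step idea concrete, here is a minimal Python sketch of the same interface. It is an illustration only: `JointSpaceLinearPlan` is a hypothetical plan type that linearly interpolates between two joint configurations, which is an assumption and not how JointSpaceTrajectoryPlan is necessarily implemented.

```python
from abc import ABC, abstractmethod


class PlanBase(ABC):
    """Python analogue of the C++ PlanBase interface."""

    @abstractmethod
    def step(self, state, control_period, t):
        """Return the command for time t, given the current robot state."""


class JointSpaceLinearPlan(PlanBase):
    """Hypothetical plan: straight-line interpolation in joint space."""

    def __init__(self, q_start, q_end, duration):
        self.q_start = list(q_start)
        self.q_end = list(q_end)
        self.duration = duration

    def step(self, state, control_period, t):
        # Clamp progress to [0, 1] so the command holds q_end after the plan ends.
        alpha = min(max(t / self.duration, 0.0), 1.0)
        return [a + alpha * (b - a) for a, b in zip(self.q_start, self.q_end)]


q_goal = [0, 0.6, 0, -1.75, 0, 1, 0]  # joint angles from the example plans
plan = JointSpaceLinearPlan(q_start=[0.0] * 7, q_end=q_goal, duration=2.0)
q_cmd_mid = plan.step(state=None, control_period=0.005, t=1.0)
q_cmd_end = plan.step(state=None, control_period=0.005, t=2.0)
```

This particular plan ignores `state`; a plan that limits contact forces would read the measured state inside `step`.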
DrakeVisualizer
LCM spy
MockStationSimulation
PlanManager Server
PlanManager Client
PlanManager server is a finite state machine.
- The state machine consists of functions that need to behave differently in different states.
- QueueNewPlan()
- [INIT] discard received plan.
- [IDLE] queue received plan, transition to [RUNNING].
- [RUNNING] discard received plan.
- [ERROR] discard received plan.
- GetCurrentPlan()
- [INIT] return nullptr.
- [IDLE] return nullptr.
- [RUNNING] return the currently active plan. If the currently active plan has been running for longer than its duration, pop the plan from the queue and transition back to [IDLE].
- [ERROR] return nullptr.
- etc.
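The per-state behavior listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the server code: `mark_ready()` is a hypothetical way to leave [INIT] (the slides do not say how that happens), plans are plain dicts with a `duration` key, and returning `None` once a plan has expired is my reading of the [RUNNING] rule.

```python
from collections import deque

INIT, IDLE, RUNNING, ERROR = "INIT", "IDLE", "RUNNING", "ERROR"


class PlanStateMachine:
    def __init__(self):
        self.state = INIT
        self.plans = deque()
        self.start_time = None

    def mark_ready(self):
        """Hypothetical INIT -> IDLE transition (assumption)."""
        if self.state == INIT:
            self.state = IDLE

    def queue_new_plan(self, plan, t_now):
        """Queue the plan only in IDLE; discard it in every other state."""
        if self.state != IDLE:
            return False
        self.plans.append(plan)
        self.start_time = t_now
        self.state = RUNNING
        return True

    def get_current_plan(self, t_now):
        """Return the active plan in RUNNING, None in every other state."""
        if self.state != RUNNING:
            return None
        plan = self.plans[0]
        if t_now - self.start_time > plan["duration"]:
            # The plan has run past its duration: pop it and go back to IDLE.
            self.plans.popleft()
            self.state = IDLE
            return None
        return plan


sm = PlanStateMachine()
discarded_in_init = not sm.queue_new_plan({"duration": 1.0}, t_now=0.0)
sm.mark_ready()
accepted_in_idle = sm.queue_new_plan({"duration": 1.0}, t_now=0.0)
```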
State machines are tables of functions! (State design pattern)
| | NoMoney | HasMoney | Vending |
|---|---|---|---|
| Buy | Do nothing. | Start vending, transition to [Vending]. | Wait for vending to finish, then transition to [NoMoney]. |
| Cancel | Do nothing. | Return money. | Do nothing. |
Example: a vending machine with only one kind of item and two buttons (Buy and Cancel).
- Can hold either 1 or 0 coins.
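Read literally, "a table of functions" can be a dictionary keyed by (button, state). The sketch below is one possible rendering of the vending machine table. The handler names are hypothetical, and the transition to NoMoney after returning money is my assumption; the table itself only says "Return money."

```python
NO_MONEY, HAS_MONEY, VENDING = "NoMoney", "HasMoney", "Vending"


def do_nothing(log):
    return None  # None means: stay in the current state


def start_vending(log):
    log.append("start vending")
    return VENDING


def finish_vending(log):
    log.append("wait for vending to finish")
    return NO_MONEY


def return_money(log):
    log.append("return money")
    return NO_MONEY  # assumption: the machine holds 0 coins afterwards


# One entry per cell of the table: (button, state) -> function.
TABLE = {
    ("Buy", NO_MONEY): do_nothing,
    ("Buy", HAS_MONEY): start_vending,
    ("Buy", VENDING): finish_vending,
    ("Cancel", NO_MONEY): do_nothing,
    ("Cancel", HAS_MONEY): return_money,
    ("Cancel", VENDING): do_nothing,
}


class VendingMachine:
    def __init__(self, state=NO_MONEY):
        self.state = state
        self.log = []

    def press(self, button):
        next_state = TABLE[(button, self.state)](self.log)
        if next_state is not None:
            self.state = next_state


vm = VendingMachine(state=HAS_MONEY)  # start with one coin inserted
vm.press("Buy")     # HasMoney -> Vending
vm.press("Cancel")  # ignored while vending
vm.press("Buy")     # Vending -> NoMoney
```

Adding a state means adding a column of entries to `TABLE`; no existing handler has to change.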
Coding a state machine.
// vending_machine.h
enum State { kNoMoney, kHasMoney, kVending };
class VendingMachine {
 public:
  void Buy() {
    if (state_ == kNoMoney) {
      /* do nothing */
      return;
    }
    if (state_ == kHasMoney) {
      state_ = kVending;
      VendOneBottle();
      return;
    }
    if (state_ == kVending) {
      WaitForVendingToFinish();
      state_ = kNoMoney;
      return;
    }
  }
  void Cancel() {
    /* implementation */
  }
 private:
  State state_{kNoMoney};  // start with no coin inserted
};
// state_base.h
class VendingMachine;  // forward declaration
class StateBase {
 public:
  virtual ~StateBase() = default;
  virtual void Buy(VendingMachine *vm) = 0;
  virtual void Cancel(VendingMachine *vm) = 0;
};
// state_no_money.h
#include "state_base.h"
class StateNoMoney : public StateBase {
 public:
  void Buy(VendingMachine *vm) override {
    /* do nothing */
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
// state_has_money.h
#include "state_base.h"
class StateHasMoney : public StateBase {
 public:
  void Buy(VendingMachine *vm) override {
    vm->ChangeState(StateVending::Instance());
    VendOneBottle();
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
// state_vending.h
#include "state_base.h"
class StateVending : public StateBase {
 public:
  void Buy(VendingMachine *vm) override {
    WaitForVendingToFinish();
    vm->ChangeState(StateNoMoney::Instance());
    return;
  }
  void Cancel(VendingMachine *vm) override {
    /* implementation */
  }
};
// vending_machine.h
#include "state_base.h"
class VendingMachine {
 public:
  void Buy() {
    state_->Buy(this);
  }
  void Cancel() {
    state_->Cancel(this);
  }
  void ChangeState(StateBase *new_state) {
    state_ = new_state;
  }
 private:
  StateBase *state_;
};
Advantage:
Adding new states or changing existing states does not require recompiling the entire state machine.
PlanManager: multi-threaded version.
IIWA command thread
IIWA_STATUS
IIWA_COMMAND
Plan thread
PLAN_STATUS
Abort thread
Print thread
state_machine->GetCurrentPlan()
state_machine->GetCurrentPlanUpTime()
state_machine->CommandHasError()
state_machine->QueueNewPlan()
state_machine->GetCurrentState()
state_machine->PrintState()
state_machine->AbortPlans()
The state machine is shared among the four threads of the server, and is locked by a mutex whenever it is used.
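The locking discipline can be illustrated with a small Python sketch. It assumes a heavily simplified state machine (two states, hypothetical method names) and three threads instead of four; the point is only that every method takes the same mutex, so no thread ever observes a half-updated state.

```python
import threading


class SharedStateMachine:
    """Simplified state machine whose methods all take the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "IDLE"
        self._plans = []

    def queue_new_plan(self, plan):
        with self._lock:
            if self._state != "IDLE":
                return False
            self._plans.append(plan)
            self._state = "RUNNING"
            return True

    def abort_plans(self):
        with self._lock:
            self._plans.clear()
            self._state = "IDLE"

    def snapshot(self):
        with self._lock:
            return self._state, len(self._plans)


sm = SharedStateMachine()
violations = []


def plan_thread():
    for i in range(1000):
        sm.queue_new_plan(i)


def abort_thread():
    for _ in range(1000):
        sm.abort_plans()


def print_thread():
    for _ in range(1000):
        state, n_plans = sm.snapshot()
        # Invariant: RUNNING if and only if exactly one plan is queued.
        if (state == "RUNNING") != (n_plans == 1):
            violations.append((state, n_plans))


threads = [threading.Thread(target=f)
           for f in (plan_thread, abort_thread, print_thread)]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

Because `snapshot()` reads both fields under the lock, `violations` stays empty no matter how the threads interleave.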
Plan Status subscription thread
plan_status
main thread
Server
Client
- LCM
- ZMQ Request/Reply
- ZMQ Publish/Subscribe
ROBOT_PLAN
lcm-typed messages are sent through ZMQ instead of LCM.
- Spartan has one thread for subscribing to and saving IIWA_STATUS, and another for computing commands. The command thread has to be woken up by a condition variable tied to the status thread (convoluted!).
- Decoding IIWA_STATUS should be a fairly lightweight operation (?), so I put the command computation into the callback of the IIWA_STATUS subscription.
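The "compute the command inside the status callback" design can be sketched with a toy publish/subscribe bus. Everything here is a stand-in: `InProcessBus` is a hypothetical in-process replacement for LCM, messages are plain dicts, and `hold_position` is a placeholder for the real plan's Step function.

```python
from collections import defaultdict


class InProcessBus:
    """Hypothetical stand-in for an LCM-like publish/subscribe bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, msg):
        # Deliver synchronously to every subscriber of this channel.
        for callback in list(self._subscribers[channel]):
            callback(msg)


def hold_position(status):
    """Placeholder for Plan::Step: command the measured joint positions."""
    return {"utime": status["utime"], "q_cmd": list(status["q"])}


bus = InProcessBus()
commands = []
bus.subscribe("IIWA_COMMAND", commands.append)


def on_status(status):
    # The command is computed and published inside the status callback,
    # so no second thread or condition variable is needed.
    bus.publish("IIWA_COMMAND", hold_position(status))


bus.subscribe("IIWA_STATUS", on_status)

# Three status messages at 200 Hz, i.e. 5000 microseconds apart.
for k in range(3):
    bus.publish("IIWA_STATUS", {"utime": 5000 * k, "q": [0.0] * 7})
```

The trade-off is that a slow command computation delays handling of the next status message, which is why the callback has to stay lightweight.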
PlanManager: Drake Systems version
- Implemented by wrapping a LeafSystem around the state machine.
- This may not be the recommended way to use drake systems.
- Doesn't support all features of ROS actions.
- Client is blocked when server is executing a Plan.
- Client gets Plan Status update after the plan is finished.
May not be the drake-perfect way to write down a system, but we don't really know the way drake prefers.
Performance comparison: multi-threaded and drake systems
- An IIWA_COMMAND message is published in the callback function of IIWA_STATUS subscription.
- Therefore, every IIWA_COMMAND message has a corresponding IIWA_STATUS message.
- Delay between an IIWA_COMMAND message and its corresponding IIWA_STATUS message is the time spent in PlanManager.
Delay between IIWA_COMMAND messages and their corresponding IIWA_STATUS messages.
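Since every IIWA_COMMAND has exactly one corresponding IIWA_STATUS, the delay can be computed by pairing logged timestamps in order. The sketch below assumes the timestamps have already been extracted from a log as microsecond integers. The numbers are synthetic and chosen only to show the arithmetic; they are not measurements from either implementation.

```python
from statistics import mean


def command_delays(status_utimes, command_utimes):
    """Pair the k-th status with the k-th command; return delays in microseconds."""
    if len(status_utimes) != len(command_utimes):
        raise ValueError("expected exactly one command per status message")
    delays = [c - s for s, c in zip(status_utimes, command_utimes)]
    if any(d < 0 for d in delays):
        raise ValueError("a command timestamp precedes its status message")
    return delays


# Synthetic timestamps (NOT measured data): statuses arrive every 5000 us.
status_utimes = [0, 5000, 10000]
command_utimes = [120, 5090, 10150]

delays = command_delays(status_utimes, command_utimes)
mean_delay = mean(delays)
max_delay = max(delays)
```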
- Drake system seems to add quite a bit of overhead (am I doing this wrong?).
- The biggest advantage of drake systems is generating deterministic simulation results.
Receive IIWA_STATUS
Publish IIWA_COMMAND
callback: decode message, compute command
Delay
Some drake systems profiling results
Top time consumers
Time spent in output port evaluation (which should call Plan::Step) is only a small fraction of the time in Simulator.
drake systems
multi-threaded
TODOs and discussions
- Best way to log ZMQ messages?
- Publish an LCM message along with every ZMQ message sent, so that they can be captured by lcm-spy?
- Best way to catch bugs in multi-threaded code?
- Design a sequence that covers all state transitions, and run it for a long time?
- Avoid multi-threading altogether?
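One way to try the "publish an LCM message along with every ZMQ message" idea is a thin wrapper that forwards each payload to two send functions. This is a sketch under assumptions: `MirroredSender` is a hypothetical name, and the two callables are injected so the example runs without either library. In a real setup they would presumably be a ZMQ socket's send function and an LCM publish function.

```python
class MirroredSender:
    """Send each payload over the primary transport and mirror it to a logger."""

    def __init__(self, send_zmq, publish_lcm):
        # send_zmq(payload) and publish_lcm(channel, payload) are injected
        # callables; their signatures here are assumptions for this sketch.
        self._send_zmq = send_zmq
        self._publish_lcm = publish_lcm

    def send(self, channel, payload):
        self._send_zmq(payload)
        # The mirror copy exists only so the message shows up in logging tools.
        self._publish_lcm(channel, payload)


zmq_sent = []
lcm_published = []
sender = MirroredSender(
    send_zmq=zmq_sent.append,
    publish_lcm=lambda channel, payload: lcm_published.append((channel, payload)),
)
sender.send("ROBOT_PLAN", b"encoded plan bytes")
```

The cost is that every message is sent twice, so this is probably best kept behind a debug flag.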
robot_plan_runner
By Pang