Raft-rs

A maturing Rust implementation of Raft

An Introduction by

Andrew (Hoverbear) Hobden

MIT Licensed

No CLAs

100% Community Guided

New Contributor Friendly

Questions Welcome
(Even during the talk!)

(Currently) Nightly Only & Not on Crates.io

Friends of the Tree

@danburkert, @james-darkfox, @tschottdorf

@ongardie, @carllerche, @dwrensha

#rust@irc.mozilla.org

The Big Picture

History

  • 1st prototype developed for a Distributed Systems class at the University of Victoria, on ~v0.9 nightly Rust.
  • Weekly(ish) breaking changes, fun!
  • Very naive, simple implementation on (mostly) stdlib.

A few finals later...

  • Numerous libraries discovered or emerged during development.
  • People started coding with me! WOW.
  • 2nd prototype planned using emerging ecosystem to solve problems that existed.
  • Crafted over time, without a hard deadline, resulting in better decisions and code.
  • Mostly functional today, and getting better daily!

Goals

  • Fast
  • Unopinionated
  • Correct
  • General Purpose

Non-Goals

  • Library/Idiom dictation
  • Lock-in

Not just a key/value store! Build what you need.

Raft-rs is more of an RPC and replication framework than a database or cache.

MIO

  • Powerful asynchronous event loop.
  • Enables single-threaded design.
  • Interesting and Low Level.
  • Way more fun than Node!
  • ☹ Poor Windows support

Cap'n Proto

  • Fast Cerealization Protocol.
  • No serialize/deserialize.
  • Messages represent wire format.
  • Cross platform schemas.
  • ☹ Poor async support
    (Dan forked it to add async! Improvements here soon!)
  • It's bors!
  • Keeps master green.
  • Encourages code review.

How It All Connects

Built For You

Client

  • Provides all communication between program and the cluster.
  • Clients can be inside or outside of the cluster.
  • Calls block until it is appropriate to respond.
  • Design your own interactions with the state machine!

A Simple API

// Create a Client associated with a cluster.
let mut client = raft::Client::new(cluster);
// Passes through durable log, is committed, may mutate state machine.
let prop_res = client.propose(msg);
// Immutable access to state machine.
let query_res = client.query(msg);

Server

  • Acts as a MIO reactor.
  • Virtually all networking and messaging code.
  • Buffers incoming messages and dispatches to associated Consensus Module.
  • Acts appropriately based on Actions returned by the module.
  • No Raft algorithm specific logic.

A(nother) Simple API

// Spawn a server. This will block the current thread as long as it's running.
Server::run(id, addr, peers, persistent_log, state_machine).unwrap();

Consensus

  • Contains all code related to the core Raft Algorithm.
  • Zero networking or filesystem code. (Easy testing!)
  • Handles elections, replication, committing, etc.
  • Exclusive interfaces to the State Machine and Log.
  • Hands Actions back to the Server.
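
The split between Server and Consensus can be pictured with a small sketch. This is not the raft-rs source: the `Message`, `Action`, `Role`, and `Consensus` types below are hypothetical stand-ins, and only a slice of leader election is modelled (no vote granting, no stale-term handling, no log checks). The point is the shape: the module does no I/O and only returns a list of actions for its caller to carry out.

```rust
use std::collections::HashSet;

// Hypothetical identifiers; the real crate's types may differ.
type ServerId = u64;
type Term = u64;

/// Illustrative subset of the messages a consensus module might receive.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    RequestVote { term: Term, candidate: ServerId },
    VoteGranted { term: Term, from: ServerId },
}

/// Side effects requested by the consensus module. The server performs them.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Send { to: ServerId, message: Message },
    BecomeLeader,
}

#[derive(Debug, PartialEq)]
enum Role {
    Follower,
    Candidate,
    Leader,
}

/// Pure state: no sockets, no files, so it can be tested directly.
struct Consensus {
    id: ServerId,
    peers: Vec<ServerId>,
    term: Term,
    role: Role,
    votes: HashSet<ServerId>,
}

impl Consensus {
    fn new(id: ServerId, peers: Vec<ServerId>) -> Self {
        Consensus { id, peers, term: 0, role: Role::Follower, votes: HashSet::new() }
    }

    /// Called by the server when the election timer fires.
    fn election_timeout(&mut self) -> Vec<Action> {
        self.term += 1;
        self.role = Role::Candidate;
        self.votes.clear();
        self.votes.insert(self.id); // a candidate votes for itself
        self.peers
            .iter()
            .map(|&to| Action::Send {
                to,
                message: Message::RequestVote { term: self.term, candidate: self.id },
            })
            .collect()
    }

    /// Called by the server for each decoded message.
    fn handle(&mut self, message: Message) -> Vec<Action> {
        match message {
            Message::VoteGranted { term, from }
                if self.role == Role::Candidate && term == self.term =>
            {
                self.votes.insert(from);
                let cluster_size = self.peers.len() + 1;
                // A strict majority of the cluster wins the election.
                if self.votes.len() > cluster_size / 2 {
                    self.role = Role::Leader;
                    return vec![Action::BecomeLeader];
                }
                vec![]
            }
            // Everything else is ignored in this sketch.
            _ => vec![],
        }
    }
}
```

A test can play the part of the server: call `election_timeout`, inspect the returned `Send` actions, feed back a `VoteGranted`, and check that `BecomeLeader` comes out, all without opening a socket.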

Doing It Yourself

Persistent Log

  • Represents the replicated, persistent log of your application.

  • Has a strong ordering such that A → B → C and should only act to store information.

  • Entries placed into the log should not be acted on by the consuming application.

  • Generally not application specific.

  • Warning: API is not set in stone. Expect improvements!

Persistent Log API

/// Returns the latest known term.
fn current_term(&self) -> result::Result<Term, Self::Error>;

/// Sets the current term to the provided value. The provided term must be greater than
/// the current term. The `voted_for` value will be reset.
fn set_current_term(&mut self, term: Term) -> result::Result<(), Self::Error>;

/// Increment the current term. The `voted_for` value will be reset.
fn inc_current_term(&mut self) -> result::Result<Term, Self::Error>;

/// Returns the candidate id of the candidate voted for in the current term (or none).
fn voted_for(&self) -> result::Result<Option<ServerId>, Self::Error>;

/// Sets the candidate id voted for in the current term.
fn set_voted_for(&mut self, server: ServerId) -> result::Result<(), Self::Error>;

/// Returns the index of the latest persisted log entry (0 if the log is empty).
fn latest_log_index(&self) -> result::Result<LogIndex, Self::Error>;

/// Returns the term of the latest persisted log entry (0 if the log is empty).
fn latest_log_term(&self) -> result::Result<Term, Self::Error>;

/// Returns the entry at the provided log index.
fn entry(&self, index: LogIndex) -> result::Result<(Term, &[u8]), Self::Error>;

/// Appends the provided entries to the log beginning at the given index.
fn append_entries(&mut self, from: LogIndex, entries: &[(Term, &[u8])]) -> result::Result<(), Self::Error>;
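
To get a feel for the contract, here is a minimal in-memory sketch. It assumes a trait shaped like the signatures above; the trait name `Log`, the `u64` aliases, `MemLog`, and `MemLogError` are all hypothetical, and only a subset of the methods is covered. It also assumes 1-based indices (since an empty log reports index 0) and that appending at an existing index replaces everything from that index onward. An in-memory log is not durable, so this is for tests and experiments only.

```rust
use std::result;

// Hypothetical aliases; the real crate defines its own types.
type Term = u64;
type LogIndex = u64;
type ServerId = u64;

/// Subset of the log interface, mirroring the signatures on the slide.
trait Log {
    type Error;
    fn current_term(&self) -> result::Result<Term, Self::Error>;
    fn set_current_term(&mut self, term: Term) -> result::Result<(), Self::Error>;
    fn voted_for(&self) -> result::Result<Option<ServerId>, Self::Error>;
    fn set_voted_for(&mut self, server: ServerId) -> result::Result<(), Self::Error>;
    fn latest_log_index(&self) -> result::Result<LogIndex, Self::Error>;
    fn entry(&self, index: LogIndex) -> result::Result<(Term, &[u8]), Self::Error>;
    fn append_entries(&mut self, from: LogIndex, entries: &[(Term, &[u8])])
        -> result::Result<(), Self::Error>;
}

#[derive(Debug, PartialEq)]
enum MemLogError {
    StaleTerm,
    BadIndex,
}

/// Keeps everything in memory, so nothing survives a restart.
#[derive(Default)]
struct MemLog {
    current_term: Term,
    voted_for: Option<ServerId>,
    entries: Vec<(Term, Vec<u8>)>,
}

impl Log for MemLog {
    type Error = MemLogError;

    fn current_term(&self) -> result::Result<Term, Self::Error> {
        Ok(self.current_term)
    }

    fn set_current_term(&mut self, term: Term) -> result::Result<(), Self::Error> {
        // The new term must be strictly greater than the current one.
        if term <= self.current_term {
            return Err(MemLogError::StaleTerm);
        }
        self.current_term = term;
        self.voted_for = None; // a new term starts with no vote cast
        Ok(())
    }

    fn voted_for(&self) -> result::Result<Option<ServerId>, Self::Error> {
        Ok(self.voted_for)
    }

    fn set_voted_for(&mut self, server: ServerId) -> result::Result<(), Self::Error> {
        self.voted_for = Some(server);
        Ok(())
    }

    fn latest_log_index(&self) -> result::Result<LogIndex, Self::Error> {
        Ok(self.entries.len() as LogIndex)
    }

    fn entry(&self, index: LogIndex) -> result::Result<(Term, &[u8]), Self::Error> {
        // Assumption: indices are 1-based, so index 0 never names an entry.
        if index == 0 {
            return Err(MemLogError::BadIndex);
        }
        match self.entries.get(index as usize - 1) {
            Some((term, data)) => Ok((*term, data.as_slice())),
            None => Err(MemLogError::BadIndex),
        }
    }

    fn append_entries(&mut self, from: LogIndex, entries: &[(Term, &[u8])])
        -> result::Result<(), Self::Error>
    {
        // `from` may point at an existing entry or one past the end, never further.
        if from == 0 || from as usize > self.entries.len() + 1 {
            return Err(MemLogError::BadIndex);
        }
        // Assumption: entries at or after `from` are discarded and replaced.
        self.entries.truncate(from as usize - 1);
        self.entries.extend(entries.iter().map(|&(term, data)| (term, data.to_vec())));
        Ok(())
    }
}
```

A durable implementation would write the term, vote, and entries to stable storage before returning `Ok`, which is why every method returns a `Result` with an implementation-chosen error type.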

State Machine

  • Represents the replicated state of the cluster.
  • Consistent across all nodes.
  • Is not mutated without going through Raft.
  • Responsible for triggering log compaction.
    • (We've yet to do this!)
  • Warning: API is not set in stone. Expect improvements!

State Machine API

/// Applies a command to the state machine.
/// Returns an application-specific result value.
fn apply(&mut self, command: &[u8]) -> Vec<u8>;

/// Queries a value of the state machine. Does not go through the durable log, or mutate the
/// state machine.
/// Returns an application-specific result value.
fn query(&self, query: &[u8]) -> Vec<u8>;

/// Take a snapshot of the state machine.
fn snapshot(&self) -> Vec<u8>;

/// Restore a snapshot of the state machine.
fn restore_snapshot(&mut self, snapshot: Vec<u8>);
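
As a rough illustration of how small a state machine can be, here is a single-value register. The trait below is a hypothetical restatement of the four methods on the slide, and `Register` is an invented example type, not something taken from the crate. Commands and queries are opaque bytes, so each application chooses its own encoding.

```rust
/// Hypothetical trait mirroring the four methods shown on the slide.
trait StateMachine {
    fn apply(&mut self, command: &[u8]) -> Vec<u8>;
    fn query(&self, query: &[u8]) -> Vec<u8>;
    fn snapshot(&self) -> Vec<u8>;
    fn restore_snapshot(&mut self, snapshot: Vec<u8>);
}

/// A register holding one value: every command overwrites it.
#[derive(Default)]
struct Register {
    value: Vec<u8>,
}

impl StateMachine for Register {
    /// Stores the command bytes as the new value and echoes them back.
    fn apply(&mut self, command: &[u8]) -> Vec<u8> {
        self.value = command.to_vec();
        self.value.clone()
    }

    /// Ignores the query bytes and returns the current value without mutating.
    fn query(&self, _query: &[u8]) -> Vec<u8> {
        self.value.clone()
    }

    /// The whole state is the value, so the snapshot is just a copy of it.
    fn snapshot(&self) -> Vec<u8> {
        self.value.clone()
    }

    /// Replaces the current state with the snapshot contents.
    fn restore_snapshot(&mut self, snapshot: Vec<u8>) {
        self.value = snapshot;
    }
}
```

Because `apply` is only ever called with committed entries, two registers fed the same commands in the same order end in the same state, and a fresh node can catch up by restoring a snapshot taken from another.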

Building Together

Road to 1.0

  • Membership Changes
  • Log Compaction
  • Snapshotting
  • Testing of common failure situations
  • Client Robustness

Future Fun

  • Verified communication
  • C Bindings
  • New libraries to use
  • Intensive failure testing

1.0 should represent a full implementation of Diego's Paper.

Get Involved

Help us:

  • Define it!
  • Abstract it!
  • Test it!
  • Break it!
  • Fix it!
  • Build off of it!

Find us on:
