Backup and Restore in Cassandra and OpsCenter

Overview

  • Snapshot Operations
  • Restore Operations
  • Commit Log Archiving/Point in Time Restore
  • Remote backup
  • From both the Cassandra and OpsCenter perspectives

Snapshots

Nodetool Snapshot Basics

Performs a flush, then hard links SSTables to

<data_file_directories>/<ks>/<table>/snapshots/<snapshot-name>/

Under the hood, mbeans:

org.apache.cassandra.db
  ->StorageService
    ->takeSnapshot

More at

http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsSnapShot.html
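
As a rough illustration of the snapshot step, here is a minimal Python sketch that shells out to nodetool and then locates the resulting snapshot directories. The function names and the data directory argument are assumptions made for this example; only `nodetool snapshot -t <tag> <keyspace>` and the `snapshots/<snapshot-name>` layout come from Cassandra itself.

```python
import subprocess
from pathlib import Path


def snapshot_command(tag, keyspace):
    """Build the nodetool invocation for a named snapshot of one keyspace."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def take_snapshot(tag, keyspace):
    """Run nodetool on the local node; raises CalledProcessError on failure."""
    subprocess.run(snapshot_command(tag, keyspace), check=True)


def snapshot_dirs(data_dir, keyspace, tag):
    """Return the per-table snapshot directories for one keyspace and tag.

    Follows the <data_dir>/<ks>/<table>/snapshots/<tag>/ layout.
    """
    pattern = f"{keyspace}/*/snapshots/{tag}"
    return sorted(p for p in Path(data_dir).glob(pattern) if p.is_dir())
```

Note that this only snapshots the node it runs on; a cluster-wide backup has to repeat it on every node.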

Snapshots in Opscenter

  • Under Services -> Backup
  • Displays backup history, allows backup and restore.
  • Advanced settings we'll cover later
  • Backup Service is an Enterprise Feature

More at

http://docs.datastax.com/en/opscenter/5.2/opsc/online_help/services/opscBackupService.html

Snapshots in Opscenter

  • Schedule repeated backups or create ad hoc backup
  • Select keyspaces
  • Set location (on server vs s3)
  • Uses the mbean to perform the snapshot rather than shelling out.
  • Coordinates the snapshot on all nodes.
  • Backs up the schema to schema.json
  • Keeps a log for audit

Auditable Records

Remote Snapshots

  • OpsCenter can also back up to S3
  • Specify the S3 bucket name and AWS credentials
  • Optional transfer throttle and compression
  • Not every SSTable needs to be uploaded on each backup: SSTables are immutable, so only the files not already backed up have to be sent.
  • SSTables need to be stored per node to avoid name collisions.
  • Dropping and recreating a table can also lead to a naming collision, so OpsCenter can attach a timestamp.
  • If your data is encrypted, make sure that the encryption key is also put somewhere safe.
  • OpsCenter backs up schemas
  • Topologies change over time (more on this in restore).
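
To make the "only part of the data" point concrete, here is a minimal sketch of picking which snapshot files still need uploading. The per-node key layout (`<node_id>/<keyspace>/<table>/<file>`) and both function names are hypothetical and are not OpsCenter's actual scheme; the set of already uploaded keys is assumed to come from listing the bucket.

```python
from pathlib import Path


def remote_key(node_id, keyspace, table, filename):
    """Hypothetical remote key; the node prefix keeps per-node names apart."""
    return f"{node_id}/{keyspace}/{table}/{filename}"


def files_to_upload(snapshot_dir, node_id, keyspace, table, already_uploaded):
    """Return (local_path, remote_key) pairs that are not yet in the backup.

    SSTables are immutable, so a key that already exists remotely
    never has to be sent again.
    """
    pending = []
    for path in sorted(Path(snapshot_dir).iterdir()):
        if not path.is_file():
            continue
        key = remote_key(node_id, keyspace, table, path.name)
        if key not in already_uploaded:
            pending.append((path, key))
    return pending
```

This sketch does not handle the drop-and-recreate collision mentioned above; that would need something like a timestamp in the key.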

Restore Operations

SSTableloader Basics

  • Expects the schema to already exist for the sstables.
  • Expects a directory structure different from that created by the snapshot, specifically <Keyspace>/<Table>/<files>
  • Can stream data to other nodes, doesn't just move files into place
  • Leaves the source files in place after they are loaded, so a restore can temporarily cost extra disk space.

More at

http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsBulkloader_t.html

Restore Operations

  • Select a backup from a list of available snapshots.
  • Point in Time restores (more on this later)
  • Restore from other location

Restore Operations

  • Attempts to recreate the schema or do a schema comparison. The latter is extremely difficult with Thrift.
  • Creates symbolic links in a temporary directory to match what SSTableloader expects.
  • Logs/audit trail to follow.
  • Uses SSTableloader
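
The symbolic-link staging step can be sketched roughly as follows. This is an illustration under assumptions, not OpsCenter's code: the function names and the staging directory are made up for the example, and the only sstableloader option used is `-d` (the initial contact hosts).

```python
import os
from pathlib import Path


def stage_for_sstableloader(snapshot_dir, staging_root, keyspace, table):
    """Symlink snapshot files into <staging_root>/<keyspace>/<table>/.

    sstableloader infers keyspace and table from the last two path
    components, which the snapshot directory layout does not provide.
    """
    target = Path(staging_root) / keyspace / table
    target.mkdir(parents=True, exist_ok=True)
    for src in Path(snapshot_dir).iterdir():
        link = target / src.name
        # lexists also catches a dangling link left by an earlier run.
        if src.is_file() and not os.path.lexists(link):
            os.symlink(src.resolve(), link)
    return target


def sstableloader_command(staged_dir, initial_hosts):
    """Build the sstableloader invocation for a staged directory."""
    return ["sstableloader", "-d", ",".join(initial_hosts), str(staged_dir)]
```

Using links rather than copies avoids doubling the on-disk footprint of the snapshot while it is being loaded.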

Remote Restore

  • Topologies change over time.
  • When topologies shrink, multiple nodes' worth of data has to be sent to a single node (SSTable naming collisions).

Remote Restore

  • When topologies grow some nodes may be idle during a restore.
  • Replacement nodes will have a different host ID and will need to be matched to the host ID recorded in the snapshot.
  • OpsCenter handles all of these cases.
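
One way to picture the shrink, grow, and replacement cases is as an assignment of backed-up host IDs to the nodes that exist now. The sketch below is purely illustrative and is not how OpsCenter decides; `assign_backups` is a hypothetical helper that keeps a backup on its original node when that host ID still exists and otherwise hands it to the least-loaded current node.

```python
def assign_backups(backup_host_ids, current_host_ids):
    """Map each current node to the backed-up host IDs it should load."""
    if not current_host_ids:
        raise ValueError("no nodes available to restore onto")
    plan = {host: [] for host in current_host_ids}
    unmatched = []
    for backup_id in backup_host_ids:
        if backup_id in plan:
            plan[backup_id].append(backup_id)  # same node still exists
        else:
            unmatched.append(backup_id)        # node was removed or replaced
    for backup_id in unmatched:
        # Pick the current node with the fewest backups assigned so far.
        host = min(current_host_ids, key=lambda h: len(plan[h]))
        plan[host].append(backup_id)
    return plan
```

With a shrunken cluster one node ends up with several backups; with a grown cluster some nodes get an empty list and sit idle during the restore.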

Commit Log Archiving

  • Cassandra can execute a script when writing commit log segments
  • Set in commitlog_archiving.properties

http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configLogArchive_t.html
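
As a sketch of what such a script might look like, the Python below copies one commit log segment into an archive directory. The archive path and script location are assumptions for this example. It would be wired up with a line such as `archive_command=/usr/bin/python3 /path/to/archive_segment.py %path %name` in commitlog_archiving.properties, where `%path` and `%name` are the placeholders Cassandra substitutes.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical archive location; point this at your backup volume.
ARCHIVE_DIR = Path("/backup/commitlog")


def archive_segment(segment_path, segment_name, archive_dir=ARCHIVE_DIR):
    """Copy one commit log segment into the archive, skipping existing files."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / segment_name
    if not dest.exists():
        shutil.copy2(segment_path, dest)
    return dest


if __name__ == "__main__" and len(sys.argv) == 3:
    # Invoked by Cassandra as: archive_segment.py %path %name
    archive_segment(sys.argv[1], sys.argv[2])
```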

Commit Log Archiving

  • OpsCenter can also enable this, under Services -> Backup Service -> Settings
  • OpsCenter can send the archived commit logs to S3 as well.

http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configLogArchive_t.html

Point in Time Restore

  • Two-step operation: restore a snapshot, then replay commit logs.
  • Find the nearest snapshot taken before the desired point in time and restore it.
  • Update commitlog_archiving.properties with the location of the commit logs and the point in time to restore to.
  • Restart Cassandra.
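
The first three steps can be sketched as below. The function names and directory are hypothetical. The property keys (`restore_command`, `restore_directories`, `restore_point_in_time`) are the ones in commitlog_archiving.properties, and the timestamp is assumed to use the `yyyy:MM:dd HH:mm:ss` GMT format described in that file; check it against your Cassandra version.

```python
from datetime import datetime, timezone


def nearest_prior_snapshot(snapshot_times, target):
    """Return the latest snapshot time that is not after the target."""
    candidates = [t for t in snapshot_times if t <= target]
    if not candidates:
        raise ValueError("no snapshot exists before the requested time")
    return max(candidates)


def restore_properties(target, commitlog_dir):
    """Render the restore lines for commitlog_archiving.properties.

    `target` must be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = target.astimezone(timezone.utc).strftime("%Y:%m:%d %H:%M:%S")
    return (
        "restore_command=cp -f %from %to\n"
        f"restore_directories={commitlog_dir}\n"
        f"restore_point_in_time={stamp}\n"
    )
```

After the chosen snapshot is restored and the properties are written, the node still has to be restarted for the replay to happen.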

More At

http://docs.datastax.com/en//cassandra/2.0/cassandra/configuration/configLogArchive_t.html

PiT in Opscenter

  • OpsCenter can automate the PiT restore process
  • Set the time (in UTC); OpsCenter will verify that it is capable of restoring to that point in time.
  • Commit logs and snapshots can be local or on S3

PiT Restore Challenges

  • Commit log replays don't stream data around the ring, which makes topology changes difficult to handle.
  • Comparing schemas can be tricky if the replay contains schema changes.

Questions?

Feel free to reach out:

https://www.linkedin.com/in/philipsdoctor

Backup and Restore in Cassandra and OpsCenter

By Philip Doctor
