Caching

priyatam mudivarti
principal engineer, architect
level money platform, capital one

half a billion

USER TRANSACTIONS

1 million downloads

iOS, Android, Web (on-boarding only)

8GB of daily Cache

a lot of data

32k Clojure LOC, ~1k Cljs LOC
3.5k Python LOC for tests
~400 million user transactions

Level App + platform

User transaction is a single amount (credit/debit) from your bank, credit card, etc.,

Programs exhibit spatial & temporal locality

application caches focus on
temporal locality

If you access data x at time t, you are likely to access data x at time t +/- delta

Cache

t1

tn

level

Naming

EviCtion

INVALIDATION

There must be a proper way to get informed that the information held by a cache needs to be discarded.

who do I tell?

Same user requests the same information multiple times, or multiple users request the same information.

EVICTION

How long do I exist?

Most algorithms you heard about are on evictions

DecoupLE them

EVICTION & INVALIDATION

why are they coupled?

Eviction may not exist without invalidation*

Invalidation has a property of being able to evict. It exists as a function of eviction

Memcache knows how to evict stale objects.

Let it do its job.

the question is not what time a function is called, but how invalidation requests are relative to time

*unless the cache runs out of memory

Naming

{cache-name}-{uid}-{cache-version}-{logical-timestamp}

auto-invalidation comes for free, whenever we bump the cache-version or logical-timestamps

immutable CACHE KEYS

{cache-name}-{uid}-{cache-version}-{logical-timestamp}

Naming

EviCtion

INVALIDATION

local.cache

remote.cache

generational.cache

Level

Chapters

Application cache

Level appliction cache is a distributed generational cache, built on Memcached and core.cache

Name - description of what we’re caching

Dependencies: Other caches it relies on (hierarchical)

Compute - a function invoked to fill the cache.

Serialize/deserialize fns - you know what this means

Version - increments when “compute” semantics change: to prevent reading data computed by the old function

LEVEL Cache


(def level-cache
  {:name "suggested-new-funds"
   :version 7
   :dependencies #{:suggested-budget-update}
   :compute (fn [uid now computed]
              (bf/make
               bf/funds-holder
               {:funds (filter (complement :fund-id)
                               (:funds (:suggested-budget-update computed)))}))
   :serialize bf/pb->bytebuffer
   :deserialize #(bf/decode bf/funds-holder %)})

Users' transactions

Funds

Predicted transactions

Balance projection (derived from predicted transactions)

Suggested funds & buckets (pattern detection)

Buckets, including matched transactions

Budgets, synthetic blobs like Funds & Spendable

OUR ApplicAtion CAche

texans

base funds

monthly texans
projection

all funds

suggested budget update

insights

texans

projection

transactions

projection

balance

projection

LEVEL Cache - Dependencies

(def cache-order [:base-funds
                  :raw-insight-rows
                  :stripped-texans
                  :multi-month-texans-projection
                  :texans-projection
                  :transactions-projection
                  :full-funds
                  :balance-projection
                  :suggested-budget-update
                  :suggested-funds
                  :suggested-new-funds
                  :insights
                  :cards
                  :budget-snapshot])

Dependency graph is a DAG—an ordered list.

No branches.

(def cache-order [:full-funds
                  :balance-projection
                  :suggested-budget-update
                  :suggested-funds
                  :suggested-new-funds
                  :insights
                  :cards
                  :budget-snapshot])

Invalidate full funds?

Invalidation is simple: invalidate rest

Hierarchical cache.

Caches can have as many dependencies as you want

Pass current transactions to predict their future transactions. Make the future transactions cache depend on the current transactions one - conceptually it queries that cache

Dependency Loops are a compile-time error

On dependencies


(defn validate
  "Validates cache config and dependencies."
  [cachez]
  (when-not (= (set cache-order) (set (keys cachez)))
    (throw (Exception. (str "Inconsistent cache config: " cache-order " and " (keys cachez)))))
  (doseq [cache-name cache-order]
    (doseq [dependency (:dependencies (cachez cache-name))]
      (when-not (cachez dependency)
        (throw (Exception. (str "Inconsistent cache config: Unrecognized dependency " 
          dependency " for " cache-name))))
      (when-not (> (cache-offsets cache-name) (cache-offsets dependency))
        (throw (Exception. (str "Dependency " dependency " is calculated 
          after dependent cache" cache-name)))))))

@platform

#dynamo

LEVEL CachE

cache logic

API

Queues

query returns nil? recompute

incrementing the timestamp means there won’t be anything in Memcache with that name, so we’ll get a cache miss

queues ensure cache is "filled", invalidate every 24 hrs

atomically query the logical timestamps

logical timestamps start at 1, get atomically incremented to invalidate the cache

An immutable key tied to how values in the cache are produced.

{cache-name}-{uid}-{cache-version}-{logical-timestamp}

prevents race conditions, maybe

A Queue-backed service gets a message whenever we invalidate the cache, and refills the cache

Each key also is associated with a “generation” (“user-logical-timestamps” ). To invalidate we simply increment the cache generation. Anyone still using an old generation still has access to it.

RACE CONDITIONS

@platform

#dynamo

#memcache

REMOTE Cache (memcache)

API

Queues

cache logic

cache component

Memcache LRU eviction strategy removes invalid data or TTL after 24 hrs kicks in

delegate to

Gzip + Protocol Buffers

atomically query the logical timestamps

Often caches don't know how often they’re full

A Queue-backed service gets a message whenever we invalidate the cache, and refills the cache

New transactions coming in (including pending) will result in the transactions cache being invalidated, then queried, so it ends up full

When is it full?

If they are slow and being called from the API

Don't add if they’re only called occasionally from the backend, as we’ll be wasting CPU cycles keeping the cache filled

When should I add?

Serialization and deserialization is slow

MEM CACHE - Problems

Gzipping 1MB payload takes 100ms

Nippy is faster, but still ...

Core cache

CacheProtocol

Nice implementations of common (and uncommon) eviction strategies: FIFO, TTL, LRU, LIRS

Many implementations by third party libraries, including Spycache clojure lib.

Composable


(defprotocol CacheProtocol
  "This is the protocol describing the basic cache capability."
  (lookup [cache e]
          [cache e not-found]
   "Retrieve the value associated with `e` if it exists, else `nil` in
   the 2-arg case.  Retrieve the value associated with `e` if it exists,
   else `not-found` in the 3-arg case.")
  (has?    [cache e]
   "Checks if the cache contains a value associated with `e`")
  (hit     [cache e]
   "Is meant to be called if the cache is determined to contain a value
   associated with `e`")
  (miss    [cache e ret]
   "Is meant to be called if the cache is determined to **not** contain a
   value associated with `e`")
  (evict  [cache e]
   "Removes an entry from the cache")
  (seed    [cache base]
   "Is used to signal that the cache should be created with a seed.
   The contract is that said cache should return an instance of its
   own type."))

@platform

#dynamo

#memcache

Local Cache (Core.Cache)

API

Queues

cache logic

cache component

core
cache

Write-through/Read-through

Delays

memoize DONE RIGHT

http://kotka.de/blog/2010/03/memoize_done_right.html

Clojure does not provide ready-made solutions, but it provides a lot of tools. The art is to combine them correctly. And when you are done, you end up with something small and elegant.

— Meikel Brandmeyer

public Delay(IFn fn){
    this.fn = fn;
    this.val = null;
    this.exception = null;
}

synchronized public Object deref() {
	if(fn != null) {
		try {
		    val = fn.invoke();
		}
		catch(Throwable t) {
			exception = t;
		}
		fn = null;
	}
	if(exception != null)
		throw Util.sneakyThrow(exception);
	return val;
}

Delay => Future => Promise

delay is guaranteed to run only once, and caches the results the remaining times

(ns cache.core)

(defn query-cache
  "Top level query api for application access."
  [uid timestamp tz cache-name params]
  (if-let [cache (caches cache-name)]
     (let [logical-ts   (core/get-user-logical-timestamps cache-name uid)
           base-key     [cache-name uid (:version cache) params logical-ts]
           query-dag    (fn [dep-name]
                         (let [default-params (:default-params (caches dep-name))]
                           [dep-name (query-cache dep-name
                                                  uid
                                                  timestamp
                                                  tz
                                                  (default-params uid timestamp))]))

           gencache!    (swap! cache-component/instance
                               cache-component/query-through base-key delayed-val!
                              (:serialize cache)
                              (:deserialize cache))]

           delayed-val! (delay
                          (let [deps (into {} (map query-dag (:dependencies cache)))]
                            (recalculate! cache-name uid params timestamp deps)))

the atom's result holds a delayed value ... derefed elsewhere

      @(:result gencache!))
    (throw (Exception. (str "Querying unknown cache:" cache-name "for" uid)))))

delay recalculate! which calls the compute function of given cache

get the logical timestamp and recursively compute deps and pass to others

return the swapped value in the atom

      (let [memcache-delay (delay
                            (if-let [memcache-value (try (core/item->value 
                                                           (core-cache/lookup memcache key) 
                                                           deserialize-fn)
                                                         (catch Exception e nil))]
                              memcache-value

(ns cache.component)

(def instance (atom nil))

(defn query-through
  [{:keys [memcache localcache] :as gencache} base-key value-delay serialize-fn deserialize-fn]
  (let [key (apply core/stringify-key  base-key )]
    (if-let [local-delay (core-cache/lookup localcache key)]
     (merge gencache
            {:localcache (core-cache/hit localcache key)
             :result local-delay}))

        (merge gencache
               {:localcache (core-cache/miss localcache key memcache-delay)
                :result memcache-delay})))))

first, look at local cache, if found cach-hit ... put it in :result

                              (do
                                (core-cache/miss memcache key
                                                 (core/value->item @value-delay serialize-fn))
                                @value-delay)))]

then, look at mem cache and return its delayed value

if not, cache-miss and deref delayed value

finally, return memcache delay

:result => memcache-delay => recompute value delay

delay the recalculcuation

@platform

#dynamo

#memcache

A genertion cache, with local and remote providers

API

Queues

cache logic

cache component

core
cache

One physical cache supports many logical caches

That queue shouldn’t back up—compute faster or provision more machines

Most caches don’t actually need to invalidate every day

Cache misses can be slow —profiling could help

Things to remember

Performance & Metrics

memcache
node

local
cache

load balancer

~1-2 ms

~100x

Network latencies on PROD

local
cache

~3000x

median times

dynamo

Memcache uses consistent hashing, however ...

Watch your Dashboards

Metrics

http://metrics.dropwizard.io

simple wrappers in clojure

cache-component wraps metrics for each miss, hit, and query

timers can be logged via cache-component

no-cache = ~5s - 25s

memcache = ~500ms

local-cache = ~

p99

Clojure Performance

profile compute functions

understand clojure seqs are slow

try to sort first

make good use of rseq in filters

when in doubt, read the clojure source


(defn recent-bucketed? [tz new-card cards texan]
  (first-filter #(and (recent-txn? tz % texan)
                      (= :bucketed-transactions (:card-type %))
                      (= (:bucket-ids (:bucketed-transactions %))
                         (:bucket-ids (:bucketed-transactions new-card)))
                      (every? (comp #{:legacy-insight :debit-insight :credit-insight}
                                    :bucket-type)
                              (:buckets (meta new-card))))
                cards))


(defn- recent-cards
  "Optimized version of recent cards. Reverses the cards internally, and works
   iff cards are ordered from oldest to newest."
  [utz texan cards]
  (take-while (partial recent-card? utz texan) (rseq cards)))

(defn recent-bucketed? [utz new-card cards texan]
  (first-filter #(and (= :bucketed-transactions (:card-type %))
                      (= (:bucket-ids (:bucketed-transactions %))
                         (:bucket-ids (:bucketed-transactions new-card)))
                      (every? (comp #{:legacy-insight :debit-insight :credit-insight}
                                    :bucket-type)
                              (:buckets (meta new-card))))
                (recent-cards utz texan cards)))



;; create-cards is a reducer over all cards (of transactions)
;; create-cards => bundle-cards => recent-bucketed?

Protocol

it's just a single function



(defprotocol GenCache
  (query [uid timestamp timezone cache-name]
    "For a given user id, at a given instant, query a cache-name"))

Hello, again

3 files

~600 lines

1 protocol

decoupled compute, serialize, deserialize

component with local and remote providers

generational cache

Adding a new cache entry takes a few hours ... less than a day

The future

Expand DAG with Component pattern

Each cache is a component

Compose Caches

Optimize Load Balancer — Load Cache

Async computation?

Dave Fayram

Gregor Stocks

Gregory Sizemore

Jeff Shecter

Jonathan Polin

Noodles

Priyatam Mudivarti

Director, Chief Architect

Backend Lead

Senior Performance Engineer

Data Science Lead

Software engineer

Github PR Reviewer*

Principal Engineer, Architect

Thanks!

- Level Money Platform

Big thanks to ...

References

- http://kotka.de/blog/2010/03/memoize_done_right.html
- https://github.com/clojure/core.cache
- https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/
- https://www.mikeperham.com/2009/01/14/consistent-hashing-in-memcache-client/-
- https://github.com/memcached/memcached/blob/master/doc/protocol.txt

https://slides.com/priyatam/caching-half-a-billion-user-txns

Caching half a billion user transactions

By Priyatam Mudivarti

Caching half a billion user transactions

While many robust Cache libraries exists, understanding cache invalidation and eviction in a distributed platform requires a deeper understanding of your data access patterns. At Level Money we built our own caching solution, backed by a Memcached Farm. In this talk I will share our story: how we name and version our caches, grow dependency graphs, find expensive computations, avoid race conditions, and define compute functions to serialize financial data using a simple Protocol.

1,232

Priyatam Mudivarti

Priyatam

Caching

half a billion

USER TRANSACTIONS

Level App + platform

Programs exhibit spatial & temporal locality

Naming

EviCtion

INVALIDATION

INVALIDATION

EVICTION

DecoupLE them

EVICTION & INVALIDATION

Naming

immutable CACHE KEYS

Naming

EviCtion

INVALIDATION

Level

Chapters

Application cache

LEVEL Cache

OUR ApplicAtion CAche

LEVEL Cache - Dependencies

Dependency graph is a DAG—an ordered list.

No branches.

On dependencies

LEVEL CachE

RACE CONDITIONS

REMOTE Cache (memcache)

When is it full?

When should I add?

MEM CACHE - Problems

Core cache

Local Cache (Core.Cache)

Delays

memoize DONE RIGHT

Delay => Future => Promise

A genertion cache, with local and remote providers

Things to remember

Performance & Metrics

Network latencies on PROD

Watch your Dashboards

Metrics

Clojure Performance

Protocol

Hello, again

3 files

~600 lines

1 protocol

decoupled compute, serialize, deserialize

component with local and remote providers

generational cache

The future

Thanks!

References

Caching half a billion user transactions

More from Priyatam Mudivarti