Resource management in Clojure

@robashton

Background

Haskell

Lazy + Purity = Great Success

Erlang

(Mostly) Strict + Impurity = Great Imperative Success Batman

Clojure

Laziness + Side Effects = ???

Types of resource

  • Network handles
  • File handles
  • Pointers (sorta)
  • Anything that uses them

Some types of resource in my database

  • Lucene indexes/writers/readers/queries/etc
  • Handles into LevelDB (iterators, transactions)

The problems

begin

Short lived handles

(defn pony-list []
    (with-open [rdr (reader "/tmp/ponies.txt")]
      (map to-pony (line-seq rdr))))



(pony-list) ;; CRASH
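
The crash comes from laziness: `with-open` closes the reader as soon as its body returns, but `map` only hands back an unrealized sequence, so the remaining lines are read after the handle is gone (typically an IOException reporting a closed stream). A minimal sketch that reproduces this, assuming an in-memory `StringReader` in place of `/tmp/ponies.txt` and a made-up `to-pony`; `lazy-pony-list` and `realization-error` are illustrative names, not from the talk:

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader))

;; Hypothetical stand-in for the talk's to-pony.
(defn to-pony [line] {:name (str/trim line)})

(defn lazy-pony-list [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    ;; Returns an unrealized lazy seq; the reader is closed on the way out.
    (map to-pony (line-seq rdr))))

(defn realization-error
  "Forces the lazy seq and returns the exception it throws, or nil."
  [text]
  (try
    (doall (lazy-pony-list text))
    nil
    (catch Exception e e)))
```

Calling `(realization-error "Twilight\nRarity")` returns the exception rather than a list of ponies, because the second line is requested only after the reader has been closed.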

doall?

 

(defn pony-list []
    (with-open [rdr (reader "/tmp/ponies.txt")]
      (doall (map to-pony (line-seq rdr)))))

doall (etc)

  • Lose beneficial laziness
  • Only works for small collections
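
(defn read-ponies
  ;; Sketch: assumes read-some-results returns a page (a seq of ponies),
  ;; or nil once the reader is exhausted.
  ([reader] (read-ponies reader (read-some-results reader)))
  ([reader page]
   (cond
     ;; No more left: close reader, return empty list
     (nil? page) (do (.close reader) ())
     ;; Need more ponies: read ponies and recurse
     (empty? page) (recur reader (read-some-results reader))
     ;; cons a pony and lazy-seq recurse
     :else (cons (first page)
                 (lazy-seq (read-ponies reader (rest page)))))))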

Generator function of some sort

(Reducers work too)
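
One way the reducer variant could look, as a sketch only: a value that implements `clojure.lang.IReduceInit`, opens the reader when it is reduced, and closes it when the reduction finishes or stops early via `reduced`. The names `pony-reducible`, `open-reader` and `to-pony` are assumptions for illustration, and an in-memory reader stands in for the file.

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader)
        '(clojure.lang IReduceInit))

;; Hypothetical parser standing in for the talk's to-pony.
(defn to-pony [line] {:name (str/trim line)})

(defn pony-reducible
  "Returns a reducible value. open-reader is a zero-arg fn that opens the source."
  [open-reader]
  (reify IReduceInit
    (reduce [_ f init]
      (with-open [^BufferedReader rdr (open-reader)]
        (loop [acc init]
          (if-let [line (.readLine rdr)]
            (let [ret (f acc (to-pony line))]
              (if (reduced? ret)
                @ret          ; consumer stopped early; with-open still closes
                (recur ret)))
            acc))))))

;; In-memory source so the sketch runs without a file on disk.
(def ponies
  (pony-reducible
    #(BufferedReader. (StringReader. "Twilight\nRarity\nApplejack"))))
```

With this shape, `(into [] (map :name) ponies)` reads everything, while `(into [] (comp (map :name) (take 1)) ponies)` stops after the first pony. In both cases the reader is opened and closed inside the reduction, so nothing lazy escapes.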

Generator functions

(etc)

  • Assume you're going to read to the end of the resource
  • Completely opaque to the end-user
  • An awful idea

Don't lie to the consumer

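(defn with-ponies [f]
  (with-open [rdr (reader "/tmp/ponies.txt")]
    ;; f must finish with the ponies before it returns
    (f (map to-pony (line-seq rdr)))))

;; Or hand the consumer something it must close itself.
;; A protocol cannot extend java.io.Closeable, so close is declared here;
;; a deftype/reify can implement Closeable as well to work with with-open.
;; (PonySource is an illustrative name.)
(defprotocol PonySource
  (read-ponies [this])
  (close [this]))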
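
A usage sketch of the callback style, to show what "not lying" means for the caller: the function passed in has to realize everything it needs before it returns, because the reader closes straight afterwards. `with-pony-seq`, `open-reader`, `first-n-names` and `to-pony` are hypothetical names, and the source is an in-memory reader rather than a real file.

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader))

;; Hypothetical parser standing in for the talk's to-pony.
(defn to-pony [line] {:name (str/trim line)})

(defn with-pony-seq
  "Opens the source, passes the lazy seq of ponies to f, then closes.
  open-reader is a zero-arg fn returning a BufferedReader (an assumption)."
  [open-reader f]
  (with-open [rdr (open-reader)]
    (f (map to-pony (line-seq rdr)))))

(defn first-n-names [open-reader n]
  (with-pony-seq open-reader
    ;; into is eager, so the names are realized while the reader is open.
    (fn [ponies] (into [] (comp (map :name) (take n)) ponies))))
```

The consumer can still be lazy inside the callback and read only the first few lines, but the lifetime of the handle is visible in the shape of the API.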

Or

Native Iterators

interface NativeIterator<E> extends Iterator<E>
{
    // Come from Iterator
    boolean hasNext();
    E next();
    void remove();

    // Because something dangerous is down there
    void close();
}

iterator-seq

(defn get-ponies []
  (with-open [iter (create-iterator)]
   (iterator-seq iter)))

(get-ponies) ;; CRASH

Same as short lived handles

(defn create-iter [])
(defn next-pony [iter])
(defn close-iter [iter])

(Or a protocol)

Don't try to hide the resource

Create helper functions and leave it at that
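
A rough sketch of what such helper functions might look like, assuming a fake "native" iterator built from atoms so that it runs anywhere. Unlike the zero-argument `create-iter` above, this one takes the items to iterate over, and `take-ponies` is a made-up caller that owns the iterator's lifetime. The stand-in is not thread-safe.

```clojure
;; Stand-in for a native iterator: remaining items plus an open flag.
(defn create-iter [items]
  {:items (atom (seq items)) :open? (atom true)})

(defn next-pony
  "Returns the next pony, or nil when exhausted. Throws if already closed."
  [{:keys [items open?]}]
  (when-not @open?
    (throw (IllegalStateException. "iterator closed")))
  (let [pony (first @items)]
    (swap! items next)
    pony))

(defn close-iter [{:keys [open?]}]
  (reset! open? false))

(defn take-ponies
  "Caller-owned lifetime: open, read up to n ponies eagerly, always close."
  [items n]
  (let [iter (create-iter items)]
    (try
      (into [] (take-while some?) (repeatedly n #(next-pony iter)))
      (finally (close-iter iter)))))
```

Nothing lazy leaves `take-ponies`, and the `try`/`finally` makes the close explicit instead of hiding it behind a sequence.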

Long lived resources

(and concurrency)

My database has...

Lucene indexes

LevelDB handles

etc

My database has...

More than one concurrent user (hopefully)

What if...

HTTP PUT /indexes/1

HTTP PUT /indexes/1

HTTP DELETE /indexes/1

HTTP DELETE /indexes/2

Solution

lock(database) {
   // Do stuff to database
}

Solution

(agents etc)

(defn create-controlled-index [i]
  (agent (create-index i)))

(defn add-item-to-index [index item]
  (send index add-item-inner item))

(nope nope nope nope nope)

Solution - core.async

Set-up channel for native resource

(defn setup-indexes [db]
  (let [command-channel (chan)]

    ;; Set up a loop with that channel
    (do-indexing-loop command-channel)

    ;; Pass back that channel to consumers for input
    (assoc db :index-commands command-channel)))


;; Provide API over that channel
(defn add-index [cc index]
  ;; >! is only valid inside a go block
  (go (>! cc {:cmd :add-index :data index})))

Loop over incoming commands

(defn go-index-head [_ {:keys [command-channel] :as engine}]
  (debug "being asked to start")
  (go
    (loop [state (initial-state engine)]
      (if-let [{:keys [cmd data]} (<! command-channel)]
        (do
          (debug "handling index loop command" cmd)
          (recur (case cmd
                   :schedule-indexing (main-indexing-process state)
                   :notify-finished-indexing (main-indexing-process-ended state)
                   :new-index (add-chaser state data)
                   :chaser-finished (finish-chaser state data)
                   :storage-request (storage-request state data))))
        (do
          (debug "being asked to shut down")
          (wait-for-main-indexing state)
          (wait-for-chasers state)
          (close-open-indexes state))))))
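
A stripped-down, self-contained sketch of the same pattern, assuming the `org.clojure/core.async` library is on the classpath. The state here is just a set of index ids, and `start-index-loop`, `remove-index` and the `:remove-index` command are invented for illustration; the real loop above manages far more.

```clojure
(require '[clojure.core.async :refer [chan go-loop <! >!! <!! close!]])

(defn start-index-loop
  "Starts a loop that owns the index state. Returns the command channel and
  a channel that yields the final state once the command channel is closed."
  []
  (let [commands (chan)
        done (go-loop [state {:indexes #{}}]
               (if-let [{:keys [cmd data]} (<! commands)]
                 (recur (case cmd
                          :add-index    (update state :indexes conj data)
                          :remove-index (update state :indexes disj data)
                          state))       ; unknown commands are ignored here
                 state))]               ; channel closed: stop, yield the state
    {:commands commands :done done}))

;; Blocking puts, so these are called from ordinary threads, not go blocks.
(defn add-index [commands index]
  (>!! commands {:cmd :add-index :data index}))

(defn remove-index [commands index]
  (>!! commands {:cmd :remove-index :data index}))

(defn demo []
  (let [{:keys [commands done]} (start-index-loop)]
    (add-index commands 1)
    (add-index commands 2)
    (remove-index commands 1)
    (close! commands)
    (<!! done)))
```

Because only the loop touches the state, the concurrent PUT and DELETE requests from earlier are serialized simply by arriving on one channel. Closing the channel is the shutdown signal, which is where the real loop waits for outstanding work and closes the open indexes.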

Wait - I've seen that before

handle_info({ add_index, Index }, State) ->
   add_index(Index, State);

handle_info({ remove_index, Index }, State) ->
   remove_index(Index, State);

handle_info({ flush_index, Index }, State) ->
   flush_index(Index, State);

So... do it like Erlang

 

(except without supervisors, processes, useful error messages, boundaries, messaging)

Managing concurrency over native resources

Use Erlang

The end

Resource management in Clojure

By Rob Ashton
