Priyatam Mudivarti
What if we stop writing functions and start defining a process?
- synchronous
- complects error handling with logic
- hard to pass shared state
- preserving the structure of the input is often desirable
- ... too many functions!
If you can express your process in terms of a reducing function, you are in business. — Rich Hickey
This creation of logic via composition of sequence-iterating functions complects the created logic with the machinery of sequences. — Rich Hickey
;; a lot faster than you think because transducers make it so!
(defn letter-frequency [input]
  (transduce (comp (filter #(Character/isLetter %))
                   (map #(Character/toLowerCase %))
                   (map (fn [v] {v 1})))
             (partial merge-with +)
             input))
;; Beautiful. Also slow. This is slideware!
(def max-value
  (comp first first reverse (partial sort-by last)))
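Composed point-free like this, `max-value` pulls the key with the largest value out of a frequency map such as the one `letter-frequency` builds. A quick REPL check (my example, not from the slides):

```clojure
;; sort-by last orders the entries by count, reverse puts the largest
;; first, and the two firsts peel off [key count] -> key.
(max-value {\h 1 \e 1 \l 2 \o 1})
;; => \l
```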
(defn transync [filenames]
  (let [result (map (comp max-value
                          letter-frequency
                          slurp)
                    filenames)]
    ;; No cheating. Actually do the work.
    (doall result)
    result))
Channels are oriented towards the flow aspects of a system. They are oblivious to the nature of the threads which use them.
;; Channels play nicely with transducers!
(let [channel (chan 1 (map #(.toUpperCase %)))]
  (async/onto-chan channel ["hello" "clojure" "remote"] true)
  (async/go-loop [cur (<! channel)]
    (println cur)
    (when-let [n (<! channel)]
      (recur n))))
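The examples that follow call a `print-all-channel!` helper that never appears on the slides; a minimal sketch of what it presumably does (my assumption, not the library's actual implementation) could be:

```clojure
(defn print-all-channel!
  "Drain ch in a go-loop, printing every value until the channel closes."
  [ch]
  (async/go-loop []
    (when-some [v (async/<! ch)]
      (println v)
      (recur))))
```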
;; Connecting 2 channels with an async process that includes
;; a transducer.
(let [input1 (chan)
      output1 (chan)]
  (async/pipeline 1 output1 (map inc) input1)
  (async/onto-chan input1 (range 0 10))
  (print-all-channel! output1))
(let [input0 (chan)
      middle1 (chan)
      middle2 (chan)
      output3 (chan)]
  (async/pipeline 1 middle1 (map inc) input0)
  (async/pipeline 1 middle2 (map (partial * 10)) middle1)
  (async/pipeline 1 output3
                  (map #(str % " zillion channel declarations!"))
                  middle2)
  (async/onto-chan input0 (range 0 5))
  (print-all-channel! output3))
(let [input (chan)]
  (print-all-channel! (==> [input]
                           (map inc)
                           (map (partial * 10))
                           (map #(str % " zillion fewer channel decs!"))))
  (async/onto-chan input (range 0 3)))
introducing ==>
connecting async pipelines
[diagram: the pipeline as a stack of frames, each transitioning the message from one state to the next ({:from :to}); the original message {...} is pushed on entry and popped on exit]
async.task
async task = core.async-pipeline + (magic)
an idempotent process
[diagram: a chain of async.task stages, each marked :block or :drive, with a parallelism number per stage (10 = no. of concurrent threads)]
(defn mapduce
  "Make a map transducer from a bare function. It extracts
  a value from fromkey and inserts the generated value into tokey.
  If no tokey is provided, the function will update the value
  in fromkey."
  ([fromkey f] (mapduce fromkey fromkey f))
  ([fromkey tokey f]
   (map (fn [m] (assoc m tokey (f (fromkey m))))))
  ([context fromkey tokey f]
   (map (fn [m] (assoc m tokey (f (fromkey m) context))))))
(defn process-events
  [context in-chan out-chan signal-chan broadcast-chan err-chan]
  (==> [in-chan err-chan]
       (map (partial push-task ::original))
       (mapduce-in [::original :body] [:raw-bytes-saved] parse-msg)
       :block (mapduce :raw-bytes-saved :raw-bytes-loaded #(load-raw-bytes [context %]))
       (add-side-effect #(log/info "LOADED RAW BYTES" (count (:raw-bytes-loaded %))))
       :block (mapduce :raw-bytes-loaded :report-data-uploaded upload-report-data)
       :block (mapduce :report-data-uploaded :report-validated validate-report)
       (add-side-effect (partial dynamo/append))
       (map (partial pop-task ::original))
       :drive out-chan))
a process transitioning states
- Durable messages
- Eventual consistency
- At-least-once and exactly-once semantics
- Timeout, retention, delay
- Maps nicely to N processes with M channels
- Visually inspect messages in queue
- Logging
- Monitoring
- Dead letter queues
- SDKs across languages (the Java SDK is async)
- Provisioning
I don't want to write this!
manage processes
[diagram: a dispatcher maps one SQS queue onto N processes; process 1 runs load data -> validate data -> process data -> store datalog, process 2 runs load user info -> validate user actions -> process events -> persist into db; each process gets in, out, err, sig, and bro channels, with stages marked :block or :async, per-stage parallelism (10, 20), and signal? (y/n) deciding routing until done]
autoclose SQS messages after a successful process across the stack with :drive
(defn ->scheduler-config-chan
  "Create a scheduler chan based on the pre-configured scheduler. Returns a
  channel that responds to the scheduled times."
  [{:keys [broadcast-chan scheduler-config] :as context}]
  (log/local "Evaluating scheduler config" scheduler-config)
  (s/validate Config scheduler-config)
  (when-let [batch-type (:batch-type scheduler-config)]
    (condp = batch-type
      :broadcast (async/chan) ;; will be managed by app.core/broadcast
      :continuous (arm/schedule
                   (periodic/periodic-seq (time/now) (time/seconds 1)))
      :polling (arm/schedule
                (periodic/periodic-seq (time/now)
                                       (time/minutes (:poll-in-minutes scheduler-config))))
      :cron (arm/schedule
             (end-of-business-day (t/number->date (:schedule-at scheduler-config))))
      :manual (async/chan))))
(defn schedule-task
  "Start a periodic batch job based on a scheduler config."
  [{:keys [db-config scheduler-config-chan signal-chan] :as context}]
  (go-loop [_ (<! scheduler-config-chan)]
    (start-batch context)
    (when-let [next (<! scheduler-config-chan)]
      (recur next))))
(let [context { ... }]
  (->scheduler-config-chan context)
  (schedule-task context))
works great with "at-least-once" semantics in SQS!
(def PipelineMessage
  {:task-type Str
   :timestamp Num
   :payload {:id Num
             :amount Num
             :uid Num}
   :meta {Any Any}
   (s/optional-key :errors) [PipelineError]})
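A message that satisfies this schema might look like the following (the field values here are invented for illustration):

```clojure
(s/validate PipelineMessage
            {:task-type "upload-report"
             :timestamp 1472688000
             :payload {:id 42
                       :amount 1999
                       :uid 7}
             :meta {:source "sqs"}})
```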
(defmacro ==> [[input-channel error-channel] & raw-command-forms]
  (when (empty? raw-command-forms)
    (extract-command-exception ["EMPTY-START"]))
  (let [grouped-command-forms (extract-commands raw-command-forms)
        chan-seq-name (gensym "pipeline-channel-source")
        first-drive-cmd (first (filter #(= (get-cmd-type %) :drive)
                                       grouped-command-forms))
        is-driving? (not (nil? first-drive-cmd))
        has-error-channel? (not (nil? error-channel))
        error-channel-name (gensym "pipeline-error-channel")]
    `(let [~chan-seq-name (concat (list ~input-channel) (repeatedly async/chan))
           ~error-channel-name ~(or error-channel `(async/chan))]
       ~@(commands->pipeline-forms chan-seq-name
                                   has-error-channel?
                                   error-channel-name
                                   grouped-command-forms)
       ~(if is-driving?
          (last first-drive-cmd)
          `(nth ~chan-seq-name ~(count grouped-command-forms))))))
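Reading the macro above, a two-stage call like `(==> [in err] (map inc) :drive out)` should expand roughly as follows. This is a hand expansion with the gensyms renamed for readability; the exact arity of `armature.core/drive-helper` is my assumption, mirroring `async/pipeline`'s:

```clojure
(let [chans (concat (list in) (repeatedly async/chan))
      err-chan err]
  ;; stage 0: a plain transducing pipeline; errors go to err-chan
  (async/pipeline 1 (nth chans 1) (map inc) (nth chans 0) true
                  (fn [e]
                    (async/go (async/>! err-chan
                                        {:error e :index 0 :command '(map inc)}))
                    nil))
  ;; stage 1: the :drive tail, dispatched to drive-helper with the
  ;; user-supplied out channel as its command
  (armature.core/drive-helper 1 (nth chans 2) out (nth chans 1) true
                              (fn [e]
                                (async/go (async/>! err-chan
                                                    {:error e :index 1 :command 'out}))
                                nil))
  ;; because the last command is :drive, ==> returns its channel
  out)
```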
(defn- command->pipeline-form
  [channel-seq-name error-channel? error-channel-name idx cmd-forms]
  (let [cmd-type (get-cmd-type cmd-forms)
        parallelism (or (some (fn [x] (when (number? x) x)) cmd-forms) 1)
        dispatch (case cmd-type
                   :xduce 'async/pipeline
                   :block 'async/pipeline-blocking
                   :drive 'armature.core/drive-helper
                   :async 'armature.core/pipeline-async-helper)
        input-chan-form `(nth ~channel-seq-name ~idx)
        output-chan-form `(nth ~channel-seq-name ~(inc idx))
        cmd (last cmd-forms)]
    (if error-channel?
      `(~dispatch ~parallelism ~output-chan-form ~cmd ~input-chan-form true
        ;; This is the error handler; a bit ugly but helpful
        (fn [e#] (async/go (async/>! ~error-channel-name
                                     {:error e# :index ~idx :command (quote ~cmd)})) nil))
      `(~dispatch ~parallelism ~output-chan-form ~cmd ~input-chan-form))))
(defn- commands->pipeline-forms
  [channel-seq-name error-channel? error-channel-name cmd-forms-list]
  (let [first-drive-cmd (first (filter #(= (get-cmd-type %) :drive)
                                       cmd-forms-list))]
    (when (and first-drive-cmd (not= first-drive-cmd (last cmd-forms-list)))
      (throw (Exception. ":drive directive makes no sense except in the tail of ==>")))
    (map-indexed (partial command->pipeline-form channel-seq-name error-channel?
                          error-channel-name)
                 cmd-forms-list)))
(defn start-queue-consumer!
  "Start a SQS consumer and return event, error, and finalize channels."
  [connection queue-url {:keys [max-consumption-window
                                long-poll-duration
                                stop-check-fn]
                         :or {max-consumption-window 20
                              long-poll-duration 20
                              stop-check-fn (fn [] false)}}]
  (let [^AmazonSQSAsyncClient instance (:instance connection)
        msg-request (receive-msg-request queue-url long-poll-duration)
        raw-result-chan (chan max-consumption-window)
        error-chan (chan)
        events-chan (chan (* 10 max-consumption-window))
        finalizer-chan (chan (* 10 max-consumption-window))
        handler (arm-aws/respond-with-channels raw-result-chan error-chan)
        rescheduler (partial reschedule-consumption
                             instance
                             handler
                             msg-request
                             stop-check-fn)
        channel-scrubber-xf (comp (map rescheduler)
                                  (mapcat unbundle-message-result)
                                  (map (partial uncrack-message queue-url)))]
    ;; Schedule async processing
    (async/pipeline 1 events-chan channel-scrubber-xf raw-result-chan)
    (arm/sink! (partial delete-message! instance) finalizer-chan error-chan)
    ;; Call once to "kickstart" the pipeline
    (rescheduler ::nonce)
    {:event-channel events-chan
     :error-channel error-chan
     :finalize-channel finalizer-chan}))
(defn start-queue-writer!
  "Start an SQS writer and return write, error, and result channels. Each
  incoming message should be a map whose payload is stored under the
  :armature key."
  [{:keys [instance] :as connection} queue-url
   {:keys [parallelism
           max-consumption-window]
    :or {parallelism 1
         max-consumption-window 20}}]
  (let [input-chan (chan max-consumption-window)
        output-chan (chan)
        error-chan (chan)
        handler (arm-aws/respond-with-error-channel error-chan)
        writer (partial write-to-queue instance queue-url handler)
        writer-xf (mapduce :armature :armature-task writer)]
    (async/pipeline parallelism output-chan writer-xf input-chan)
    (arm/sink! identity output-chan)
    {:write-channel input-chan
     :error-channel error-chan
     :result-channel output-chan}))
component/Lifecycle
(start [self]
  (if-not queue-url
    (throw+ {:type ::bad-config
             :message "Could not connect to queue"})
    (let [global-lock (when writes-enabled?
                        (let [lock (zookeeper/interprocess-write-lock "/app/service-lock")]
                          ;; block until a lock can be acquired (we want only one consumer)
                          (while (not (zookeeper/acquire-lock-with-millis-timeout lock 250))
                            (log/info "Tried to acquire lock but failed. Trying again in 30s")
                            (Thread/sleep (* 30 1000)))
                          lock))
          context {:access-key access-key
                   :secret-key secret-key
                   :queue-url queue-url
                   :broadcast-queue-url broadcast-queue-url
                   :scheduler-config scheduler-config
                   :consumer? consumer?
                   :global-lock global-lock}
          {:keys [service-chan
                  signal-chan
                  broadcast-chan
                  scheduler-config-chan]} (reconciler/start-service! context)
          full-config (assoc self
                             :context context
                             :service-chan service-chan
                             :signal-chan signal-chan
                             :broadcast-chan broadcast-chan
                             :scheduler-config-chan scheduler-config-chan)]
      (log/info "Component is listening at" queue-url)
      full-config)))
a distributed async pipeline stack
[diagram: a component and dispatcher wire SQS, a scheduler, and the REPL to process-1, process-2, and process-3 over >signal and >broadcast channels, with output flowing back to SQS]
each process accepts five chans: in, out, err, sig, bro
- manage channels over async pipelines
- use signals to compose pipelines
- use broadcast for external communication
- map 1 SQS queue to N processes
- each process takes 5 channels: in, out, err, sig, bro
- manage errors via a global error-chan
- build idempotent pipelines with unique msg ids
- weigh side effects against pure async tasks
- learn stack-oriented programming!
- Introduction by Rich Hickey: https://www.infoq.com/presentations/clojure-core-async
- Timothy Baldridge's core.async walkthrough: https://www.youtube.com/watch?v=enwIIGzhahw
- David Nolen on core.async in ClojureScript: https://www.youtube.com/watch?v=AhxcGGeh5ho
- Clojure for the Brave and True, core.async chapter: http://www.braveclojure.com/core-async/
Thanks to Dave Fayram for helping me build this library.
armature and robot parts images from adafruit.com
other images sourced from google image search, copyright by respective owners:
https://www.upuno.com/upuno-web2/wp-content/uploads/2015/03/lisa-L-armature-product-img-1.jpg
http://sculptures.website/wp-content/uploads/2016/12/armature-sculpture.jpg
http://electriciantraining.tpub.com/14177/img/14177_60_2.jpg
http://www.kineticarmatures.com/images/custom%20sleepy%20bear.jpg
@priyatam | priyatam.com