Monoids
Functional Programming Ideas and Patterns
This subject is difficult to approach
- The vocabulary of functional programming can be dense and confusing. (things you don't know are defined in terms of other things you don't know)
- The applicability of these abstract concepts is often unstated.
- It's difficult to google the symbols. What does ⊕ mean?
So is this just academic? Why do we care?
- There are many foundational patterns that are used throughout functional programming
- Familiarity allows you to identify these patterns in other code and understand it
- You can incorporate these to create clean functional code
Let's look at some Functions
- (+ 1 1) => 2
- (str "Hello" " world") => "Hello world"
- (concat ['a 'b 'c] ['d 'e 'f]) => ['a 'b 'c 'd 'e 'f]
Let's consider the types here:
- (+ int int) => int
- (str string string) => string
- (concat seq seq) => seq (vectors go in, a lazy seq comes out, and concat accepts that seq right back)
This is the first requirement for monoids
- A monoid's function takes two items of type x and returns an item of type x
- In math this is called "closure", which is super confusing because that word means something different in CS (a CS closure doesn't need 2 arguments)
- (/ 5 2) => 5/2 (/ int int) => ratio ~fails~
- (and true false) => false (and bool bool) => bool ~closure~
- (= 2 3) => false (= int int) => bool ~fails~
- (count [:a :b]) => 2 (count vector) => int ~fails~
- (concat [:a "fo" :b] [:c 4 2]) => (:a "fo" :b :c 4 2) (concat seq seq) => seq ~closure~
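A rough way to poke at closure from the REPL. This is only a sketch: `closed-over?` is a hypothetical helper (not part of clojure.core), and checking one pair of sample values is a spot check, not a proof.

```clojure
;; Hypothetical helper: do both arguments and the result of (f a b)
;; satisfy the same type predicate?
(defn closed-over? [pred f a b]
  (and (pred a)
       (pred b)
       (pred (f a b))))

(closed-over? integer? + 1 1)                ; true
(closed-over? string? str "Hello" " world")  ; true
(closed-over? integer? / 5 2)                ; false, (/ 5 2) is the ratio 5/2
(closed-over? integer? = 2 3)                ; false, = returns a boolean
```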
Why do we care about closures?
You can chain things together easily!
(you've probably been doing this for ages!)
For strings:
(str host ":" port "/" resource)
For "fluent" apis:
myService.enableAuth().startService().debug()
For flow control:
if (x == true and y == true) or .....
More to consider...
(+ 1 0) => 1
(concat [1 2] []) => [1 2]
(str "Hello" "") => "Hello"
Consider in general...
(+ some-int nothing) => some-int
(concat some-list empty-list) => some-list
(str some-string empty-string) => some-string
Second Requirement of Monoids
We need to have a concept of nothing!
Or, stated more like a math person: we need a value that, when given to our function along with an "actual" argument, always gives back that same "actual" argument (this value is called the identity)
- (and x true) => x ~identity~
- (intersection set1 set2) => set3 ~fails~ (universal set doesn't count)
- (* 1 x) => x ~identity~
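Clojure already bakes several of these identities in: calling the function with no arguments hands back its "nothing". A small REPL sketch (the name `built-in-identities` is made up for illustration):

```clojure
;; Calling these with zero arguments returns the identity value.
(def built-in-identities
  {:+      (+)        ; 0
   :*      (*)        ; 1
   :str    (str)      ; ""
   :concat (concat)   ; ()
   :and    (and)})    ; true

;; Combining the identity with any value gives the value back.
(+ 42 (+))        ; 42
(str "Hello" (str)) ; "Hello"
```

Not every function plays along: `max` has no zero-argument arity, so you have to supply an identity yourself.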
Why do we care about Identity?
It solves a lot of irritating little problems.
Problem: find the max value in a collection.
;; naive imperative approach
;; (but that we totally have all seen in the wild)
(defn find-max-int [col]
  (loop [max-value Integer/MIN_VALUE
         [next-int & remaining] col]
    (if (nil? next-int)
      max-value
      (if (> next-int max-value)
        (recur next-int remaining)
        (recur max-value remaining)))))
;; this works only because clj already
;; understands that addition has an identity of zero
(reduce + []) => 0
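For comparison, here is one way the max problem might look once we hand reduce an identity. This is a sketch, not the one true answer: it assumes `Integer/MIN_VALUE` is a good enough "nothing" for the data (Clojure integer literals are longs, so values below it would break that assumption).

```clojure
;; max has no built-in identity, so we supply one as the initial value.
(defn find-max-int [col]
  (reduce max Integer/MIN_VALUE col))

(find-max-int [3 9 2]) ; 9
(find-max-int [])      ; -2147483648, the identity we supplied
```

The empty collection stops being a special case: it just returns the identity.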
Last thing to consider
(= (+ 1 2 3)
   (+ 1 (+ 2 3))) => true
(= (concat (concat [1 2] [3 4]) [5 6])
   (concat [1 2] (concat [3 4] [5 6]))) => true
(= (max 1 2 3)
   (max (max 1 2) 3)) => true
Third requirement of Monoids
We can group things however we want (associativity)
(= (* 2 (* 3 4))
   (* 2 3 4)) ~associative~
(not= (/ 2 (/ 3 4))
      (/ 2 3 4)) ~fails~
Why do we care about Associativity?
- Parallelization
- Divide and conquer algorithms
- Incremental algorithms
Consider counting words in a book
- make it smaller, count one page at a time
- make it parallel, you count page 1, I count page 2
- make it incremental, I counted pages 1-5 today, tomorrow I can count page 6 and not recount 1-5
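The book example could be sketched like this. Everything here is made up for illustration: `pages` is a tiny stand-in for a book, and `count-words` simply counts runs of non-whitespace.

```clojure
;; Hypothetical data: each page is just a string.
(def pages ["the quick brown fox" "jumps over" "the lazy dog"])

;; Count the whitespace-separated words on one page.
(defn count-words [page]
  (count (re-seq #"\S+" page)))

;; Smaller: count one page at a time, then add the counts up.
(def total
  (reduce + 0 (map count-words pages)))            ; 9

;; Incremental: keep yesterday's subtotal, add only the new page today.
(def pages-1-2
  (reduce + 0 (map count-words (take 2 pages))))   ; 6
(def pages-1-3
  (+ pages-1-2 (count-words (nth pages 2))))       ; 9
```

Because + is associative, it should not matter who counts which page or in what grouping; swapping `map` for `pmap` is one way to try the "you count page 1, I count page 2" version.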
~~ALMOST THERE~~
This is never listed as a formal rule, but it's still true: a monoid's function operates on two arguments. It's really impossible to satisfy identity without this being true, but it warrants a glance.
(+ int int) ~closed~ ~associative~ ~identity~ ~2 args~
(inc int) ~closed~ ~not associative~ ~no identity~
Think about Monoids when you're thinking about collections.
Monoids are often a combinatory pattern.
That's what a Monoid is
If we have a function fn that accepts 2 arguments of type x
1) fn is closed over x: (fn x x) => x (closure)
2) There is some value id such that (fn id _) => _ and (fn _ id) => _ (identity)
3) We can group calls together as we choose
(= (fn x1 (fn x2 x3))
(fn (fn x1 x2) x3)) => true (associativity)
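The three rules can be spot-checked in code. This is a sketch only: `monoid-laws-hold?` is a hypothetical helper, and passing for three sample values is evidence, not proof.

```clojure
;; Hypothetical helper: check identity (on both sides) and associativity
;; of f for one candidate identity value and three sample values.
(defn monoid-laws-hold? [f id [a b c]]
  (and (= (f id a) a)                       ; left identity
       (= (f a id) a)                       ; right identity
       (= (f a (f b c)) (f (f a b) c))))    ; associativity

(monoid-laws-hold? + 0 [1 2 3])             ; true
(monoid-laws-hold? str "" ["a" "b" "c"])    ; true
(monoid-laws-hold? concat [] [[1] [2] [3]]) ; true
(monoid-laws-hold? - 0 [1 2 3])             ; false, (- 0 1) is -1
```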
~~~So what! You taught me some math and told me I already know it!
Map-Reduce
Let's start with reduce, what is reduce? (doc reduce)
clojure.core/reduce
([f coll] [f val coll])
f is a "reducing function": it takes 2 arguments and does ~something~ to them. coll is a collection: its items are taken one at a time and supplied to f along with the result of the prior call to f (or an initial starting value val).
(reduce + [1 2 3 4])
(+ (+ (+ 1 2) 3) 4)
(reduce + 0 [1 2 3 4])
(+ (+ (+ (+ 0 1) 2) 3) 4)
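If you want to watch those nested calls happen, `reductions` returns every intermediate result of the same walk. A quick sketch (the var names are just for illustration):

```clojure
;; Each element is the result of one step of the reduce.
(def steps-without-init (reductions + [1 2 3 4]))    ; (1 3 6 10)
(def steps-with-init    (reductions + 0 [1 2 3 4]))  ; (0 1 3 6 10)

;; The last step is what reduce itself returns.
(= (last steps-with-init) (reduce + 0 [1 2 3 4]))    ; true
```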
Map-Reduce : 2
At this point our monoid radar is going crazy!
(+ (+ (+ 1 2) 3) 4)
We see that these are chaining together safely! (closure)
But maybe reduce actually does this:
(+ (+ 1 2) (+ 3 4))
We don't care (associativity): let the language figure it out.
Map-Reduce : 3
But wait, there's more. What if
(+ (+ 1 2) (+ 3 4))
Blue is an operation handled by node 0
Red is an operation handled by node 1
Then we aggregate on node 0
More parallelism thanks to associativity.
And when we have
(+ 1 2 3 4 5) then what?
(+ (+ 1 2) (+ 3 4) (+ 5 ..))
Identity!
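A sketch of that idea in Clojure. The hand-rolled split below is purely illustrative (one "node" per chunk is an assumption, nothing here actually runs on separate machines); `clojure.core.reducers/fold` is the real library function that does this kind of split-and-combine, and it obtains its seed by calling the combining function with no arguments, which for + is the identity 0.

```clojure
(require '[clojure.core.reducers :as r])

;; By hand: split into chunks, reduce each chunk starting from the
;; identity, then combine the partial results.
(def chunks   (partition-all 2 [1 2 3 4 5]))   ; ((1 2) (3 4) (5))
(def partials (map #(reduce + 0 %) chunks))    ; (3 7 5)
(def by-hand  (reduce + 0 partials))           ; 15

;; The leftover 5 is no problem: (+ 0 5) is just 5.

;; Library version: fold may split the work and combine the pieces.
(def folded (r/fold + [1 2 3 4 5]))            ; 15
```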
Map-Reduce : 4
This is awesome, but I don't live in the fantasy land of ints, I live in the horror land of Java interop. I have a collection of:
class CompositeProxyFactoryBeanImpl
public int composedProxyFactoryBeans()
Here Map helps us!
(map #(.composedProxyFactoryBeans %)
     collection-CompositeProxyFactoryBeanImpl)
Map-Reduce : 5
The summary of the pattern:
map the collection to a monoid
reduce the collection
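Put together, the pattern might look like this. To keep the sketch runnable without any Java classes, plain maps with a made-up `:composed` key stand in for the interop objects; with the real class you would map the method call instead of the keyword.

```clojure
;; Hypothetical stand-ins for the Java objects.
(def beans [{:name "a" :composed 2}
            {:name "b" :composed 0}
            {:name "c" :composed 5}])

;; 1) map the collection to a monoid (ints under +)
;; 2) reduce the collection, starting from the identity
(def total (reduce + 0 (map :composed beans)))            ; 7

;; The same two steps fused with transduce.
(def total-transduced (transduce (map :composed) + 0 beans)) ; 7
```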
Parting Thought
If we want to combine 2 functions, we can do this:
(comp fn1 fn2)
If we want to comp something with fn2 and always get back a function that behaves like fn2,
we can use identity
(comp identity fn2) => fn2
Can we use comp/identity as a monoid?
Does it work for any functions or just some?
What does it look like in our code when we do this?
What patterns do we notice?
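One starting point for experimenting with these questions, as a sketch rather than an answer. `compose-all` and `double-it` are hypothetical names; note that functions can't be compared with =, so we can only compare what they return.

```clojure
(defn double-it [n] (* 2 n))

;; Reduce a collection of functions with comp, using identity as the
;; starting value.
(defn compose-all [fs]
  (reduce comp identity fs))

((compose-all [inc double-it]) 5)  ; 11, i.e. (inc (double-it 5))
((compose-all []) 5)               ; 5, nothing to compose, so identity

;; Grouping does not change the behavior for these functions.
(= ((comp inc (comp double-it dec)) 5)
   ((comp (comp inc double-it) dec) 5))  ; true, both give 9
```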
Fn concepts: Monoids
By Philip Doctor