Monoids In Python
The Difficulty With Math
- There is a dense language surrounding math, so a single search for a term can lead to many follow-up searches (ex: functor & homomorphism)
- Many symbols defy a simple Google search (ex: ⊕)
- The application of what you're reading to CS is often unstated and unclear
- I can't fix that for all of math, but I can try to elucidate this for monoids
Let's Look at Some Functions
- 1 + 1 = 2
- "Hello " + "World" = "Hello World"
- shopping_cart_items.add_items(other_shopping_cart) = more_shopping_cart_items
Let's think about the Types here
- int + int = int
- string + string = string
- shopping_cart (add_items) shopping_cart = shopping_cart
First requirement of Monoids
A monoid takes type X and returns type X!
(fancy math word is closure, because we really needed another thing in CS called closure....)
- 5 / 2 = 2.5 (int / int = float, fails)
- true AND false = false (bool AND bool = bool, closure)
- 2 == 3 = false (int == int = bool, fails)
- shopping_cart.calculateShipping() = 15.23 (shopping_cart calculateShipping = float, fails)
- [1, 2, "foo"] + ["bar", 3, 4] = [1, 2, "foo", "bar", 3, 4] (list + list = list, closure)
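One way to see closure in Python is to compare the types going in with the type coming out. The helper `is_closed_on` below is a hypothetical name made up for this sketch. It only looks at one sample pair of values, so it illustrates the idea rather than proving anything:

```python
import operator

def is_closed_on(op, a, b):
    """Return True if op(a, b) has the same type as both inputs.

    Hypothetical helper for illustration: it checks one sample pair only.
    """
    return type(a) is type(b) is type(op(a, b))

print(is_closed_on(operator.add, 1, 1))               # int + int -> int: True
print(is_closed_on(operator.add, "Hello ", "World"))  # str + str -> str: True
print(is_closed_on(operator.truediv, 5, 2))           # int / int -> float: False
print(is_closed_on(operator.eq, 2, 3))                # int == int -> bool: False
```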
Why do we care about closure?
You can chain things together!
(you've probably been doing this for a long time)
For strings!
connection_string = host + ":" + port + "/" + resource_path
For fluent APIs!
email_to_send.set_to(['foo@foo.com']).set_from('bar@bar.com')
For IF control flow!
if x == True or y == True: ...
Let's look at some more equations!
- 1 + 0 = 1
- [1, 2] + [] = [1, 2]
- shopping_cart.add_items(empty_cart) = shopping_cart
Think about the general rule
- Int + nothing = the same int
- List + nothing = the same list
- shopping_cart add_items nothing = the same shopping_cart
Second requirement of Monoids
We need to have a concept of nothing!
Or stated more like a math person, 'we need something to give our function, in addition to a "real" argument, that will always give back the same "real" argument' (called identity)
- x AND true = x (identity)
- set1 & set2 = set3 (the identity would have to be a set containing everything, which a Python set can't represent, fails)
- 1 * x = x (identity)
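The identity examples above can be spot-checked in a few lines. This is a rough sketch: `is_identity` and `small_universe` are names invented here, and the check only covers the sample values it is given. For set intersection, a candidate identity only works for the sets it fully contains, which is why no ordinary Python set can play that role for every input:

```python
import operator

def is_identity(op, e, samples):
    """True if combining e with each sample, on either side, returns the sample.

    Hypothetical helper for illustration: it only tests the given samples.
    """
    return all(op(x, e) == x and op(e, x) == x for x in samples)

print(is_identity(operator.and_, True, [True, False]))  # x AND true = x: True
print(is_identity(operator.mul, 1, [0, 7, -3]))         # 1 * x = x: True
print(is_identity(operator.add, [], [[1, 2], []]))      # list + [] = list: True

# Set intersection: this candidate works only for its own subsets.
small_universe = {1, 2, 3, 4}
print(is_identity(operator.and_, small_universe, [{1, 2}, {3}]))  # True
print(is_identity(operator.and_, small_universe, [{5}]))          # False
```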
Why do we care about identity?
A lot of annoying problems become easy
(you've probably been doing this already)
- Problem: given a list of bools, are they all true?
x = [True, True, False]
STARTING_VALUE = True  # the identity for "and": (True and b) == b
def all_true(bool_list):
    running_value = STARTING_VALUE
    for cur_bool in bool_list:
        running_value = running_value and cur_bool
    return running_value
print(all_true(x))  # False
Last set of equations to consider!
- 1 + (2 + 3) = 1 + 2 + 3
- [1, 2, 3] + ([4, 5] + [6, 7]) = [1, 2, 3] + [4, 5] + [6, 7]
- cart_1.add_items(cart_2.add_items(cart_3)) =
inter_cart = cart_1.add_items(cart_2)
inter_cart.add_items(cart_3)
Third requirement of Monoids
We can group the operations however we want (associativity)
- 2 * 3 * 4 = 2 * (3 * 4) associative!
- 2 / 3 / 4 != 2 / (3 / 4) fails!
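The two groupings can be compared directly. The helper name `is_associative_on` is made up for this sketch, and it tests a single triple of values, so a True result is only evidence, not a proof:

```python
import operator

def is_associative_on(op, a, b, c):
    """Compare (a op b) op c with a op (b op c) for one sample triple.

    Hypothetical helper for illustration only.
    """
    return op(op(a, b), c) == op(a, op(b, c))

print(is_associative_on(operator.mul, 2, 3, 4))      # True
print(is_associative_on(operator.truediv, 2, 3, 4))  # False
print(is_associative_on(operator.add, [1, 2, 3], [4, 5], [6, 7]))  # True
```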
Why do we care about Associativity?
Huge jobs can become many small jobs
(you probably already do this)
- Count words in a book
- make it smaller, count one page at a time
- make it parallel, you count page 1, I count page 2
- make it incremental, I counted pages 1-5 today, tomorrow I can count page 6 and not recount 1-5
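Here is a small sketch of the word-count idea. The `pages` data and the `count_words` helper are invented for illustration, and "parallel" is only simulated by splitting the list in two:

```python
# Made-up sample book: one string per page.
pages = ["the quick brown fox", "jumps over", "the lazy dog"]

def count_words(page):
    """Count whitespace-separated words on one page."""
    return len(page.split())

# Smaller: count one page at a time.
per_page = [count_words(page) for page in pages]  # [4, 2, 3]
total = sum(per_page)                             # 9

# Parallel: you take page 1, I take the rest, then we add our results.
yours = sum(per_page[:1])
mine = sum(per_page[1:])
print(yours + mine == total)  # True

# Incremental: keep yesterday's running total and only count the new page.
counted_so_far = sum(per_page[:2])         # pages 1-2, counted earlier
counted_so_far += count_words(pages[2])    # page 3, counted today
print(counted_so_far == total)  # True
```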
That's what a monoid is
- Closure, i.e. takes type X, returns type X
- Identity, i.e. has something like a "zero"
- Associativity, i.e. I can group the function however I want
~A monoid is both a function and a type~
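To make "both a function and a type" concrete, here is a minimal sketch that bundles a combining function with its identity value. The `Monoid` class and its `fold` method are hypothetical names for this example, not part of any standard library, and nothing here enforces closure or associativity; that is still on you:

```python
from functools import reduce

class Monoid:
    """Hypothetical bundle of a combining function and its identity value."""

    def __init__(self, combine, identity):
        self.combine = combine
        self.identity = identity

    def fold(self, items):
        # Start from the identity so an empty collection still has an answer.
        return reduce(self.combine, items, self.identity)

int_sum = Monoid(lambda x, y: x + y, 0)
all_true = Monoid(lambda x, y: x and y, True)
list_concat = Monoid(lambda x, y: x + y, [])

print(int_sum.fold([1, 2, 3]))             # 6
print(all_true.fold([True, True, False]))  # False
print(list_concat.fold([[1, 2], [3]]))     # [1, 2, 3]
print(int_sum.fold([]))                    # 0
```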
**So much talk about things I already do Phil! Excite me!**
Map-Reduce
What is reduce?
Reduce is like an aggregator: it takes a function and an iterable, then accumulates the iterable using the function. Example:
from functools import reduce  # needed in Python 3
reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
Many implementations look like this ((((1+2)+3)+4)+5)
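To watch that grouping happen, one trick is to reduce over strings and have the function build up the expression instead of evaluating it. This is just a sketch; the names `grouping` and `total` are made up here. Python's `functools.reduce` folds from the left:

```python
from functools import reduce

# Build the expression text instead of computing it, to expose the grouping.
grouping = reduce(lambda x, y: f"({x}+{y})", ["1", "2", "3", "4", "5"])
print(grouping)  # ((((1+2)+3)+4)+5)

# The same left-to-right fold, this time actually adding.
total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])
print(total)  # 15
```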
So if we have the problem "Sum all prices in our shopping cart", we wonder, "Can we use the reduce function and think with monoids?!"
Map-Reduce (2)
shopping_cart = [1.12, 5.23, 9.99, 62.11, 12.12]
Rule #1 (closure) We can chain safely!
1.12 + 5.23 + 9.99 + 62.11 + 12.12
Rule #2 (identity)
We don't need it yet
Rule #3 (associativity)
((((1.12 + 5.23) + 9.99) + 62.11) + 12.12)
So we know we can use reduce here, because we satisfy our monoid properties.
Map-Reduce (3)
But wait, there's more!
We've told our library that we want to reduce, not *how* to reduce. What if 1.12 + 5.23 + 9.99 + 62.11 + 12.12 could be run on many servers?
Rule #3: 1.12 + 5.23 + 9.99 + 62.11 + 12.12 = (1.12 + 5.23) + (9.99 + 62.11) + 12.12
Server A: 1.12 + 5.23
Server B: 9.99 + 62.11
Server C: 12.12 + 0 <-- Rule #2 helps us :)
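The server split can be imitated on one machine. In this sketch the "servers" are just list slices (an assumption for illustration, no real cluster involved), and an extra empty chunk shows why the identity matters. One caveat: floating-point addition is only approximately associative, so the comparison uses `math.isclose` rather than `==`:

```python
from functools import reduce
import math

prices = [1.12, 5.23, 9.99, 62.11, 12.12]
add = lambda x, y: x + y

# Pretend each chunk lives on its own server; the last server got no work.
chunks = [prices[0:2], prices[2:4], prices[4:], []]

# Each "server" reduces its own chunk, starting from the identity 0.0.
partials = [reduce(add, chunk, 0.0) for chunk in chunks]

# Combine the partial results with the same function.
total = reduce(add, partials, 0.0)

print(partials[-1])                       # 0.0, the idle server's answer
print(math.isclose(total, sum(prices)))   # True
```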
Map-Reduce (4)
That's great but I have a list of shopping cart items, not floats!
The map function is our friend here: it takes each object in our collection and runs a function on it. We can use it to TRANSFORM OUR BORING OBJECTS INTO MONOIDS!!!!!
from functools import reduce  # reduce moved to functools in Python 3

class shopping_item(object):
    def __init__(self, price):
        self.price = price

cart = [shopping_item(1.12), shopping_item(5.23), shopping_item(9.99)]
# map pulls the price (a float) out of each item
cart_float = map(lambda shop_item: shop_item.price, cart)
# reduce adds the floats together
total = reduce(lambda x, y: x + y, cart_float)
Parsing
Special Thanks
Thanks to my coworker Matthew Wampler-Doty, who was so excited that I was going to give a talk on monoids in Python that he started e-mailing me examples he wrote.
Monoids
By Philip Doctor