Monoids In Python

The Difficulty With Math


  • There is a dense language surrounding math, so a single search for a term can lead to many follow-up searches (e.g. functor & homomorphism)
  • Many symbols defy a simple Google search: ⊕
  • The application of what you're reading to CS is often unstated and unclear
  • I can't fix that for all of math, but I can try to elucidate this for monoids

Let's Look at Some Functions

  • 1 + 1 = 2
  • "Hello " + "World" = "Hello World"
  • shopping_cart_items.add_items(other_shopping_cart) = more_shopping_cart_items 


Let's think about the Types here

  • int + int = int
  • string + string = string
  • shopping_cart  (add_items) shopping_cart = shopping_cart

First requirement of Monoids

A monoid's function takes two values of type X and returns a value of type X!

(The fancy math word is closure, because we really needed another thing in CS called closure...)

  • 5 / 2 = 2.5 (int / int = float, fails)
  • true AND false = false (bool AND bool = bool, closure)
  • 2 == 3 = false (int == int = bool, fails)
  • shopping_cart.calculateShipping() = 15.23 (shopping_cart calculateShipping = float, fails)
  • [1, 2, "foo"] + ["bar", 3, 4] = [1, 2, "foo", "bar", 3, 4] (list + list = list, closure)
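A rough way to spot-check the examples above in Python. `is_closed` is a helper name made up for this sketch, not a library function, and one sample pair is only a hint, not a proof that a whole type is closed. The `/` result assumes Python 3, where dividing two ints gives a float.

```python
def is_closed(op, a, b):
    """Spot check: does op(a, b) have the same type as a and b?"""
    return type(a) is type(b) is type(op(a, b))

print(is_closed(lambda x, y: x + y, 1, 1))                    # True:  int + int = int
print(is_closed(lambda x, y: x / y, 5, 2))                    # False: int / int = float
print(is_closed(lambda x, y: x and y, True, False))           # True:  bool AND bool = bool
print(is_closed(lambda x, y: x == y, 2, 3))                   # False: int == int = bool
print(is_closed(lambda x, y: x + y, [1, 2, "foo"], ["bar"]))  # True:  list + list = list
```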

Why do we care about closure?

You can chain things together!

(you've probably been doing this for a long time)

For strings!

connection_string = host + ":" + port + "/" + resource_path


For fluent APIs!

email_to_send.set_to(['foo@foo.com']).set_from('bar@bar.com')


For IF control flow!

if x == True or y == True: ...

Let's look at some more equations!


  • 1 + 0 = 1
  • [1, 2] + [] = [1, 2]
  • shopping_cart.add_items(empty_cart) = shopping_cart


Think about the general rule

  • Int + nothing = the same int
  • List + nothing = the same list
  • shopping_cart add_items nothing = the same shopping_cart

Second requirement of Monoids

We need to have a concept of nothing!

Or stated more like a math person, 'we need something to give our function, in addition to a "real" argument, that will always give back the same "real" argument' (called identity)


  • x AND true = x (identity)
  • set1 & set2 = set3 (the identity would be a set containing everything, which Python doesn't have, fails)
  • 1 * x = x (identity)
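The identity rule can be spot-checked the same way. This is a minimal sketch: `is_identity` is an invented helper name, and it only tests the sample values you hand it. It checks both sides, since the identity has to work whether it comes first or second.

```python
def is_identity(op, candidate, samples):
    """Spot check: combining with candidate leaves every sample unchanged."""
    return all(op(x, candidate) == x and op(candidate, x) == x for x in samples)

print(is_identity(lambda x, y: x + y, 0, [1, -7, 42]))         # True: x + 0 = x
print(is_identity(lambda x, y: x * y, 1, [1, -7, 42]))         # True: 1 * x = x
print(is_identity(lambda x, y: x and y, True, [True, False]))  # True: x AND true = x
print(is_identity(lambda x, y: x + y, [], [[1, 2], []]))       # True: list + [] = list
print(is_identity(lambda x, y: x & y, set(), [{1, 2}]))        # False: the empty set wipes out the other set
```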

Why do we care about identity?

A lot of annoying problems become easy

(you've probably been doing this already)

  • Problem: given a list of bools, are they all true?

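STARTING_VALUE = True  # the identity for "and": True and x == x

x = [True, True, False]
def all_true(bool_list):
    # Start from the identity so the starting value never changes the answer
    running_value = STARTING_VALUE
    for cur_bool in bool_list:
        running_value = running_value and cur_bool
    return running_value
print(all_true(x))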
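The same loop works for any monoid once the starting value is that monoid's identity. Here is a sketch of the general version; `fold` is a name chosen for illustration. The empty list is where the identity earns its keep: with nothing to combine, you simply get the identity back.

```python
def fold(op, identity, items):
    """Combine items with op, starting from the identity."""
    running_value = identity
    for item in items:
        running_value = op(running_value, item)
    return running_value

print(fold(lambda a, b: a and b, True, [True, True, False]))  # False
print(fold(lambda a, b: a and b, True, []))                   # True (nothing was false)
print(fold(lambda a, b: a + b, 0, [1, 2, 3]))                 # 6
print(fold(lambda a, b: a + b, "", ["Hello ", "World"]))      # Hello World
```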

Last set of equations to consider!

  • 1 + (2 + 3) = (1 + 2) + 3
  • [1, 2, 3] + ([4, 5] + [6, 7]) = ([1, 2, 3] + [4, 5]) + [6, 7]
  • cart_1.add_items(cart_2.add_items(cart_3)) = cart_1.add_items(cart_2).add_items(cart_3)



Third requirement of Monoids

We can group the operations however we want (associativity)

  • 2 * 3 * 4 = 2 * (3 * 4)     associative!
  • 2 / 3 / 4 != 2 / (3 / 4)    fails!
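And a spot check for associativity. As before, `is_associative` is a made-up helper, and one triple of values can show that a function fails, but it can't prove that it always passes.

```python
def is_associative(op, a, b, c):
    """Spot check: grouping left-first and right-first give the same answer."""
    return op(op(a, b), c) == op(a, op(b, c))

print(is_associative(lambda x, y: x * y, 2, 3, 4))           # True:  24 either way
print(is_associative(lambda x, y: x / y, 2, 3, 4))           # False: about 0.17 vs about 2.67
print(is_associative(lambda x, y: x - y, 1, 2, 3))           # False: -4 vs 2
print(is_associative(lambda x, y: x + y, [1], [2, 3], [4]))  # True:  [1, 2, 3, 4] either way
```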

Why do we care about Associativity?

Huge jobs can become many small jobs

(you probably already do this)

  • Count words in a book

- make it smaller, count one page at a time

- make it parallel, you count page 1, I count page 2

- make it incremental, I counted pages 1-5 today, tomorrow I can count page 6 and not recount 1-5
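The word-count idea can be sketched with `collections.Counter` from the standard library, which adds counts together with `+`. The three tiny "pages" are invented sample data, and `count_page` is just a name for this example.

```python
from collections import Counter

pages = ["the cat sat", "the dog sat", "a cat ran"]  # made-up book

def count_page(text):
    """Smaller: count the words on one page."""
    return Counter(text.split())

counts = [count_page(page) for page in pages]

# Parallel: it doesn't matter who combines which pages first
left_first = (counts[0] + counts[1]) + counts[2]
right_first = counts[0] + (counts[1] + counts[2])

# Incremental: keep yesterday's total, add only the new page
so_far = counts[0] + counts[1]
updated = so_far + counts[2]

# Same answer as counting the whole book in one go
whole = count_page(" ".join(pages))
print(left_first == right_first == updated == whole)  # True
print(whole["the"], whole["dog"])                     # 2 1
```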

That's what a monoid is

  1. Closure, i.e. takes type X, returns type X
  2. Identity, i.e. has something like a "zero"
  3. Associativity, i.e. I can group the function however I want


                     ~A monoid is both a function and a type~
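Here's how the shopping cart from the earlier slides might look with all three rules checked. This `ShoppingCart` class is a guess at a minimal cart (just a tuple of item names); the slides never define one, so treat every name here as hypothetical.

```python
class ShoppingCart(object):
    """Hypothetical cart: a thin wrapper around a tuple of item names."""

    def __init__(self, items=()):
        self.items = tuple(items)

    def add_items(self, other):
        # Returns a new cart instead of changing this one
        return ShoppingCart(self.items + other.items)

    def __eq__(self, other):
        return isinstance(other, ShoppingCart) and self.items == other.items

EMPTY_CART = ShoppingCart()  # our "zero"

cart_1 = ShoppingCart(["apple"])
cart_2 = ShoppingCart(["bread", "milk"])
cart_3 = ShoppingCart(["soap"])

# 1. Closure: cart add_items cart = cart
closure = isinstance(cart_1.add_items(cart_2), ShoppingCart)

# 2. Identity: adding the empty cart changes nothing, on either side
identity = (cart_1.add_items(EMPTY_CART) == cart_1
            and EMPTY_CART.add_items(cart_1) == cart_1)

# 3. Associativity: grouping doesn't matter
associative = (cart_1.add_items(cart_2).add_items(cart_3)
               == cart_1.add_items(cart_2.add_items(cart_3)))

print(closure, identity, associative)  # True True True
```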



**So much talk about things I already do, Phil! Excite me!**

Map-Reduce

What is reduce?

Reduce is like an aggregator: it takes a function and an iterable, then accumulates the iterable using the function. Example:

from functools import reduce  # built in on Python 2, imported on Python 3

reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])

Many implementations evaluate it like this: ((((1+2)+3)+4)+5)
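To see that left-to-right grouping for yourself, you can log each call that `functools.reduce` makes. `traced_add` and `steps` are names invented for this sketch.

```python
from functools import reduce

steps = []

def traced_add(x, y):
    """Add two numbers and remember what we were asked to add."""
    steps.append("({} + {})".format(x, y))
    return x + y

total = reduce(traced_add, [1, 2, 3, 4, 5])

print(total)  # 15
print(steps)  # ['(1 + 2)', '(3 + 3)', '(6 + 4)', '(10 + 5)']
```

Each step feeds the running total in as the left argument, which is exactly the ((((1+2)+3)+4)+5) shape.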


So if we have the problem "Sum all prices in our shopping cart", we wonder, "Can we use the reduce function and think with monoids?!"

Map-Reduce (2)

shopping_cart = [1.12, 5.23, 9.99, 62.11, 12.12]

Rule #1 (closure) We can chain safely!

1.12 + 5.23 + 9.99 + 62.11 + 12.12

Rule #2  (identity)

We don't need it yet

Rule #3 (associativity)

((((1.12 + 5.23) + 9.99) + 62.11) + 12.12)

So we know we can use reduce here, because we satisfy our monoid properties.

Map-Reduce (3)

But wait, there's more!

We've told our library that we want to reduce, not *how* to reduce. What if 1.12 + 5.23 + 9.99 + 62.11 + 12.12 could be run on many servers?

Rule #3: 1.12 + 5.23 + 9.99 + 62.11 + 12.12 = (1.12 + 5.23) + (9.99 + 62.11) + 12.12

Server A: 1.12 + 5.23

Server B: 9.99 + 62.11

Server C: 12.12 + 0 <-- Rule #2 helps us :)
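A sketch of the server split, with plain lists standing in for servers (no real networking or parallelism here, and `reduce_chunk` is a made-up name). A fourth "server" gets no prices at all, and the identity keeps it from blowing up. One caveat worth knowing: floats are only approximately associative, so the regrouped total is compared with a small tolerance rather than `==`.

```python
from functools import reduce

IDENTITY = 0.0  # Rule #2: the "nothing" for addition

def add(x, y):
    return x + y

def reduce_chunk(prices):
    """What each pretend server does with its share of the prices."""
    return reduce(add, prices, IDENTITY)

prices = [1.12, 5.23, 9.99, 62.11, 12.12]
chunks = [prices[0:2], prices[2:4], prices[4:], []]  # servers A, B, C, and an idle D

partials = [reduce_chunk(chunk) for chunk in chunks]
total = reduce(add, partials, IDENTITY)

sequential = reduce(add, prices, IDENTITY)
print(partials[3])                     # 0.0 (the idle server returns the identity)
print(abs(total - sequential) < 1e-9)  # True
```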

Map-Reduce (4)

That's great but I have a list of shopping cart items, not floats!

The map function is our friend here: it takes each object in our collection and runs a function on it. We can use it to TRANSFORM OUR BORING OBJECTS INTO MONOIDS!!!!!

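from functools import reduce  # needed on Python 3

class shopping_item(object):
    def __init__(self, price):
        self.price = price

cart = [shopping_item(1.12), shopping_item(5.23), shopping_item(9.99)]
# Map: turn each item into its price (a float, which is a monoid under +)
cart_float = map(lambda shop_item: shop_item.price, cart)

# Reduce: add the prices together
total = reduce(lambda x, y: x + y, cart_float)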
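One gap in the map-then-reduce code above: `reduce` with no starting value raises a `TypeError` on an empty cart. Passing the identity as the third argument fixes that. This is a sketch; `cart_total` is a name made up for it.

```python
from functools import reduce

class shopping_item(object):
    def __init__(self, price):
        self.price = price

def cart_total(cart):
    """Map items to prices, then reduce with 0.0 as the starting value."""
    prices = map(lambda shop_item: shop_item.price, cart)
    return reduce(lambda x, y: x + y, prices, 0.0)

cart = [shopping_item(1.12), shopping_item(5.23), shopping_item(9.99)]

print(round(cart_total(cart), 2))  # 16.34
print(cart_total([]))              # 0.0 (an empty cart costs nothing)
```

For plain addition, the built-in `sum(item.price for item in cart)` does the same job, since it starts from 0.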

Parsing

Special Thanks


My coworker Matthew Wampler-Doty, who was so excited that I was going to give a talk on monoids in Python that he started e-mailing me examples he wrote.
