Bloom Filters

Steve V

The Problem

Say your app serves some data based on user keywords. And say that fetching the data is mega-slow.

 

Now what?

I know, I'll use caching!

Great! Things are looking good: you cache the response for each keyword and you go home early, right?

The problem is space, and time

Some time goes by, and you get a visit from your friendly neighbourhood devops guy. And he's angry. Like, find your home address angry.

What happened?

Whichever caching method you used, you might have forgotten that it also takes up space, and since cache invalidation is really hard, you're stuck trading off app performance vs space usage.

Rock

Hard place

A Solution Appears

(after much, much googling, stackoverflow and wikipedia...)

Bloom Filters!!

(you shout, and scare the crap out of your cat)

What are ya?!

A Bloom filter is a data structure that acts pretty similarly to a hashtable that only answers "have I seen this key?", except that it sometimes lies. But it does it with waaaay less space.

 

The lies are called "false positives": the filter can claim it has seen a key it never saw, but it never denies a key it did see.

No, really

https://en.wikipedia.org/wiki/Bloom_filter
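To make that less hand-wavy, here is a minimal sketch of a Bloom filter, not a production implementation. The class name `BloomFilter`, the sizes `num_bits` and `num_hashes`, and the trick of salting SHA-256 with an index to get several hash functions are all assumptions made for illustration; real libraries tune these to a target false-positive rate.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus a handful of hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Both sizes are illustrative guesses, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        # Adding a key just switches its bits on; the key itself is never stored.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


bf = BloomFilter()
bf.add("cats")
print("cats" in bf)  # True: a key that was added is always reported
print("dogs" in bf)  # almost certainly False, but a false positive is possible
```

Notice the space story: 1024 bits is 128 bytes no matter how long the keywords are, because only bits get stored, never the keys.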

The Solution

Using a Bloom filter, you keep track of which keywords have been requested before, and you cache responses only for the keywords that users visit frequently (or semi-frequently), plus the really expensive ones. One-off keywords never take up cache space.
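One way this could look, as a sketch under assumptions rather than the one true recipe: cache a response only when its keyword comes back a second time. Everything named here (`SeenBefore`, `check_and_add`, `slow_fetch`, `get`, the plain dict used as the cache, and the filter sizes) is hypothetical and only there to show the shape of the idea.

```python
import hashlib


class SeenBefore:
    """Tiny Bloom filter that only remembers which keywords were requested."""

    def __init__(self, num_bits=8192, num_hashes=4):
        # Illustrative sizes, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def check_and_add(self, key):
        """Return True if key was probably seen before, then record it."""
        seen = True
        for pos in self._positions(key):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


seen = SeenBefore()
cache = {}
slow_calls = []  # records every trip to the slow backend


def slow_fetch(keyword):
    # Hypothetical stand-in for the mega-slow lookup.
    slow_calls.append(keyword)
    return f"results for {keyword}"


def get(keyword):
    if keyword in cache:
        return cache[keyword]
    response = slow_fetch(keyword)
    # Only pay for cache space once a keyword comes back a second time.
    if seen.check_and_add(keyword):
        cache[keyword] = response
    return response


get("cats")  # first visit: fetched, not cached
get("cats")  # second visit: fetched again, now cached
get("cats")  # third visit: served from the cache
get("dogs")  # one-off keyword: fetched, not cached (barring a false positive)
```

The nice part is that a lie from the filter is cheap here: a false positive only means some keyword gets cached one visit earlier than it deserved.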

FIN

By signupskm
