lunchtime algorithms

Session One, 19 April 2016

what's an algorithm?

A list of steps for a computer to solve a given problem

There are many kinds of algorithms

Take the following problem:

Given a list of numbers, find the pair of numbers that, when multiplied together, return the largest product

the naïve algorithm

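numbers = [2, 5, 1, 10, 12]

def slowPairwiseProduct(a):
    n = len(a)
    result = 0
    # try every pair (i, j) with i < j and keep the largest product seen
    for i in range(0, n):
        for j in range(i+1, n):
            if a[i]*a[j] > result:
                result = a[i]*a[j]
    return result

print(slowPairwiseProduct(numbers))

# 2*5 > 0   => result = 10
# 2*1 < 10  => result stays 10
# 2*10 > 10 => result = 20
# and so on...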

drawbacks

extremely slow! Why?
- for a list of size n, the nested loops check n*(n-1)/2 pairs: only 10 for size 5, but about 5 * 10^9 for size 100,000
- won't scale
- memory use is small (one running result), so the cost is all in time
- duplication of effort if the inner loop starts at 0 instead of i+1: 5 * 2 and 2 * 5
- redundant -- keeps crunching even after the largest product has been found

Most modern computers can execute roughly 10^9 basic operations per second
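To make the scaling concern concrete, here is a rough sketch that counts how many pairs a nested loop with i < j visits for a list of size n. The helper name count_pairs is made up for illustration and is not part of the original slides.

```python
def count_pairs(n):
    """Count the (i, j) pairs with i < j that the nested loops visit."""
    count = 0
    for i in range(0, n):
        for j in range(i + 1, n):
            count += 1
    return count


for n in [5, 10, 100, 1000]:
    # The closed form n*(n-1)/2 should match the loop count.
    print(n, count_pairs(n), n * (n - 1) // 2)
```

At roughly 10^9 basic operations per second, the 5 * 10^9 or so pair checks needed for a list of 100,000 numbers would take several seconds even in the best case.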

Can we be smarter about this?

a basic optimisation

Insight: for non-negative numbers, the greatest product is the product of the two greatest numbers in the list (with negative numbers, two large negatives could win instead)

def fastPairwiseProduct(a):
    n = len(a)

    ultimate_index = -1
    for i in range(0, n):
        if ultimate_index == -1 or a[i] > a[ultimate_index]:
            ultimate_index = i

    penultimate_index = -1
    for j in range(0, n):
        if (j != ultimate_index) and (penultimate_index == -1 or a[j] > a[penultimate_index]):
            penultimate_index = j

    return a[ultimate_index] * a[penultimate_index]
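The two scans above could probably be merged into one. The sketch below is one possible variant, not the version from the talk: it tracks the largest and second-largest values in a single pass. The name one_pass_pairwise_product is hypothetical, and the code assumes at least two elements and no negative numbers.

```python
def one_pass_pairwise_product(a):
    """Product of the two largest values, found in a single scan.

    Assumes len(a) >= 2 and that no number is negative.
    """
    largest = max(a[0], a[1])
    second = min(a[0], a[1])
    for k in range(2, len(a)):
        x = a[k]
        if x > largest:
            # The old maximum becomes the runner-up.
            second = largest
            largest = x
        elif x > second:
            second = x
    return largest * second


print(one_pass_pairwise_product([2, 5, 1, 10, 12]))  # 120
```

Both versions do work proportional to the length of the list, so the gain over the two-scan version is a constant factor at most.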

Standard optimisation techniques generally find and eliminate duplication of effort and cut down on memory usage

But naïve algorithms are still useful!

Stress testing

Generate large, random datasets
Use the naïve algorithm as the baseline
Compare the output of the baseline to the proposed optimisation
If the solutions differ, there is a problem with the optimised version, or the baseline, or both 😱

# a scrappy example of stress-testing

from random import randint

while True:
    # build a random list of 2 to 100 non-negative numbers
    numbers = []
    random_length = randint(2, 100)
    for num in range(random_length):
        numbers.append(randint(0, 10000000))

    outcome1 = slowPairwiseProduct(numbers)
    outcome2 = fastPairwiseProduct(numbers)

    if outcome1 != outcome2:
        print("Error: solutions don't match!")
        print(numbers)  # the failing input, so the bug can be reproduced
        print(outcome1)
        print(outcome2)
        break
    else:
        print("OK ---- " + str(outcome1))
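The loop above runs until it finds a mismatch. A bounded, seeded variant may be easier to rerun; the sketch below is one way to do that. All names here (slow_product, fast_product, stress_test) are stand-ins invented for this example, and fast_product uses sorting rather than the two-scan approach from the slides.

```python
import random


def slow_product(a):
    # Naive baseline: try every pair (i, j) with i < j.
    result = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            result = max(result, a[i] * a[j])
    return result


def fast_product(a):
    # Candidate under test: product of the two largest values (via sorting).
    top_two = sorted(a)[-2:]
    return top_two[0] * top_two[1]


def stress_test(baseline, candidate, trials=200, seed=0):
    """Return the first input on which the two functions disagree, else None."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        length = rng.randint(2, 100)
        numbers = [rng.randint(0, 10000000) for _ in range(length)]
        if baseline(numbers) != candidate(numbers):
            return numbers
    return None


# None means no mismatch was found in the sampled inputs.
print(stress_test(slow_product, fast_product))
```

Returning the failing input, rather than only printing the two outcomes, makes it possible to turn a mismatch into a fixed regression case later.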

Verifying algorithms

Limited manual testing
Try to generate different answers
- Smallest/largest possible outputs
- Null/undefined/divide by 0 errors
Understand time/memory consumption with increasing dataset size
Corner cases -- "degenerate cases", e.g. empty lists or wrong datatypes as inputs
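Degenerate inputs are easy to miss. For instance, fastPairwiseProduct from the earlier slide, given a one-element list such as [7], multiplies the element by itself and returns 49 instead of failing. The sketch below shows one way such checks might look; checked_pairwise_product is a hypothetical wrapper written for this example, and it assumes non-negative integers.

```python
def checked_pairwise_product(a):
    """Product of the two largest values, with input validation.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    if len(a) < 2:
        raise ValueError("need at least two numbers")
    if not all(isinstance(x, int) for x in a):
        raise TypeError("all inputs must be integers")
    ordered = sorted(a)
    return ordered[-1] * ordered[-2]


# Hand-picked cases: smallest input, zeros, a repeated maximum, large values.
cases = [
    ([0, 0], 0),
    ([1, 2], 2),
    ([7, 7, 3], 49),
    ([10**7, 10**7], 10**14),
]
for numbers, expected in cases:
    assert checked_pairwise_product(numbers) == expected

# Degenerate inputs should fail loudly instead of returning a wrong answer.
for bad in ([], [5], [1, "2"]):
    try:
        checked_pairwise_product(bad)
    except (ValueError, TypeError) as err:
        print(repr(bad), "->", type(err).__name__)
```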

some things I learnt

To appreciate algorithms built into modern languages, you have to start writing very imperative code -- no more high-level functions like sort, reduce, filter, etc. 😭
Maths is amazingly useful but you don't need a deep understanding to start writing algorithms. You will spend more time learning the standard optimisation tools
Python is not the worst language ever (C is)

discussion? 🍔🍱🍩☕️

Algo Brownbags #1

By Denise Yu
