COMP3010: Algorithm Theory and Design
Daniel Sutantyo, Department of Computing, Macquarie University
9.1 - Probabilistic Analysis
Probabilistic Analysis
- What is it?
- the use of probability theory to analyse the running time of an algorithm
- our main example for the discussion
int max = 0; // assumes every element of arr is positive
for (int i = 0; i < n; i++) {
    if (arr[i] > max) {
        max = arr[i];
    }
}
let M = element 1
for i = 2 to n:
    compare element i with M
    if M is less than element i:
        assign element i to M
- Let us generalise the code above: we see that there are two main operations: comparison and assignment
let M = element 1
for i = 2 to n:
    compare element i with M          // cost is c_c
    if M is less than element i:
        assign element i to M         // cost is c_a
- Let each comparison cost \(c_c\) and each assignment cost \(c_a\)
- What is the complexity of the above algorithm?
- we have to compare every element with \(M\), for a total comparison cost of about \(c_c n\)
- do we also have to do \(n\) assignments, for a total cost of \(c_a n\)?
- we may have to do \(n\) assignments (input sorted in increasing order), but sometimes a single assignment is enough (largest element first)
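The two extremes above can be checked with a small instrumented version of the loop. This is a minimal sketch, not the unit's reference code: the class name `MaxCounter` and the method `countAssignments` are hypothetical, and it assumes the initial `let M = element 1` is counted as the first assignment.

```java
public class MaxCounter {
    // Returns how many times the running maximum is assigned,
    // counting the initial "let M = element 1" as assignment number 1.
    // Assumes arr is non-empty.
    public static int countAssignments(int[] arr) {
        int max = arr[0];
        int assignments = 1;
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] > max) {      // one comparison per remaining element
                max = arr[i];        // assignment only when a new maximum appears
                assignments++;
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        System.out.println(countAssignments(new int[]{1, 2, 3, 4})); // prints 4
        System.out.println(countAssignments(new int[]{4, 1, 2, 3})); // prints 1
        System.out.println(countAssignments(new int[]{2, 4, 1, 3})); // prints 2
    }
}
```

The comparison always runs for every remaining element, while the assignment count depends entirely on the order of the input.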
- In the worst-case scenario, we would have to do one assignment after each comparison, so the total cost is \(n(c_c + c_a)\)
- but it is reasonable to expect that we don't need to do \(n\) assignments on an average input
- let's ignore the comparison cost \(c_c\) since we have to pay it anyway
- what do you think is the average number of assignments?
- most common guess is 2 (well, actually this should be 2.5)
- In order to perform a probabilistic analysis, we need to know the distribution of the input, or at least make some assumptions about it
- to do this properly, we need to understand the different types of probability distributions, which is beyond the scope of this unit
- For this example, let us assume a uniform random permutation, that is, every ordering of the input is equally likely
- How many different inputs are possible?
- there are \(n!\) possible permutations, and each is equally likely
- Do you think half of these permutations require 2 assignment operations?
Sample Space
- Suppose that there are only 4 elements, meaning there is a total of 4! = 24 permutations
1, 2, 3, 4
1, 2, 4, 3
1, 3, 2, 4
1, 3, 4, 2
1, 4, 2, 3
1, 4, 3, 2
2, 1, 3, 4
2, 1, 4, 3
2, 3, 1, 4
2, 3, 4, 1
2, 4, 1, 3
2, 4, 3, 1
3, 1, 2, 4
3, 1, 4, 2
3, 2, 1, 4
3, 2, 4, 1
3, 4, 1, 2
3, 4, 2, 1
4, 1, 2, 3
4, 1, 3, 2
4, 2, 1, 3
4, 2, 3, 1
4, 3, 1, 2
4, 3, 2, 1
- How many of these permutations require us to do 4 assignment operations?
- only one: 1, 2, 3, 4, where every element is greater than all the elements before it
- How many of these permutations require us to do 1 assignment operation?
- whenever 4 is the first element, which happens in \(3! = 6\) of the 24 permutations
- If you thought that the average number of assignments is 2 (or 2.5), do you still think so?
permutation | assignments |
---|---|
1, 2, 3, 4 | 4 |
1, 2, 4, 3 | 3 |
1, 3, 2, 4 | 3 |
1, 3, 4, 2 | 3 |
1, 4, 2, 3 | 2 |
1, 4, 3, 2 | 2 |
2, 1, 3, 4 | 3 |
2, 1, 4, 3 | 2 |
2, 3, 1, 4 | 3 |
2, 3, 4, 1 | 3 |
2, 4, 1, 3 | 2 |
2, 4, 3, 1 | 2 |
3, 1, 2, 4 | 2 |
3, 1, 4, 2 | 2 |
3, 2, 1, 4 | 2 |
3, 2, 4, 1 | 2 |
3, 4, 1, 2 | 2 |
3, 4, 2, 1 | 2 |
4, 1, 2, 3 | 1 |
4, 1, 3, 2 | 1 |
4, 2, 1, 3 | 1 |
4, 2, 3, 1 | 1 |
4, 3, 1, 2 | 1 |
4, 3, 2, 1 | 1 |
- The average is \(50/24 \approx 2.083\) assignments per permutation
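The total of 50 can be reproduced by brute force. The following is a rough sketch that enumerates every permutation of 1..n by recursive swapping and sums the assignment counts. The names `PermutationAverage`, `totalAssignments` and `totalForSize` are hypothetical, and the initial assignment of the first element is assumed to count as one assignment.

```java
public class PermutationAverage {
    // Number of assignments to the running maximum, counting the first element.
    static int countAssignments(int[] arr) {
        int max = arr[0];
        int assignments = 1;
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] > max) {
                max = arr[i];
                assignments++;
            }
        }
        return assignments;
    }

    static void swap(int[] arr, int a, int b) {
        int tmp = arr[a];
        arr[a] = arr[b];
        arr[b] = tmp;
    }

    // Sums countAssignments over every arrangement of arr[k..] with arr[0..k-1] fixed.
    static long totalAssignments(int[] arr, int k) {
        if (k == arr.length) {
            return countAssignments(arr);
        }
        long total = 0;
        for (int i = k; i < arr.length; i++) {
            swap(arr, k, i);                    // choose arr[i] for position k
            total += totalAssignments(arr, k + 1);
            swap(arr, k, i);                    // undo the choice
        }
        return total;
    }

    // Total assignments over all n! permutations of 1..n (n >= 1).
    public static long totalForSize(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) {
            arr[i] = i + 1;
        }
        return totalAssignments(arr, 0);
    }

    public static void main(String[] args) {
        long total = totalForSize(4);
        System.out.println(total);                      // prints 50
        System.out.printf("%.3f%n", total / 24.0);      // prints 2.083
    }
}
```

Brute force is only practical for small n, since the number of permutations grows as n!.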
n | permutations | total assignments | average |
---|---|---|---|
4 | 24 | 50 | 2.083 |
5 | 120 | 274 | 2.283 |
6 | 720 | 1,764 | 2.450 |
7 | 5,040 | 13,068 | 2.593 |
8 | 40,320 | 109,584 | 2.718 |
9 | 362,880 | 1,026,576 | 2.829 |
10 | 3,628,800 | 10,628,640 | 2.929 |
11 | 39,916,800 | 120,543,840 | 3.020 |
12 | 479,001,600 | 1,486,442,880 | 3.103 |
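The totals in the table do not need a full enumeration. One way to obtain them, sketched here under the assumption of a uniform random permutation, is the recurrence \(T(n) = n\,T(n-1) + (n-1)!\): each ordering of the first \(n-1\) positions appears \(n\) times, and the last element is a new maximum in \((n-1)!\) of the \(n!\) permutations. The recurrence is not stated in the slides, and the class and method names are hypothetical.

```java
public class AssignmentTable {
    // Total number of assignments over all n! permutations of n distinct elements,
    // using T(1) = 1 and T(n) = n * T(n-1) + (n-1)!.
    // A long is comfortably large enough for the n <= 12 range of the table.
    public static long totalAssignments(int n) {
        long total = 1;       // T(1)
        long factorial = 1;   // holds (i-1)! at the start of each iteration
        for (int i = 2; i <= n; i++) {
            total = i * total + factorial;
            factorial *= i;   // now holds i!
        }
        return total;
    }

    public static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints n, n!, the total number of assignments and the average, for n = 4..12.
        for (int n = 4; n <= 12; n++) {
            long perms = factorial(n);
            long total = totalAssignments(n);
            System.out.printf("%d | %d | %d | %.3f%n", n, perms, total, (double) total / perms);
        }
    }
}
```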
- The average number of assignments does not seem to grow linearly with \(n\)
- in fact it looks logarithmic, i.e. an expected assignment cost of about \(c_a\log n\)
Calculating the Average Case Complexity
- We need to do one assignment if the element we are inspecting is greater than every element before it
- supposing that the elements are arranged randomly, if you have \(i\) elements, each one of these elements is equally likely to be the greatest element
- the probability of element \(i\) being the greatest of the first \(i\) elements is \(1/i\)
- when you have one element, obviously that is the greatest one
- when you have two elements, each has 1/2 chance of being the greatest element
- when you have three elements, each has 1/3 chance, and so on
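The \(1/i\) claim can be sanity-checked empirically. The sketch below shuffles 1..n many times and measures how often the element at a given position is the largest seen so far. It is only an illustration: the class `PrefixMaxSimulation`, the method `estimate` and its parameters are hypothetical, and a fixed seed is used so that runs are repeatable.

```java
import java.util.Random;

public class PrefixMaxSimulation {
    // Estimates the probability that the element at 1-based position pos
    // is the greatest of the first pos elements of a random permutation of 1..n.
    public static double estimate(int n, int pos, int trials, long seed) {
        Random rng = new Random(seed);
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) {
            arr[i] = i + 1;
        }
        int hits = 0;
        for (int t = 0; t < trials; t++) {
            // Fisher-Yates shuffle: produces a uniform random permutation.
            for (int i = n - 1; i > 0; i--) {
                int j = rng.nextInt(i + 1);
                int tmp = arr[i];
                arr[i] = arr[j];
                arr[j] = tmp;
            }
            boolean isPrefixMax = true;
            for (int i = 0; i < pos - 1; i++) {
                if (arr[i] > arr[pos - 1]) {
                    isPrefixMax = false;
                    break;
                }
            }
            if (isPrefixMax) {
                hits++;
            }
        }
        return (double) hits / trials;
    }

    public static void main(String[] args) {
        // Each estimate should be close to 1/pos.
        for (int pos = 1; pos <= 4; pos++) {
            System.out.printf("position %d: %.4f%n", pos, estimate(10, pos, 200000, 42L));
        }
    }
}
```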
\[x_1, \qquad x_2, \qquad x_3, \qquad x_4, \qquad \dots\]
- \(x_1\): have to do one assignment; expected number of assignments so far: 1
- \(x_2\): have to do one assignment if \(x_2 > x_1\), 1 in 2 chance; expected number of assignments so far: 1 + 0.5
- \(x_3\): have to do one assignment if \(x_3 > x_1\) and \(x_3 > x_2\), 1 in 3 chance; expected number of assignments so far: 1 + 0.5 + 0.33
- \(x_4\): have to do one assignment if \(x_4 > x_1\), \(x_4 > x_2\), and \(x_4 > x_3\), 1 in 4 chance; expected number of assignments so far: 1 + 0.5 + 0.33 + 0.25
- More generally, the expected number of assignments is
\[ 1 + \left(1 \times \frac{1}{2}\right) + \left(1 \times\frac{1}{3}\right) + \cdots + \left(1 \times \frac{1}{n}\right) = \sum_{i=1}^n \frac{1}{i} \le \log n + 1\]
(this sum is a partial sum of the harmonic series)
- Therefore the expected number of assignments is \(O(\log n)\), which is better than the worst case of \(O(n)\) assignments (the comparisons are still needed in every case)
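To see how slowly the harmonic sum grows, it can be evaluated directly and compared against the bound. This is a small sketch with hypothetical names (`HarmonicBound`, `harmonic`). It uses the natural logarithm for the bound, and since \(\ln n \le \log_2 n\) the bound then also holds with a base-2 logarithm.

```java
public class HarmonicBound {
    // Returns 1 + 1/2 + ... + 1/n, the expected number of assignments
    // for a uniform random permutation of n distinct elements.
    public static double harmonic(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += 1.0 / i;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sizes = {4, 12, 1000, 1000000};
        for (int n : sizes) {
            // Columns: n, harmonic sum, upper bound ln(n) + 1.
            System.out.printf("%d | %.3f | %.3f%n", n, harmonic(n), Math.log(n) + 1);
        }
    }
}
```

For n = 4 the sum equals 50/24, matching the average obtained by listing all 24 permutations.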
- Some of you may have thought at the start that the average complexity of this algorithm is also \(O(n)\)
- you may have been influenced by the average-case complexity of linear search, which is \(O(n)\), the same as its worst-case complexity
- note that in the case of linear search, each element in the array has an equal chance of being the element we are looking for, so the expected number of operations is
\[ \frac{1}{n} + \frac{2}{n} + \frac{3}{n} + \cdots + \frac{n}{n} = \frac{1}{n}\sum_{i=1}^{n} i\]
\[= \frac{1}{n} \left(\frac{n(n+1)}{2}\right) = \frac{(n+1)}{2} = O(n)\]
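The \((n+1)/2\) figure can be confirmed by averaging over every possible target. The sketch below assumes the target is always present and equally likely to be at any position. The class `LinearSearchAverage` and its methods are hypothetical names used only for illustration.

```java
public class LinearSearchAverage {
    // Number of elements inspected by a linear search for target.
    // Returns arr.length if the target is absent.
    public static int comparisons(int[] arr, int target) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == target) {
                return i + 1;
            }
        }
        return arr.length;
    }

    // Average number of inspections when each of the n elements
    // is equally likely to be the target.
    public static double averageComparisons(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) {
            arr[i] = i + 1;
        }
        long total = 0;
        for (int target = 1; target <= n; target++) {
            total += comparisons(arr, target);
        }
        return (double) total / n;
    }

    public static void main(String[] args) {
        System.out.println(averageComparisons(4)); // prints 2.5
        System.out.println(averageComparisons(9)); // prints 5.0
    }
}
```

Unlike the assignment count in the max-finding loop, this average grows linearly with n.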
Closing Words
- Finding the average-case complexity of an algorithm is not a trivial task because we have to make assumptions about the distribution of the input data, and this is often beyond the scope of a computing unit
- it is something you can pick up in a STAT or MATH unit
- In this unit, we are only going to use the uniform probability distribution that was mentioned in the previous lecture, and you should already have a good understanding of how that works