COMP3010: Algorithm Theory and Design

Daniel Sutantyo,  Department of Computing, Macquarie University

6.2 - Substitution Method

Substitution Method

  • It's another method for solving recurrences to find the time complexity of an algorithm
  • It's done using induction
    • we need to guess the solution, and then prove that it is correct 
    • what can we use to make this guess? 
      • the recursion-tree method
  • Why?
    • sometimes we may not be able to draw the recursion tree nicely
    • sometimes we don't need to be that exact (an \(\Omega\) or \(O\) bound is enough)
      • notice that with the recursion-tree method, we're often counting the exact number of operations (up to a constant factor), i.e. \(\Theta(g(n))\)

  • What it is not:
    • it does not mean repeatedly substituting the recurrence into itself, like this:
    • \(T(n) = 2T(n/2) + n\)
    •          \(=2(2T(n/4) + n/2) + n\)
    •          \(=2(2(2T(n/8)+n/4)+n/2)+n\)
    •          ... (some magic)
    •          \(= n\log n\)
  • If you are asked to prove something using the substitution method, then you need to use induction

  • There are only two steps:
    • guess the time complexity of the algorithm
    • use induction to find the constants and show that the time complexity we guessed is correct

Mergesort

\(T(n) = \begin{cases} \Theta(1) &\text{if $n = 1$}\\ 2T(n/2) + cn &\text{if $n > 1$} \end{cases}\)

  • Step 1: Guess the time complexity
    • draw the recursion tree and estimate the number of operations
    • try different bounds and see which one works (e.g. try \(n^2\), then \(n\), then \(n \log n\))
    • heuristic or experience (is there another algorithm that is similar?)
    • use the master theorem (next lecture)
  • For mergesort, let's guess
    • \(T(n) = O(n \log n)\) (let's use log base 2)
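One way to try out the candidate bounds above is to evaluate the recurrence numerically and watch the ratio of \(T(n)\) to each candidate. The following is a minimal sketch, assuming \(T(1) = 1\), a cost of exactly \(n\) per call (the constant in the recurrence taken to be 1), and \(n/2\) rounded down. The names `T` and `ratios` are illustrative only.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n: int) -> int:
    """Mergesort-style recurrence: T(1) = 1, T(n) = 2*T(n // 2) + n (assumed constants)."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


def ratios(n: int) -> tuple[float, float, float]:
    """Return T(n) divided by the three candidate bounds n, n*log2(n) and n**2."""
    return T(n) / n, T(n) / (n * math.log2(n)), T(n) / n**2


if __name__ == "__main__":
    for k in (4, 8, 12, 16):
        n = 2**k
        by_n, by_nlogn, by_n2 = ratios(n)
        print(f"n = 2^{k}: T/n = {by_n:.2f}, T/(n log n) = {by_nlogn:.3f}, T/n^2 = {by_n2:.6f}")
```

If the guess is reasonable, the middle ratio should settle near a constant while \(T(n)/n\) keeps growing and \(T(n)/n^2\) shrinks toward 0. This is only evidence for the guess; the induction proof is still required.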

  • Step 2: Use induction to show that mergesort is \(O(n \log_2 n)\)
  • Prove \(T(n) = O(n\log n)\) by induction
    • i.e. prove there exists \(c > 0\) and \(n_0 > 0\) such that \(T(n) \le cn\log n\) for all \(n > n_0\)
    • start with the base case
  • Base case: \(n = 1\)
    • LHS: \(T(1) = 1\)
    • RHS: \(cn\log n = c * 1 * \log 1 = 0\)
    • so \(T(1) \le cn\log n\) fails at \(n = 1\), whatever \(c\) we pick

\(T(n) = \begin{cases} 1 &\text{if $n = 1$}\\ 2T(n/2) + cn &\text{if $n > 1$} \end{cases}\)

  • we don't have to start from 1!
  • Base case (taking the constant in the recurrence to be 1 and rounding \(n/2\) down):
    • \(n = 2\)
      • LHS: \(T(2) = 2*T(1)+2 = 4\)
      • RHS: \(cn\log n =2c * \log 2\)
    • \(n = 3\)
      • LHS: \(T(3) = 2*T(1)+3 = 5\)
      • RHS: \(cn\log n =3c * \log 3\)

  • \(n = 4\)
    • LHS: \(T(4) = 2*T(2)+4 = 12\)
    • RHS: \(cn\log n =4c * \log 4\)
  • \(n = 5\)
    • LHS: \(T(5) = 2*T(2)+5 = 13\)
    • RHS: \(cn\log n =5c * \log 5\)
  • \(n = 6\)
    • LHS: \(T(6) = 2*T(3)+6 = 16\)
    • RHS: \(cn\log n =6c * \log 6\)

  • once you get to \(n = 4\) onward, you don't need to depend on \(T(1)\) anymore
  • so the only base cases we need are \(n = 2\) and \(n = 3\), and both are true for \(c \ge 2\):
    • \(n = 2\): \(4 \le 2c * \log 2 = 2c\)
    • \(n = 3\): \(5 \le 3c * \log 3 \approx 4.75c\)
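The smallest constant that satisfies each base case can be computed directly. This is a small sketch, assuming log base 2 and the base-case values \(T(2) = 4\) and \(T(3) = 5\) computed above. The helper name `min_constant` is made up for illustration.

```python
import math


def min_constant(n: int, t_n: int) -> float:
    """Smallest c with t_n <= c * n * log2(n), valid for n >= 2."""
    return t_n / (n * math.log2(n))


# Base-case values taken from the slides: T(2) = 4 and T(3) = 5.
base_cases = {2: 4, 3: 5}
needed = {n: min_constant(n, t_n) for n, t_n in base_cases.items()}

# The constant must work for every base case, so take the largest requirement.
c = max(needed.values())

if __name__ == "__main__":
    print(needed, c)
```

The case \(n = 2\) is the binding one, which is why the slides settle on \(c \ge 2\).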

  • Prove \(T(n) \le cn\log n\) for all \(n > n_0\) for some \(c > 0\) and \(n_0 > 0\)
  • Induction step:
    • Induction hypothesis: assume \(T(k) \le ck\log k\) for some \(k > 1\)
    • Prove: \(T(2k)\le 2ck\log 2k\)

\(T(2k) = 2*T(k) + 2ck\) (from the recurrence)

\(\le 2ck\log k + 2ck\) (from the induction hypothesis)

\(\le 2ck\log 2k + 2ck\)

  • we want to prove that
    • \(T(2k) \le 2ck\log{2k}\)
  • but what we can prove is
    • \(T(2k) \le 2ck \log 2k + 2ck\)
  • these are not the same, because it could be the case that
    • \(2ck\log 2k < T(2k) \le 2ck\log 2k + 2ck\)
  • the problem is the positive term \(2ck\)

  • Prove \(T(n) \le cn\log n\) for all \(n > n_0\) for some \(c > 0\) and \(n_0 > 0\)
  • Induction step:
    • Induction hypothesis: assume \(T(\frac{k}{2}) \le \frac{ck}{2}\log\frac{k}{2}\) for some \(k > 1\)
    • Prove: \(T(k)\le ck\log k\)

\(T(k) = 2*T(\frac{k}{2}) + ck\) (from the recurrence)

\(\le ck\log \frac{k}{2} + ck\) (from the induction hypothesis)

\(= ck\log k - ck\log2 + ck = ck\log k\) (since we're using log base 2)
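The base-case values earlier were computed with the constant in the recurrence equal to 1, while the induction step uses the same letter \(c\) for both the recurrence and the bound. As a sketch of how the step reads when the two are kept apart, write \(a\) for the constant in the recurrence (a name introduced here for illustration, not part of the original slides) and keep \(c\) for the constant in the bound:

```latex
\begin{align*}
T(k) &= 2\,T\!\left(\tfrac{k}{2}\right) + ak
      && \text{(from the recurrence)} \\
     &\le 2 \cdot \tfrac{ck}{2}\log\tfrac{k}{2} + ak
      && \text{(from the induction hypothesis)} \\
     &= ck\log k - ck + ak
      && \text{(since } \log 2 = 1 \text{ in base 2)} \\
     &\le ck\log k
      && \text{(whenever } c \ge a \text{)}
\end{align*}
```

Under this reading, any \(c \ge a\) closes the induction step, and \(c\) must also be large enough for the base cases.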

Linear Search

\(T(n) = \begin{cases} 1 &\text{if $n \le 2$}\\3T(n/3) + 2 &\text{if $n > 2$} \end{cases}\)

  • Use induction to show that linear search is \(O(n)\)
  • Prove \(T(n) = O(n)\) by induction
    • i.e. prove there exists \(c > 0\) and \(n_0 > 0\) such that \(T(n) \le cn\) for all \(n > n_0\)
    • start with the base case
  • Base case: \(n = 1\)
    • LHS: \(T(1) = 1\)
    • RHS: \(cn = c * 1 = c\)

  • \(n = 2\)
    • LHS: \(T(2) = 1\)
    • RHS: \(cn = 2c\)
  • \(n = 3\)
    • LHS: \(T(3) = 3*T(1) + 2 = 5\)
    • RHS: \(cn= 3c\)
  • \(n = 4\)
    • LHS: \(T(4) = 3*T(1) + 2 = 5\) (rounding \(n/3\) down)
    • RHS: \(cn=4c\)

  • Prove \(T(n) \le cn\) for all \(n > n_0\) for some \(c > 0\) and \(n_0 > 0\)
  • Induction step:
    • Induction hypothesis: assume \(T(k/3) \le ck/3\) for some \(k > 1\)
    • Prove: \(T(k)\le ck\)

\(T(k) = 3T(k/3) + 2\) (from the recurrence)

\(\le 3(ck/3) + 2\) (from the induction hypothesis)

\(= ck + 2\)

can we say \(T(k) \le ck\)? No: \(ck + 2 > ck\), so this only shows \(T(k) \le ck + 2\)

  • So let's try to prove the STRONGER bound
  • Prove \(T(n) \le cn-d\) for all \(n > n_0\) for some \(c > 0\), \(d > 0\) and \(n_0 > 0\)
  • Induction step:
    • Induction hypothesis: assume \(T(k/3) \le ck/3-d\) for some \(k > 1\)
    • Prove: \(T(k)\le ck - d\)

\(T(k) = 3T(k/3) + 2\) (from the recurrence)

\(\le 3(ck/3 - d) + 2\) (from the induction hypothesis)

\(= ck -3d  + 2\)

can we say \(T(k) \le ck - d\) now? Yes, for \(d \ge 1\), since then \(-3d + 2 \le -d\)

so \(T(n) \le cn - 1\), hence \(T(n)= O(n)\)
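The stronger bound can be checked numerically before trusting the algebra. This is a minimal sketch, assuming \(n/3\) is rounded down and picking \(c = 2\), \(d = 1\). Those two values are choices made here for illustration; the slides only require \(d \ge 1\).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n: int) -> int:
    """T(n) = 1 for n <= 2, otherwise 3*T(n // 3) + 2 (n/3 rounded down, an assumption)."""
    if n <= 2:
        return 1
    return 3 * T(n // 3) + 2


def stronger_bound_holds(c: int, d: int, limit: int) -> bool:
    """Check T(n) <= c*n - d for every n from 1 to limit."""
    return all(T(n) <= c * n - d for n in range(1, limit + 1))


if __name__ == "__main__":
    print(stronger_bound_holds(c=2, d=1, limit=10_000))
    # On exact powers of 3 the bound is met with equality.
    print([(3**k, T(3**k), 2 * 3**k - 1) for k in range(5)])
```

A finite check like this cannot replace the induction proof, but it is a quick way to catch a wrong choice of \(c\) or \(d\).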

Subtleties

  • Try proving a stronger bound instead (e.g. \(T(n) \le cn - d\))
  • Try changing the induction hypothesis (e.g. from \(2k \rightarrow k\) to  \(k \rightarrow k/2\))
  • Showing that \(T(n) \le cn + 1\) does not prove that \(T(n) \le cn\)

Another example

\(T(n) = \begin{cases} 1 &\text{if $n \le 3$}\\3T({n/4}) + n^2 &\text{if $n > 3$} \end{cases}\)

  • Guess the complexity: \(O(n^2)\)
  • Prove \(T(n) = O(n^2)\) by induction
    • i.e. prove there exists \(c > 0\) and \(n_0 > 0\) such that \(T(n) \le cn^2\) for all \(n > n_0\)
    • start with the base case
  • Base case: \(n = 1\) (and also \(n=2, 3\))
    • LHS: \(T(1) = 1\)
    • RHS: \(cn^2 = c * 1 = c\)

  • \(n = 4\)
    • LHS: \(T(4) = 3 * T(1) + 16 = 19\)
    • RHS: \(cn^2 = c * 16\)
  • \(n = 8\)
    • LHS: \(T(8) = 3*T(2) + 64 = 67\)
    • RHS: \(cn^2 = c * 64\)
  • \(n = 16\)
    • LHS: \(T(16) = 3*T(4) + 256 = 313\)
    • RHS: \(cn^2=c * 256\)

  • Induction step:
    • Induction hypothesis: assume \(T(\frac{k}{4}) \le c\left(\frac{k}{4}\right)^2\) for some \(k > 1\)
    • Prove: \(T(k)\le ck^2\)

\(T(k) = 3T(\frac{k}{4}) + k^2\) (from the recurrence)

\(\le 3ck^2/16 + k^2\) (from the induction hypothesis)

\(= (3c/16+1)k^2\)

Therefore \(T(k) \le c k^2\) with \(c = 2\), since then \(3c/16+1 = 11/8 \le c\); hence \(T(k) = O(k^2)\)
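The choice \(c = 2\) can be sanity-checked by evaluating the recurrence directly. This is a rough sketch, assuming \(n/4\) is rounded down; the helper name `bound_holds` is made up for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n: int) -> int:
    """T(n) = 1 for n <= 3, otherwise 3*T(n // 4) + n**2 (n/4 rounded down, an assumption)."""
    if n <= 3:
        return 1
    return 3 * T(n // 4) + n * n


def bound_holds(c: int, limit: int) -> bool:
    """Check T(n) <= c * n**2 for every n from 1 to limit."""
    return all(T(n) <= c * n * n for n in range(1, limit + 1))


if __name__ == "__main__":
    print(bound_holds(c=2, limit=5_000))  # the constant used in the proof
    print(bound_holds(c=1, limit=10))     # c = 1 is too small
```

With \(c = 1\) the check already fails at \(n = 4\), where \(T(4) = 19 > 16\), which matches the requirement \(3c/16 + 1 \le c\) from the induction step.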

Summary

  • Another proof by induction
  • The result is not as precise as the recursion-tree method
    • we normally use induction to show an upper or lower bound (i.e. \(O\) or \(\Omega\)), not a tight bound (i.e. \(\Theta\))
  • The kind of question you may expect: 
    • use the recursion-tree method to guess the complexity of an algorithm
    • use induction to prove that this is correct
