COMP3010: Algorithm Theory and Design

Daniel Sutantyo, Department of Computing, Macquarie University

5.2 - Greedy Algorithm - Activity-Selection Problem

Problem description

5.2 - Activity-Selection Problem

  • You have a list of \(n\) activities you can perform, but all of these activities use the same resource, so you cannot perform two of these activities at the same time
    • Examples:
      • one classroom for multiple classes
      • one doctor, multiple appointments
  • You want to do as many activities as you can. Which ones should you choose?

[Figure: timeline of the activities; horizontal axis: time, 0 to 16]

  • When solving a problem, try to construct examples that break your simple candidate solutions, so you don't spend too much time working on a wrong approach

Example (11 activities, listed in increasing order of finish time):

activity    1    2    3    4    5    6    7    8    9   10   11
start       1    3    0    5    3    5    6    8    8    2   12
finish      4    5    6    7    9    9   10   11   12   14   16

  • Let \(S = \{a_1, a_2, \dots, a_n\}\) be the set of \(n\) activities ordered by their finish times (in increasing order)
  • Each activity \(a_i\) has start time \(s_i\) and finish time \(f_i\), \(0 \le s_i < f_i\) 
    • \(a_i\) takes place during half-open time interval \([s_i,f_i)\)
      • \(x \in [s_i,f_i)\) means \(s_i \le x < f_i\)
    • activities \(a_i\) and \(a_j\) are compatible if \([s_i,f_i)\) and \([s_j,f_j)\) do not overlap
      • i.e. \(f_i \le s_j\) or \(s_i \ge f_j\)
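
The compatibility test can be sketched in Java as follows. The class name `Compatibility` and the method name `compatible` are illustrative assumptions, not part of the lecture code. The check follows the half-open interval definition: an activity that finishes exactly when another starts does not clash with it.

```java
// Illustrative sketch: the names here are assumptions, not lecture code.
final class Compatibility {
    // Activities occupy half-open intervals [s, f), so two activities are
    // compatible when one finishes no later than the other starts.
    static boolean compatible(int si, int fi, int sj, int fj) {
        return fi <= sj || fj <= si;
    }

    public static void main(String[] args) {
        System.out.println(compatible(1, 4, 5, 7)); // a_1 = [1,4) and a_4 = [5,7)
        System.out.println(compatible(3, 5, 5, 7)); // a_2 = [3,5) and a_4 touch at time 5
        System.out.println(compatible(1, 4, 3, 5)); // a_1 and a_2 overlap on [3,4)
    }
}
```

The first two calls describe compatible pairs and the third an incompatible pair.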

Problem description (formal)

  • Input:
    • The set \(S = \{a_1, a_2, \dots, a_n\}\) of \(n\) activities, sorted in increasing order of finish time
    • The start time \(s_i\) and finish time \(f_i\) for each activity \(a_i\)
  • Output:
    • A maximum-size subset \(A = \{a_{\ell_1}, a_{\ell_2}, \dots, a_{\ell_m}\} \subseteq S\) of mutually compatible activities, where two activities \(a_i\) and \(a_j\) are compatible if \([s_i,f_i)\) and \([s_j,f_j)\) do not overlap, i.e. \(f_i \le s_j\) or \(s_i \ge f_j\)

Problems and Subproblems

[Figure: decision tree over three activities; each level decides whether to pick one activity]

  • start: \([\,]\)
  • pick \(a_1\)? \([a_1]\), \([\,]\)
  • pick \(a_2\)? \([a_1, a_2]\), \([a_1]\), \([a_2]\), \([\,]\)
  • pick \(a_3\)? \([a_1, a_2, a_3]\), \([a_1, a_2]\), \([a_1, a_3]\), \([a_1]\), \([a_2, a_3]\), \([a_2]\), \([a_3]\), \([\,]\)

  • Let's call the problem \(\text{select}(S)\), where \(S\) is a set of activities (remember that we want to return a set of compatible activities)

[Figure: search tree rooted at \(\text{select}(S)\); deciding whether to pick \(a_1\) leads to \(\text{select}(S\setminus\{a_1\})\), and then deciding whether to pick \(a_2\) leads to \(\text{select}(S\setminus\{a_1,a_2\})\)]
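
The "pick it or not" search tree can be sketched as a recursion that tries both branches for every activity. This is a minimal illustration, not the lecture's code: the class `ExhaustiveSelect` and the helper `fits` are hypothetical names, and the data is the 11-activity example with activity \(a_i\) stored at index \(i-1\).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the exhaustive search tree; names are assumptions.
final class ExhaustiveSelect {
    // Example data: activity a_i is stored at index i - 1.
    static final int[] S = {1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
    static final int[] F = {4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};

    // Returns a largest compatible set (as 1-based activity numbers) that
    // extends `chosen` using only the activities at indices i, i+1, ...
    static List<Integer> select(int i, List<Integer> chosen) {
        if (i == S.length) return new ArrayList<>(chosen);
        // Branch 1: do not pick activity i.
        List<Integer> best = select(i + 1, chosen);
        // Branch 2: pick activity i, if it is compatible with everything chosen.
        if (fits(i, chosen)) {
            chosen.add(i + 1);
            List<Integer> withI = select(i + 1, chosen);
            chosen.remove(chosen.size() - 1); // undo the choice (removes by index)
            if (withI.size() > best.size()) best = withI;
        }
        return best;
    }

    // True if activity i overlaps none of the already chosen activities.
    static boolean fits(int i, List<Integer> chosen) {
        for (int c : chosen) {
            int j = c - 1;
            if (!(F[i] <= S[j] || F[j] <= S[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(select(0, new ArrayList<>()).size());
    }
}
```

The tree has up to \(2^n\) leaves, which is why the later slides look for a better way to describe the subproblems.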

  • We can do better: if we pick \(a_k\), then we cannot pick another activity that runs at the same time, i.e. any other activity we choose must either finish before \(a_k\) starts or start after \(a_k\) finishes
  • Let \(S_{i,j}\) be the set of activities that start after activity \(a_i\) finishes and finish before activity \(a_j\) starts

[Figure: timeline from 0 to 16 showing \(S_{1,11}\), the activities that lie between \(a_1\) and \(a_{11}\)]

  • So once we pick an activity \(a_k\), the remaining choices come from \(S_{?,k}\) and \(S_{k,?}\)
    • hmm, how does this notation work?

[Figure: timeline from 0 to 16 showing \(a_4\), with \(S_{?,4}\) before it and \(S_{4,?}\) after it]

  • Here's the problem: which values of \(i\) and \(j\) make \(S_{i,j}\) denote all the activities before \(a_k\), or all the activities after \(a_k\)?
    • or more simply: how do you denote the set of all activities?
      • we have 11 activities, but \(S_{1,11}\) only denotes the activities that start after activity 1 finishes and finish before activity 11 starts, so it excludes \(a_1\) and \(a_{11}\) themselves
    • solution: add a couple of dummy (phantom) activities:
      • \(a_0\), which finishes before any activity starts
      • \(a_{12}\), which starts after \(a_{11}\) finishes
    • so
      • \(S_{0,k}\) denotes all the activities that finish before \(a_k\) starts (but start after the phantom activity \(a_0\) finishes)
      • \(S_{k,12}\) denotes all the activities that start after \(a_k\) finishes (but finish before the phantom activity \(a_{12}\) starts)
      • \(S_{0,12}\) denotes the set of all activities

[Figure: timeline from 0 to 16 showing \(a_4\), with \(S_{0,4}\) before it and \(S_{4,12}\) after it]

  • Make sure you understand the notation, which may be a little unusual: e.g. \(S_{0,2}\) is NOT \(\{a_1\}\); it is actually empty (because no activity finishes before \(a_2\) starts)
  • \(S_{0,1}\) and \(S_{11,12}\) are empty sets
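
To check the notation against the example data, \(S_{i,j}\) can be computed directly. The sketch below is illustrative only: the class `Subproblem` and method `s` are hypothetical names, and it assumes the phantom \(a_0\) has \(f_0 = 0\) and the phantom \(a_{12}\) starts at a time later than every finish time (modelled with `Integer.MAX_VALUE`). "Starts after \(a_i\) finishes" is read with the half-open intervals, so starting exactly at \(f_i\) counts.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: names and phantom values are assumptions.
final class Subproblem {
    static final int INF = Integer.MAX_VALUE;
    // Index 0 is the phantom a_0, indices 1..11 are the real activities,
    // index 12 is the phantom a_12.
    static final int[] S = {0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12, INF};
    static final int[] F = {0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16, INF};

    // S_{i,j}: activities that start after a_i finishes and finish before a_j starts.
    static List<Integer> s(int i, int j) {
        List<Integer> result = new ArrayList<>();
        for (int k = 1; k <= 11; k++) {
            if (S[k] >= F[i] && F[k] <= S[j]) result.add(k);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("S_{0,2}  = " + s(0, 2));
        System.out.println("S_{0,4}  = " + s(0, 4));
        System.out.println("S_{4,12} = " + s(4, 12));
        System.out.println("S_{0,12} = " + s(0, 12));
    }
}
```

With this data \(S_{0,2}\) comes out empty, \(S_{0,4}\) contains \(a_1\) and \(a_2\), \(S_{4,12}\) contains \(a_8\), \(a_9\) and \(a_{11}\), and \(S_{0,12}\) contains all 11 activities.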

  • The search tree can be written more nicely now:

\(\text{select}(S_{0,12})\)
  • pick \(a_1\): \(\text{select}(S_{0,1}) + \text{select}(S_{1,12})\)
  • pick \(a_2\): \(\text{select}(S_{0,2}) + \text{select}(S_{2,12})\)
  • \(\dots\)
  • pick \(a_{10}\): \(\text{select}(S_{0,10}) + \text{select}(S_{10,12})\)
  • pick \(a_{11}\): \(\text{select}(S_{0,11}) + \text{select}(S_{11,12})\)

  • Of course you can generalise this:
    • if we start with \(S_{i,j}\) and pick \(a_k\), we have the subproblems \(S_{i,k}\) and \(S_{k,j}\)

\(\text{select}(S_{i,j})\)
  • pick \(a_{i+1}\): \(\text{select}(S_{i,i+1}) + \text{select}(S_{i+1,j})\)
  • pick \(a_{i+2}\): \(\text{select}(S_{i,i+2}) + \text{select}(S_{i+2,j})\)
  • \(\dots\)
  • pick \(a_{k}\): \(\text{select}(S_{i,k}) + \text{select}(S_{k,j})\)
  • \(\dots\)
  • pick \(a_{j-1}\): \(\text{select}(S_{i,j-1}) + \text{select}(S_{j-1,j})\)

  • Finally, let \(A_{i,j}\) be the optimal solution to \(\text{select}(S_{i,j})\), that is, \(A_{i,j}\) is a maximum-size set of mutually compatible activities in \(S_{i,j}\)

[Figure: timeline from 0 to 16 showing \(S_{2,11}\) and a solution \(A_{2,11}\), with \(a_4\) and \(a_8\) marked]

Optimal Substructure

  • Let \(A_{i,j}\) be an optimal solution for \(S_{i,j}\) and suppose that \(a_k \in A_{i,j}\)
    • since we picked \(a_k\), we are left with the subproblems \(S_{i,k}\) and \(S_{k,j}\)
    • let \(A_{i,k} = A_{i,j} \cap S_{i,k}\) and \(A_{k,j} = A_{i,j}\cap S_{k,j}\) be the parts of \(A_{i,j}\) that fall in these subproblems
  • i.e. \(A_{i,j} = A_{i,k} \cup \{a_k\} \cup A_{k,j}\), and the size of the optimal solution is \(|A_{i,j}| = |A_{i,k}| + 1 + |A_{k,j}|\)
  • Claim: \(A_{i,k}\) and \(A_{k,j}\) are optimal solutions to \(\text{select}(S_{i,k})\) and \(\text{select}(S_{k,j})\)

  • Suppose there is a better solution for the subproblem \(S_{i,k}\), say \(A^\prime_{i,k}\)
  • This means that \(|A^\prime_{i,k}| > |A_{i,k}|\), so \(A^\prime_{i,k} \cup \{a_k\} \cup A_{k,j}\) is a set of compatible activities in \(S_{i,j}\) of size \(|A^\prime_{i,k}| + 1 + |A_{k,j}| > |A_{i,j}|\)
    • this contradicts the assumption that \(A_{i,j}\) is an optimal solution to the problem
  • We can use a symmetric argument for the solution to the subproblem \(S_{k,j}\)

Recursive Relation

\[|A_{i,j}| = \begin{cases}\displaystyle{\max_{a_k\in S_{i,j}}} \left(|A_{i,k}| + 1 + |A_{k,j}|\right) & \text{if $S_{i,j}$ is not empty}\\             0  & \text{if $S_{i,j}$ is empty} \end{cases}\]
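
The recurrence can be evaluated directly with memoisation. The sketch below is an illustration under stated assumptions, not the lecture's implementation: the class `ActivityDp` and method `size` are hypothetical names, the data is the 11-activity example, and the phantom activities are modelled with \(f_0 = 0\) and a start time of `Integer.MAX_VALUE` for \(a_{12}\). Because the activities are sorted by finish time, any \(a_k \in S_{i,j}\) has \(i < k < j\), so the loop only scans that range.

```java
import java.util.Arrays;

// Illustrative sketch of the recurrence for |A_{i,j}|; names are assumptions.
final class ActivityDp {
    static final int INF = Integer.MAX_VALUE;
    // Index 0 and index 12 are the phantom activities a_0 and a_12.
    static final int[] S = {0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12, INF};
    static final int[] F = {0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16, INF};
    static final int[][] MEMO = new int[13][13];
    static {
        for (int[] row : MEMO) Arrays.fill(row, -1); // -1 means "not computed yet"
    }

    // Returns |A_{i,j}|, the size of a largest compatible subset of S_{i,j}.
    static int size(int i, int j) {
        if (MEMO[i][j] >= 0) return MEMO[i][j];
        int best = 0; // stays 0 when S_{i,j} is empty
        for (int k = i + 1; k < j; k++) {
            boolean inSij = S[k] >= F[i] && F[k] <= S[j];
            if (inSij) {
                best = Math.max(best, size(i, k) + 1 + size(k, j));
            }
        }
        MEMO[i][j] = best;
        return best;
    }

    public static void main(String[] args) {
        System.out.println(size(0, 12)); // size of an optimal solution for all activities
    }
}
```

For the example data `size(0, 12)` evaluates to 4, matching solutions such as \(\{a_1, a_4, a_8, a_{11}\}\).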

Greedy Choice

  • What is the greedy choice?
    • we want to maximise the number of activities
    • which activity, if chosen, leaves as much of the resource available as possible so that we can fit in more activities?

  • Pick the one that finishes earliest
  • Remember that the activities are sorted by finish time, so \(a_1\) finishes no later than \(a_2\)

  • Alternatively, pick the one that starts latest
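
The latest-start rule is the mirror image of the earliest-finish rule: scan the activities from latest start to earliest start, and keep an activity whenever it finishes no later than the most recently chosen one starts. The sketch below is a minimal illustration; the class `LatestStartGreedy` and its `select` method are hypothetical names, and activity \(a_i\) is assumed to be stored at index \(i-1\).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the "starts latest" greedy rule; names are assumptions.
final class LatestStartGreedy {
    // s[i] and f[i] describe activity a_{i+1}; returns 1-based activity numbers.
    static List<Integer> select(int[] s, int[] f) {
        Integer[] order = new Integer[s.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Consider the activities from latest start to earliest start.
        Arrays.sort(order, (a, b) -> Integer.compare(s[b], s[a]));

        List<Integer> chosen = new ArrayList<>();
        int nextStart = Integer.MAX_VALUE; // start time of the most recently chosen activity
        for (int i : order) {
            if (f[i] <= nextStart) { // finishes before the chosen activity starts
                chosen.add(i + 1);
                nextStart = s[i];
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] s = {1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
        int[] f = {4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};
        System.out.println(select(s, f).size());
    }
}
```

On the example data this rule also selects four activities, the same number as the earliest-finish rule.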

  • In dynamic programming, the hard part is working out what the subproblems are and then defining the recursive relation
    • overlapping subproblems are usually easy to spot
    • proofs of optimal substructure are all very similar
  • In a greedy algorithm, the hard part is working out what the greedy choice is
    • can you think of a bad greedy choice for the activity-selection problem?
    • we mentioned this at the start
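
One candidate for a bad greedy choice is "pick the activity that starts earliest". The text here does not spell out which rule was mentioned earlier, so treat this choice as an assumption made for illustration. The sketch below uses hypothetical names (`EarliestStartGreedy`, `select`), assumes start times are non-negative, and runs the rule on the 11-activity example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of a poor greedy rule (earliest start first); names are assumptions.
final class EarliestStartGreedy {
    // s[i] and f[i] describe activity a_{i+1}; returns 1-based activity numbers.
    static List<Integer> select(int[] s, int[] f) {
        Integer[] order = new Integer[s.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Consider the activities from earliest start to latest start.
        Arrays.sort(order, (a, b) -> Integer.compare(s[a], s[b]));

        List<Integer> chosen = new ArrayList<>();
        int lastFinish = 0; // start times are assumed to be non-negative
        for (int i : order) {
            if (s[i] >= lastFinish) { // starts after the last chosen activity finishes
                chosen.add(i + 1);
                lastFinish = f[i];
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] s = {1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
        int[] f = {4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};
        System.out.println(select(s, f));
    }
}
```

On the example this rule commits to \(a_3 = [0,6)\) first and ends up with only three activities (\(a_3\), \(a_7\), \(a_{11}\)), one fewer than the optimum of four.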

  • Remember the steps (this is the first approach):
    • Show that the problem has optimal substructure
    • Give the recursive relation for the optimal solution
    • Show that if we make the greedy choice, only one subproblem remains
    • Show that it is safe to make the greedy choice
    • Find the value of the optimal solution (run the algorithm)
    • Convert the recursive solution into an iterative one

  • Alternatively:
    • Show that it is safe to make a greedy choice that leaves only one subproblem to solve
    • Show that the problem has optimal substructure after we make the greedy choice
    • Find the value of the optimal solution (run the algorithm)
    • Convert the recursive solution into an iterative one

  • Show that if we make the greedy choice then only one subproblem remains:
    • if we pick activity \(a_k\) from \(S_{i,j}\), then we need to find the solutions to the subproblems \(S_{i,k}\) and \(S_{k,j}\)
    • the greedy choice is \(a_{\ell}\), the activity in \(S_{i,j}\) that finishes first
    • no activity in \(S_{i,j}\) finishes before \(a_\ell\) starts, so \(S_{i,\ell} = \emptyset\)
    • therefore there is only one subproblem left to solve, \(S_{\ell,j}\)

  • Intuition:
    • pick the activity that finishes first, i.e. \(a_1\)
      • recall that the activities are sorted according to finish time, in increasing order
    • suppose \(a_1\) is not in the optimal solution
      • there must be another activity, say \(a_k\), that is the first activity in the optimal solution
      • \(a_k\) finishes no earlier than \(a_1\), so every other activity in the solution starts after \(a_1\) finishes
      • so we can substitute \(a_1\) for \(a_k\) and still get the optimal number of activities

  • Theorem:
    • If \(a_m\) is an activity in \(S_{k,j}\) with the earliest finish time, then \(a_m\) is in some maximum-size subset of mutually compatible activities of \(S_{k,j}\)
  • Proof:
    • let \(A_{k,j}\) be a maximum-size subset of mutually compatible activities of \(S_{k,j}\), and let \(a_\ell\) be the activity in \(A_{k,j}\) with the earliest finish time
    • if \(a_\ell = a_m\), then we are done
    • if \(a_\ell \ne a_m\), then replace \(a_\ell\) with \(a_m\)
      • i.e. let \(A^\prime_{k,j} = (A_{k,j} \setminus \{a_\ell\}) \cup \{a_m\}\)
    • Since the activities in \(A_{k,j}\) are compatible and \(a_\ell\) is the first of them to finish, every other activity in \(A_{k,j}\) starts at or after \(f_\ell\); since \(f_m \le f_\ell\),
      • it follows that the activities in \(A^\prime_{k,j}\) are also compatible
    • Since \(|A^\prime_{k,j}| = | A_{k,j}|\),
      • it follows that \(A^\prime_{k,j}\) is also a maximum-size subset of compatible activities of \(S_{k,j}\)
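
The exchange step in the proof can be tried out on the example data. The sketch below is illustrative only: `ExchangeArgument`, `mutuallyCompatible` and `exchange` are hypothetical names, activity \(a_i\) is stored at index \(i\) (index 0 is unused), and the starting solution \(\{a_2, a_4, a_9, a_{11}\}\) is one optimal solution for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the exchange argument; names are assumptions.
final class ExchangeArgument {
    // Activity a_i is stored at index i; index 0 is unused.
    static final int[] S = {0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
    static final int[] F = {0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};

    // True if no two activities in the list overlap.
    static boolean mutuallyCompatible(List<Integer> acts) {
        for (int x = 0; x < acts.size(); x++) {
            for (int y = x + 1; y < acts.size(); y++) {
                int i = acts.get(x), j = acts.get(y);
                if (!(F[i] <= S[j] || F[j] <= S[i])) return false;
            }
        }
        return true;
    }

    // Replaces the earliest-finishing activity of `solution` with activity m.
    static List<Integer> exchange(List<Integer> solution, int m) {
        int l = solution.get(0);
        for (int a : solution) {
            if (F[a] < F[l]) l = a;
        }
        List<Integer> swapped = new ArrayList<>(solution);
        swapped.remove(Integer.valueOf(l)); // remove the value l, not index l
        swapped.add(m);
        return swapped;
    }

    public static void main(String[] args) {
        List<Integer> optimal = Arrays.asList(2, 4, 9, 11);
        List<Integer> swapped = exchange(optimal, 1); // a_1 finishes earliest overall
        System.out.println(swapped + " compatible: " + mutuallyCompatible(swapped));
    }
}
```

Swapping \(a_2\) for \(a_1\) keeps the set compatible and the same size, which is exactly what the proof claims.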

Recursive Solution

import java.util.HashSet;

// int s[] is the array containing start times
// int f[] is the array containing finish times
// remember that the arrays are sorted by finish time,
// so f[m] <= f[m+1], i.e. activity m finishes no later than activity m+1
// index 0 holds the phantom activity a_0 with f[0] = 0; the real activities are 1..n

// select(k,n) selects only activities that start after activity k finishes
// n is the index of the last activity; the initial call is select(0,n)
HashSet<Integer> select(int k, int n){
  int m = k+1;
  // find the first activity that starts at or after the time activity k finishes
  while (m <= n && s[m] < f[k])
    m++;
  // no compatible activity is left
  if (m > n)
    return new HashSet<>();
  // add activity m to the solution of the one remaining subproblem
  HashSet<Integer> ans = select(m,n);
  ans.add(m);
  return ans;
}

Iterative Solution

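import java.util.HashSet;

// int s[] is the array containing start times
// int f[] is the array containing finish times
// the activities are stored at indices 1..s.length-1, sorted by finish time
// (index 0 is the phantom activity)

HashSet<Integer> select(){
  int n = s.length - 1;    // number of real activities
  HashSet<Integer> answer = new HashSet<Integer>();
  if (n < 1)
    return answer;         // nothing to select
  answer.add(1);           // greedy choice: activity 1 finishes first
  int k = 1;               // the most recently selected activity
  for(int m = 2; m <= n; m++){
    // activity m is compatible if it starts at or after the time activity k finishes
    if (s[m] >= f[k]){
      answer.add(m);
      k = m;
    }
  }
  return answer;
}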
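
As a usage sketch, the recursive and iterative versions can be wrapped in one class and run on the 11-activity example. The class name `ActivitySelectionDemo` and the method names `recursive` and `iterative` are assumptions made for this illustration. Index 0 of each array is assumed to hold the phantom activity \(a_0\) with \(f_0 = 0\), and the iterative version here starts its scan from that phantom rather than adding activity 1 up front.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative harness; the names and the phantom at index 0 are assumptions.
final class ActivitySelectionDemo {
    // Index 0 is the phantom a_0 (f[0] = 0); activities a_1..a_11 follow,
    // sorted by finish time.
    static final int[] s = {0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
    static final int[] f = {0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};

    // Selects activities among k+1..n that start after activity k finishes.
    static Set<Integer> recursive(int k, int n) {
        int m = k + 1;
        while (m <= n && s[m] < f[k]) m++;
        if (m > n) return new HashSet<>();
        Set<Integer> ans = recursive(m, n);
        ans.add(m);
        return ans;
    }

    // Same greedy rule as a single left-to-right scan.
    static Set<Integer> iterative() {
        int n = s.length - 1;
        Set<Integer> answer = new HashSet<>();
        int k = 0; // start from the phantom activity
        for (int m = 1; m <= n; m++) {
            if (s[m] >= f[k]) {
                answer.add(m);
                k = m;
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        System.out.println(recursive(0, s.length - 1));
        System.out.println(iterative());
    }
}
```

Both methods select \(a_1\), \(a_4\), \(a_8\) and \(a_{11}\) on this data. Since the input is already sorted, each runs in \(O(n)\) time.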