COMP3010 - 12.0 - Approximation Algorithm

COMP3010: Algorithm Theory and Design

Daniel Sutantyo, Department of Computing, Macquarie University

12.0 - Approximation Algorithms

Where are we?

P vs NP
- NP-hard
- NP-complete (the one that matters the most)
Reduction
- polynomial time reduction
- we reduce a known NP-complete problem to a new problem to show that the new problem is NP-hard (or NP-complete, if it is also in NP)
  - A \(\le_p\) B
  - A is an NP-complete problem
  - B is the problem we want to show to be NP-hard

Remember that we are not discussing P vs NP because we want to prove that P = NP or that P \(\ne\) NP
Our main goal is to get you to recognise hard (i.e. NP-hard) problems so that you do not spend time trying to find an efficient (i.e. polynomial-time) algorithm for them

We use reduction to show that a problem is hard, by reducing an NP-complete problem to it
- so now at least you can tell your boss that the problem is hard
- but now what?

Generally, you have three options:
1. Give up
  - as in, use the exponential time algorithm to solve it (maybe the input size is small enough)
  - use branch-and-bound and/or dynamic programming
  - how often do you have the worst-case anyway?
2. Consider only special cases
  - e.g. 2-SAT vs 3-SAT, directed acyclic graph, Euler vs Hamiltonian cycle
3. Use an approximation algorithm

Approximation Algorithms

Suppose that you are working on an optimisation problem where each solution has a positive numerical value (its cost or its benefit)
- a lot of optimisation problems are either maximisation or minimisation problems
  - travelling salesman: find the cheapest route
  - knapsack: find the maximum value of items you can carry
An approximation algorithm is an algorithm that may produce a solution that is suboptimal
- can it produce the optimal solution?
  - yes, sometimes

Approximation Ratio

How do we know if an approximation algorithm is good or bad?
For any input of size \(n\):
- let \(C\) be the value of the solution produced by the approximation algorithm
- let \(C^*\) be the value of an optimal solution
The approximation ratio \(\rho(n)\) of an approximation algorithm is an upper bound on the ratio between \(C\) and \(C^*\)
- if our problem is a maximisation problem, then \(0 < C \le C^*\), and
  - \(C^*/C \le \rho(n)\)
- if our problem is a minimisation problem, then \(0 < C^* \le C\), and
  - \(C/C^* \le \rho(n)\)

For maximisation problems, \(C^*/C \le \rho(n)\)
- the optimal solution is going to be at most \(\rho(n)\) times the solution of the approximation algorithm
For minimisation problems, \(C/C^* \le \rho(n)\)
- the solution produced by the approximation algorithm is going to be at most \(\rho(n)\) times the optimal solution
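
Both cases can be folded into a single quantity, \(\max(C/C^*, C^*/C)\), which is always at least 1. The following is a minimal sketch of that calculation; the function names `approximation_ratio` and `satisfies_guarantee` and the sample numbers are made up for illustration and do not come from the unit material.

```python
def approximation_ratio(c: float, c_star: float) -> float:
    """Return max(C/C*, C*/C) for one observed pair of solution values.

    Works for both maximisation and minimisation problems, because
    whichever ratio is >= 1 is the one the definition bounds.
    """
    if c <= 0 or c_star <= 0:
        raise ValueError("solution values must be positive")
    return max(c / c_star, c_star / c)


def satisfies_guarantee(c: float, c_star: float, rho: float) -> bool:
    """Check whether one observed pair (C, C*) respects the bound rho."""
    return approximation_ratio(c, c_star) <= rho


# Minimisation (hypothetical numbers): optimal cost 7, approximate cost 12
print(approximation_ratio(12, 7))
# Maximisation (hypothetical numbers): optimal value 40, approximate value 25
print(approximation_ratio(25, 40))
```

Note that a single input can only refute a claimed ratio, never prove it: the guarantee has to hold for every input, which is why it needs a proof.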

You can think of \(\rho(n)\) as a performance guarantee, that is, we guarantee that the result produced by our approximation algorithm is not going to be worse than a factor of \(\rho(n)\) compared to the optimal solution
We say that our approximation algorithm is a \(\rho(n)\)-approximation algorithm
For example, an \(n\)-approximation algorithm means that the result of our approximation algorithm is not going to be worse than \(n\) times the optimal solution (for a minimisation problem)
- so if \(n =\) 100, and the optimal solution is 7, our answer is not going to be worse than 700
  - that is actually pretty horrific!

In this unit, we will consider only approximation algorithms that have a constant \(\rho(n)\) and run in polynomial time
- e.g. a 2-approximation algorithm means that no matter what \(n\) is, our solution will be
  - no more than twice the optimal solution (for minimisation problems) or
  - no less than half the optimal solution (for maximisation problems)
- in this case, we can drop the \(n\), and say it is a \(\rho\)-approximation algorithm
What do you think a 1-approximation algorithm is?

Approximation Algorithms vs Heuristics vs Probabilistic

How are approximation algorithms different from heuristics and probabilistic algorithms?
- heuristics: gut-feeling, intuition
  - "I cannot prove that this works, but somehow it does most of the time"
- all three are similar in that they trade off accuracy for performance, but
  - probabilistic algorithms: there is a random element in the algorithm
  - heuristics: no performance guarantee

Approximation Schemes

CLRS also mentions polynomial-time approximation schemes, where in addition to the input to the problem, we also take a constant \(\epsilon > 0\)
- for any fixed \(\epsilon\), the scheme is a \((1+\epsilon)\)-approximation algorithm that runs in time polynomial in \(n\) (the size of the input)
- for example, the running time can be \(O(n^{2/\epsilon})\)
  - so as \(\epsilon\) gets smaller (a more precise answer), the running time gets worse
  - you can think of it as a customisable approximation algorithm, as in, we get to choose how good an approximation we get
You don't have to worry about this
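
To get a feel for the trade-off, here is a small sketch that evaluates the example bound \(n^{2/\epsilon}\) for a few values of \(\epsilon\). The function name `ptas_step_estimate` and the choice \(n = 100\) are assumptions made for illustration; constant factors are ignored, so the numbers only show the growth trend.

```python
def ptas_step_estimate(n: int, epsilon: float) -> float:
    """Evaluate n^(2/epsilon), the example running-time bound, ignoring constants."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return n ** (2 / epsilon)


# Smaller epsilon means a tighter ratio (1 + epsilon) but a larger exponent.
for eps in (2.0, 1.0, 0.5, 0.25):
    steps = ptas_step_estimate(100, eps)
    print(f"epsilon={eps}: ratio <= {1 + eps}, exponent {2 / eps:g}, about {steps:.3g} steps")
```

For any fixed \(\epsilon\) the exponent is a constant, so the bound is still polynomial in \(n\), even though it grows very quickly as \(\epsilon\) shrinks.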

What do you need to know?

Approximation algorithm:
- runs in polynomial time
- has a performance guarantee (the approximation ratio \(\rho\))
- the topic is hard because every approximation algorithm is different, depending on the problem
  - what I expect from you is to be able to understand how an approximation algorithm works, and then work out its approximation ratio
  - showing the approximation ratio is the hard part of this topic
- the topic is easy because, well, the approximation algorithms we're going to see are simple

Example: Minimum Vertex Cover

Remember that the decision version is NP-complete and the optimisation version is NP-hard

Here is a 2-approximation algorithm, given a graph \(G\langle V,E \rangle\):
- let \(E^\prime = E\), \(C = \{\}\)
- while \(E^\prime\) is nonempty:
  - pick an arbitrary edge \((u,v)\) from \(E^\prime\) and add \(u\) and \(v\) to \(C\)
  - remove every edge in \(E^\prime\) that is incident on either \(u\) or \(v\) (this includes \((u,v)\) itself)
- return \(C\)
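
The steps above can be sketched directly in Python. This is a minimal illustration, not a reference implementation: the function name `approx_vertex_cover`, the edge-list representation, the rule "pick the first remaining edge", and the small path graph used as input are all assumptions made for this sketch.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def approx_vertex_cover(edges: Iterable[Edge]) -> Set[Hashable]:
    """Follow the listed steps literally on a graph given as an edge list."""
    remaining = list(edges)  # E', the edges not yet covered
    cover = set()            # C, the vertex cover being built
    while remaining:
        # Any edge may be picked; taking the first one is an arbitrary choice.
        u, v = remaining[0]
        cover.update((u, v))
        # Drop every edge incident on u or v, including (u, v) itself.
        remaining = [
            (x, y) for (x, y) in remaining
            if x not in (u, v) and y not in (u, v)
        ]
    return cover


# Hypothetical input: a path 1 - 2 - 3 - 4 - 5
path = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(approx_vertex_cover(path))
```

On this path the sketch picks the edges (1, 2) and (3, 4), so it returns four vertices, while the two vertices 2 and 4 already cover every edge.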

For the example graph (figure not reproduced here), the algorithm returns \(C = \{b,c,d,e,f,g\}\) (6 vertices)

An optimal vertex cover for the same graph is \(C^* = \{b,e,d\}\) (3 vertices)

It is a very simple algorithm, but it comes with a performance guarantee, and proving that guarantee is the hard part of this topic
We can show that this approximation algorithm is a 2-approximation algorithm, meaning that the solution it produces will not be worse than 2 times the optimal solution (since it is a minimisation problem)

Proof:
- does it run in polynomial time?
  - there are at most \(E\) iterations, since each iteration removes at least one edge from \(E^\prime\)
  - each iteration, we pick one edge and add its two vertices to the set \(C\); since we can add at most \(V\) vertices in total, this is \(O(V)\) over the whole run
  - each iteration, we have to remove the edges incident on \(u\) and \(v\), which takes at most \(O(E)\) if we scan the whole edge list
  - complexity is \(O(E^2 + V)\)
    - or \(O(E\log E + V)\) if you use some sort of priority queue
    - or \(O(E + V)\) if you use adjacency lists
    - the point is, the running time is polynomial
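
One way to avoid rescanning the edge list on every iteration is to never delete edges explicitly: scan each edge once and skip it if one of its endpoints is already in \(C\), since such an edge would already have been removed from \(E^\prime\). The sketch below assumes an edge-list input and relies on Python's `set` for expected constant-time membership tests; the name `approx_vertex_cover_linear` is made up for illustration.

```python
def approx_vertex_cover_linear(edges):
    """Single pass over the edges; expected time linear in the number of edges."""
    cover = set()
    for u, v in edges:
        # If u or v is already in the cover, this edge is covered,
        # which is equivalent to it having been removed from E'.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


# Hypothetical inputs: a path and a star centred on vertex 0
print(approx_vertex_cover_linear([(1, 2), (2, 3), (3, 4), (4, 5)]))
print(approx_vertex_cover_linear([(0, 1), (0, 2), (0, 3)]))
```

The edges that pass the test are exactly the edges the original loop would pick if it always took the earliest remaining edge, so the output is a cover produced by the same algorithm.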

- does it return a correct answer?
  - yes, we remove an edge from consideration only if it is already covered by the vertices in \(C\), so at the termination of the algorithm, since \(E^\prime\) is empty, we have covered every single edge with the vertices in \(C\)

- what is the approximation ratio?
  - claim that it is a 2-approximation algorithm
  - let \(A\) be the set of edges that we pick:
    - if we want to cover the edges in this set, how many vertices do we need?
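    - no two edges in \(A\) share a vertex (once we pick \((u,v)\), every other edge incident on \(u\) or \(v\) is removed), so a vertex cover for the edges in \(A\) must have at least \(|A|\) vertices, one for each edge
    - any vertex cover of the graph, including the optimal one, must cover the edges in \(A\), so \(|A|\) is a lower bound for the size of the minimum vertex cover, that is \(|C^*| \ge |A|\)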

- so far we have \(|C^*| \ge |A|\)
- now what can you say about \(|C|\)?
  - when we terminate the algorithm, how many vertices do we have in \(C\)?
    - each edge in \(A\) contributes its two endpoints, and no two edges in \(A\) share an endpoint, so \(C\) will have exactly \(2|A|\) vertices
- put the two together
  - \[\begin{aligned}|C| &= 2|A|\\ &\le 2|C^*| \end{aligned}\]
  - so, at worst, we will have twice as many vertices as the optimal answer
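
The two facts used in the proof, \(|C| = 2|A|\) and \(|A| \le |C^*|\), can be checked empirically on a small graph by computing an optimal cover with brute force. This is only a sanity check under stated assumptions, not a proof: the seven-vertex edge list below is a hypothetical graph chosen for illustration (the figure from the slides is not reproduced), and the helper names are made up. The brute-force search takes exponential time, so it is usable on tiny inputs only.

```python
from itertools import combinations


def approx_vertex_cover(edges):
    """Return (C, A): the cover and the list of edges the algorithm picked."""
    cover, picked = set(), []
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            picked.append((u, v))
            cover.update((u, v))
    return cover, picked


def is_cover(vertices, edges):
    """True if every edge has at least one endpoint in `vertices`."""
    return all(u in vertices or v in vertices for u, v in edges)


def optimal_vertex_cover(edges):
    """Brute force: try all vertex subsets in order of increasing size."""
    vertices = sorted({x for edge in edges for x in edge})
    for k in range(len(vertices) + 1):
        for candidate in combinations(vertices, k):
            if is_cover(set(candidate), edges):
                return set(candidate)
    return set(vertices)


# Hypothetical graph, listed in the order the edges will be considered.
edges = [("b", "c"), ("e", "f"), ("d", "g"), ("a", "b"),
         ("c", "d"), ("c", "e"), ("d", "e"), ("d", "f")]

cover, picked = approx_vertex_cover(edges)
best = optimal_vertex_cover(edges)
print("A  =", picked)
print("C  =", sorted(cover))
print("C* =", sorted(best))
print("|C| = 2|A|:", len(cover) == 2 * len(picked))
print("|A| <= |C*|:", len(picked) <= len(best))
print("|C| <= 2|C*|:", len(cover) <= 2 * len(best))
```

On this particular graph the bound is tight: the sketch returns twice as many vertices as the optimum.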

How To Prove It

This proof technique is the standard method to show that an algorithm is a \(\rho\)-approximation algorithm
The idea is to tie a bound on the optimal solution to the result of the approximation algorithm: a lower bound for a minimisation problem, an upper bound for a maximisation problem
- e.g.
  - the optimal solution must use at least \(k\) of something (minimisation) or at most \(k\) of something (maximisation)
    - \(C^* \ge k\) or \(C^* \le k\)
  - the approximation algorithm's solution uses exactly \(2k\) (minimisation) or \(k/2\) (maximisation) of the same resource
    - \(C = 2k \le 2C^*\) or \(C = k/2 \ge C^*/2 \rightarrow 2C \ge C^*\)

Notice something important here:
- you don't need to know what the optimal solution is, just a lower bound for it (or an upper bound, for a maximisation problem)

Summary

\(\rho\)-approximation algorithm
What is going to be assessed:
- you will be given an NP-hard problem and an approximation algorithm for the problem
- you need to show that the approximation algorithm is a \(\rho\)-approximation algorithm using the steps described in the example
We are going to do a bit more of this in the workshop, but really, that is all we're going to do in this topic. The concept is easy; the execution is the harder part.