Catastrophic Cancellation

Numerical Methods

David Mayerich

Scalable Tissue Imaging and Modeling (STIM) Laboratory

Department of Electrical and Computer Engineering

Cullen College of Engineering

University of Houston

David Mayerich

STIM Laboratory, University of Houston

Loss of Precision

Losing Digits to Cancellation

Loss of Precision Theorem

David Mayerich

STIM Laboratory, University of Houston

Catastrophic Cancellation

  • Consider a floating point representation with a 5-digit mantissa:

David Mayerich

STIM Laboratory, University of Houston

n\ .\ n\ n\ n\ n\ \times 10^c
  • Apply a simple addition and subtraction:

3.2342 \times 10^3 + 2.51 \times 10^{-1} - 3.2343 \times 10^3
3234.451
0.151
0.2
3234.2
+
0.251
3234.451
3234.3
-
3234.5
3234.3
-

round

chop

Catastrophic Cancellation

  • This subtraction effectively reduced our numerical precision:

David Mayerich

STIM Laboratory, University of Houston

x=0.151 \quad \rightarrow \quad x=0.2
  • Each operation maintains the appropriate precision

  • However, we ended up with far less precision than we expect:

    • The result \(x=0.2\) has one digit of precision

    • The correct value \(x=0.151\) requires two digits of precision

    • We have \(5\) digits of precision, expect \(3\), but only get \(2\)

  • This is known as catastrophic cancellation

creating a relative error of

\epsilon_r = \frac{|0.2-0.151|}{|0.151|}\approx 32.5\%

Loss of Precision Theorem

David Mayerich

STIM Laboratory, University of Houston

Let \(x\) and \(y\) be normalized floating-point numbers such that \(x > y > 0\). For a base \(b\) value, we can find two adjacent natural numbers \(p,q \in \mathbb{N}\) such that:

b^{-p} \leq 1 - \frac{y}{x} \leq b^{-q} \quad \text{where} \quad q+1 = p

At most \(p\) and at least \(q\) significant digits are lost in the subtraction \(x-y\).

  • Calculate the number of digits lost when calculating \(x-y\) for:

x=37.593621 \quad \text{and} \quad y=37.584216

Lost Digits

David Mayerich

STIM Laboratory, University of Houston

You will lose between \(\left\lfloor \frac{-log\left(1 - \frac{y}{x} \right)}{\log b} \right\rfloor\) and \(\left\lceil \frac{-log\left(1 - \frac{y}{x} \right)}{\log b} \right\rceil\)

b^{-p} \leq 1 - \frac{y}{x} \leq b^{-q}

Loss of Precision Theorem

\log \left(b^{-p}\right) \leq \log \left(1 - \frac{y}{x}\right) \leq \log \left(b^{-q}\right)

Calculate the logarithm of all terms

-p \log \left(b\right) \leq \log \left(1 - \frac{y}{x}\right) \leq -q\log \left(b\right)

Separate exponents

p \log \left(b\right) \geq -\log \left(1 - \frac{y}{x}\right) \geq q\log \left(b\right)

Multiply by \(-1\) (and reverse inequalities)

p \leq \frac{-\log \left(1 - \frac{y}{x}\right)}{\log \left(b\right)} \leq q

Divide by \(\log(b)\)

Lost Digits

David Mayerich

STIM Laboratory, University of Houston

p \leq \frac{-\log \left(1 - \frac{y}{x}\right)}{\log \left(10\right)} \leq q
p \leq \frac{-\log \left(1 - \frac{37.584216}{37.593621}\right)}{\log \left(10\right)} \leq q
p \leq \frac{8.2933482}{\log \left(10\right)} \leq q
p \leq 3.6017554 \leq q
p = \left\lfloor 3.6017554 \right\rfloor = 3
q = \left\lceil 3.6017554 \right\rceil = 4
0\ 0\ .\ 0\ 0\ 9\ 4\ 0\ 5
  • Calculate the number of digits lost when calculating \(x-y\) for:

x=37.593621 \quad \text{and} \quad y=37.584216
3\ 7\ .\ 5\ 9\ 3\ 6\ 2\ 1
-\ 3\ 7\ .\ 5\ 8\ 4\ 2\ 1\ 6

Lost Bits

David Mayerich

STIM Laboratory, University of Houston

p = \left\lfloor 11.964772 \right\rfloor = 11
q = \left\lceil 11.964772 \right\rceil = 12
000000.00000010011010000110
x = 100101.10010111111101111001\\ y=100101.10010101100011110011
  • Calculate the number of bits lost when calculating \(x-y\) for:

x=37.593621 \quad \text{and} \quad y=37.584216
p \leq 11.964772 \leq q
p \leq \frac{-\log \left(1 - \frac{y}{x}\right)}{\log \left(2\right)} \leq q
p \leq \frac{-\log \left(1 - \frac{37.584216}{37.593621}\right)}{\log \left(2\right)} \leq q
p \leq \frac{8.2933482}{\log \left(2\right)} \leq q

Discussion: Precision After Subtraction

  • How many bits of precision are lost when calculating \(\alpha - \sin\alpha\) if \(\alpha = \frac{1}{64}\)?

David Mayerich

STIM Laboratory, University of Houston

p = \left\lfloor 14.58 \right\rfloor = 14
q = \left\lceil 14.58 \right\rceil = 15
p \leq \frac{-\log \left(1 - \frac{\sin\left(\frac{1}{64}\right)}{\frac{1}{64}}\right)}{\log \left(2\right)} \leq q
p \leq 14.58 \leq q
  • What precision can we expect if this calculation is performed on float32 registers?

23-15=8
standard C/C++ m bits x bits bias
binary16 single 10 5 15
binary32 float 23 8 127
binary64 double 53 11 1023
binary128 N/A 113 15 16383
binary256 N/A 19 237 262143

IEEE 754 Standard

Recovering Precision

Reformulation

Verifying Reformulations

The Quadratic Equation

David Mayerich

STIM Laboratory, University of Houston

Summary of Arithmetic Operations

  • Given \(d\) digits of precision:

    • multiplication: preserves \(d\) digits because the exponent and mantissa can be calculated independently

    • addition: exponents must be equal, so significant digits may be chopped

  • Evaluate \(x=1 - \sin 1.6\) using three digits of precision:

David Mayerich

STIM Laboratory, University of Houston

y = 1.00- \sin 1.61
= 0.00100
= 0.0007684
\epsilon_r=\frac{|0.000232|}{|0.000768|} = 0.30 \quad (30\%)

expected result (with 3 digits)

y = 1.00 - 0.9992316
y = 1.00 - 0.999
y = 1 - 0.9992316
\epsilon_a=|0.00100 - .000768| = .000232

Reformulation

  • Precision loss appears after subtraction (but the loss of data occurs earlier)

  • Reformulate expressions to remove subtractions

David Mayerich

STIM Laboratory, University of Houston

y = 1 - \sin x
y = \left( 1 - \sin x \right) \left( \frac{1+ \sin x}{1+ \sin x} \right)
y = \frac{1 - \sin^2 x}{1 + \sin x}
y = \frac{\cos^2 1.61}{1 + \sin 1.61}
y = \frac{\left(-0.0391936 \right)^2}{1.00 + 0.9992316}
y = \frac{0.00153664}{1.999}
y = .00077
\epsilon_r=\frac{|.000768-.00077|}{|0.000768|} = 0.0026 \quad (0.26\%)
\sin^2 x + \cos^2 x = 1
y = \frac{\cos^2 x}{1 + \sin x}
y = \frac{\left(-0.0392 \right)^2}{1.00 + 0.999}
y = \frac{0.00154}{2.00}

Reformulation (Part Deux)

  • When can catastrophic cancellation occur in the expression \(\sqrt{x+1}-\sqrt{x}\)?

  • When \(x\) is large -- fix it for \(x=10\):

David Mayerich

STIM Laboratory, University of Houston

y = \left(\sqrt{x+1} - \sqrt{x}\right) \left( \frac{\sqrt{x+1} + \sqrt{x}}{\sqrt{x+1} + \sqrt{x}} \right)
y = \left( \frac{x+1-x}{\sqrt{x+1} + \sqrt{x}} \right)
y = \frac{1}{\sqrt{x+1} + \sqrt{x}}

Original Expression:

y = \sqrt{11} - \sqrt{10}
y = 3.31662 - 3.16228
y = 3.32 - 3.16 = 0.160

Reformulation:

y = \frac{1.00}{3.32 + 3.16}
y = \frac{1.00}{6.48} = 1.54321

Expected Result:

y = 0.154347 = 0.154
\epsilon_r = \frac{|0.154 - 0.160}{|0.154|} \approx 3.9\%
\epsilon_r = 0\%

Quadratic Equation

Loss of Precision

Reformulation

Implementation

David Mayerich

STIM Laboratory, University of Houston

Quadratic Equation

  • Find the roots of a polynomial \(ax^2+bx+c=0\)

  • Quadratic Equation:

David Mayerich

STIM Laboratory, University of Houston

x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}
  • Two cases cause computational problems:

    1. If \(a=0\), then the equation is undefined (the polynomial isn't quadratic)

    2. If \(4ac \ll b^2\) then catastrophic cancellation can occur in the numerator

Two cases for cancellation:

  • if \(b>0\) then \(-b+\sqrt{b^2-4ac}\) can result in cancellation

  • if \(b<0\), then \(b-\sqrt{b^2-4ac}\) can result in cancellation

Testing the Quadratic Equation

  • Consider the roots of the polynomial \(x^2 + 182^3x-1\) (using actual floating point):

David Mayerich

STIM Laboratory, University of Houston

x_1 = \frac{-b - \sqrt{b^2-4ac}}{2a}
x_2 = \frac{-b + \sqrt{b^2-4ac}}{2a}
x_1 = \frac{-182^3 - \sqrt{182^6-4}}{2}
x_2 = \frac{-182^3 + \sqrt{182^6-4}}{2}
  • Performing this calculation using IEEE 754 floating point, we get:

\epsilon_r \approx 9.5\%

binary64

x_1 = -6.0286\times 10^8
x_1 = -1.657865 \times 10^{-7}

binary32

x_1 = -6.0286\times 10^8
x_1 = -1.5 \times 10^{-7}

Reformulating the Quadratic Equation

  • When \(b \gg 4ac\) there is one very small root

  • Assume that \(b>0\) and reformulate the "small" root:

David Mayerich

STIM Laboratory, University of Houston

x_2 = \frac{-b + \sqrt{b^2-4ac}}{2a}
x_2 = \frac{-b + \sqrt{b^2-4ac}}{2a} \cdot \frac{-b - \sqrt{b^2-4ac}} {-b - \sqrt{b^2-4ac}}
x_2 = \frac{b^2 - (b^2-4ac)} {2a \left( -b-\sqrt{b^2-4ac} \right)}
x_2 = \frac{-2c} { b+\sqrt{b^2-4ac}}

Simplifying the Reformulation

  • For the case where \(b > 0\) we have two roots:

David Mayerich

STIM Laboratory, University of Houston

-2ax_1 = b + \sqrt{b^2-4ac}
x_1 = \frac{-b - \sqrt{b^2-4ac}}{2a}

(large root)

x_2 = \frac{-2c} { b+\sqrt{b^2-4ac}}

(small root)

x_2 = \frac{-2c} { -2ax_1}
x_1 = \frac{-b - \sqrt{b^2-4ac}}{2a}
x_2 = \frac{c} {ax_1}

Implementing the Quadratic Equation

  • Consider the reformulation of both cases for \(b\):

David Mayerich

STIM Laboratory, University of Houston

x_1 = \frac{-b - \sqrt{b^2-4ac}}{2a}
x_2 = \frac{c} {ax_1}
b>0
b\leq0
x_1 = \frac{-b + \sqrt{b^2-4ac}}{2a}
x_2 = \frac{c} {ax_1}
q= -\frac{1}{2}\left[b+\text{sgn}(b)\sqrt{b^2 - 4ac} \right]
\text{where}\quad \text{sgn}(x)= \begin{cases} -1 & \text{if}\ x<0\\ 1 & \text{if}\ x \geq 0 \end{cases}
x_1 = \frac{q}{a}
x_2 = \frac{c}{q}

Series Approximations

Preserving Precision

Mean Value Theorem

David Mayerich

STIM Laboratory, University of Houston

Approximating Special Functions

  • Identify and fix the catastrophic cancellation in the following expression at \(x=0.123\):

David Mayerich

STIM Laboratory, University of Houston

y = \text{sinc}(x) - \cos(x)

Calculate expansions at \(c=0\):

\sin(x) \approx x - \frac{x^3}{6}
\text{where}\ \ \text{sinc}(x) =\frac{\sin(x)}{x}
\cos(x) \approx 1 - \frac{x^2}{2}
y=\frac{x - \frac{x^3}{6}}{x} - 1 + \frac{x^2}{2}
y=1 - \frac{x^2}{6} - 1 + \frac{x^2}{2}
y=-\frac{x^2}{6} + \frac{3x^2}{6}
y=\frac{x^2}{3}

Reformulate:

Testing the Reformulation

  • Evaluate \(y=\text{sinc}(x) - \cos(x)\) at \(x=0.123\):

David Mayerich

STIM Laboratory, University of Houston

Original Expression:

Reformulation:

Expected Result:

y = 0.00503537 = 0.00504
y = \frac{0.12269}{0.123} - 0.992445
y = \frac{0.123}{0.123} - 0.992
y = 1.00 - 0.992 = 0.008
\frac{\sin(0.123)}{0.123} - \cos(0.123)
y=\frac{(0.123)^2}{3}
y=\frac{0.015129}{3}
= \frac{0.0151}{3}
y =.00503333
=.00503
\epsilon_r = \frac{|0.008-0.00504|}{|0.00504|} \approx 58.7\%
\epsilon_r = \frac{|0.00503-0.00504|}{|0.00504|} \approx 0.2\%

Improve Precision Loss (Part II)

  • Mitigate precision loss in the expression \(y=e^x - e^{-2x}\)

David Mayerich

STIM Laboratory, University of Houston

e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}\cdots
y=\left(1 + x + \frac{x^2}{2} + \frac{x^3}{6}\cdots \right)- \left( 1 -2x + \frac{4x^2}{2} - \frac{8x^3}{6}\cdots \right)
y=3x - \frac{3x^2}{2} + \frac{9x^3}{6} \cdots

Mean Value Theorem

If \(f\) is a continuous function on the closed interval \([a,b]\) and possesses a derivative at each point of the open interval \((a,b)\), then

David Mayerich

STIM Laboratory, University of Houston

f(b)-f(a) = (b-a)f'(\theta)

for some \(\theta\) in \((a,b)\). A basic approximation is \(\theta=\frac{a+b}{2}\).

  • Mitigate catastrophic cancellation in the expression \(y=e^x - e^{x-\epsilon}\):

f(z)= e^z
b=x
a=x-\epsilon
f'(z)=e^z
\theta=\frac{2x-\epsilon}{2}
(b-a)f'(\theta)
(x-x+\epsilon)e^{\frac{2x-\epsilon}{2}}
\epsilon e^{\frac{2x-\epsilon}{2}}

Test precision for \(x=2\) and \(\epsilon=0.01\):

e^2-e^{1.99}
7.39-7.32=0.07
\epsilon e^{\frac{2x-\epsilon}{2}}
0.01 e^{\frac{3.99}{2}}
0.01 e^{2} = 0.0739

Original Expression:

Reformulation:

Expected Results:

y=0.0735

Mean Value Theorem Examples

  • Reformulate \(y=\ln(x+\epsilon)-\ln x\)

David Mayerich

STIM Laboratory, University of Houston

  • Reformulate \(y=e^\epsilon - 1\)

f(b)-f(a) = (b-a)f'(\theta)
f(z)=\ln z
a=x
b=x+\epsilon
f'(z)=\frac{1}{z}
\theta = \frac{2x+\epsilon}{2}
(b-a)f'(\theta)
(x+\epsilon-x)\frac{1}{\frac{2x+\epsilon}{2}}
y=\frac{2\epsilon}{2x+\epsilon}
f(z)=e^z
a=0
b=\epsilon
f'(z)=e^z
\theta = \frac{\epsilon}{2}
(\epsilon-0)e^{\frac{\epsilon}{2}}
\epsilon e^{\frac{\epsilon}{2}}

C.2 Catastrophic Cancellation

By STIM Laboratory

C.2 Catastrophic Cancellation

  • 188