CPSC 355: Tutorial 17
Floating Point Operations
PhD Student
Fall 2017
Floating Point Arithmetic
ARMv8 has 32 floating point registers (sort of, they're actually 128bit, but ignore that for now)
- d0-d31 are 64 bit double precision registers
- s0-s31 are the lower 32 bits of these registers and are used for single precision FP operations
- d8-d15 are callee saved registers
- d0-d7 and d16-d31 may be overwritten by subroutines.
- d0-d7 are used to pass floating point arguments into a function
Floating Point Arithmetic
An example function...
double approximate(double x, double y, int iterations) {
// d0 contains x
// d1 contains y
// w0 contains iterations
// do some calculations
// return value goes in d0
}
double approximate(int iterations, double x, double y) {
// d0 contains x
// d1 contains y
// w0 contains iterations
// do some calculations
// return value goes in d0
}
Floating Point Arithmetic
An example function...
int approximate(double x, double y, int iterations) {
// d0 contains x
// d1 contains y
// w0 contains iterations
// do some calculations
// return value goes in w0
}
int approximate(int iterations, double x, double y) {
// d0 contains x
// d1 contains y
// w0 contains iterations
// do some calculations
// return value goes in w0
}
Floating Point Arithmetic
- fmov S/Dd, S/Dn: move S/Dn into S/Dd
- fmov S/Dd, #fpimm: move immediate value into S/Dd (pretty limited must be of the form \(\pm n / 16 \times 2^r\) where \(n \in [16, 31]\) and \(r \in [-3, 4]\))
- fcvt Sd, Dn: Convert Dn to single prec. Sd
- fcvt Dd, Sn: Convert Sn to double prec. Dd
- fadd, fsub S/Dd, S/Dn, S/Dm: Add or subtract S/Dn, S/Dm, place result in S/Dd
- fmul S/Dd, S/Dn, S/Dm: S/Dd = S/Dn * S/Dm
- fdiv S/Dd, S/Dn, S/Dm: S/Dd = S/Dn / S/Dm
- fmadd S/Dd, S/Dn, S/Dm, S/Da: S/Dd = S/Da + S/Dn * S/Dm
Floating Point Arithmetic
- fabs S/Dd, S/Dn: S/Dd = abs(S/Dn)
- fneg S/Dd, S/Dn: S/Dd = -(S/Dn)
- fcmp S/Dn, S/Dm: Compares two registers, stores result in condition flags
- fcmp S/Dn, #fpimm: Compares register and immediate value, stores result in condition flags
An example...
Taylor Series
Recall the taylor series of an infinitely differentiable function (i.e. holomorphic or analytic function)
f(x) = \displaystyle\sum^\infty_{n=0} \frac{f^{(n)}(a)}{n!}(x-a)^n
Or its Maclaurin expansion
f(x) = \displaystyle\sum^\infty_{n=0} \frac{f^{(n)}(0)}{n!}x^n
For \(e^x\) we have
e^x = \displaystyle\sum^\infty_{n=0} \frac{x^n }{n!}
Taylor Series
We can stop after some amount of terms in the sum
e^x \approx \displaystyle\sum^j_{n=0} \frac{x^n }{n!}
In a machine, eventually we start to lose precision, so we stop after the terms stop decreasing the approximation beyond a threshold
Taylor Series
#include <math.h>
#include <stdio.h>
double taylor_exp(double x) {
double n = 0;
double nfact = 1;
double exp, exp_old;
double xx = 1;
exp = 1.0;
do {
n += 1.0;
nfact *= n;
xx *= x;
exp_old = exp;
exp += xx/nfact;
} while(fabs(exp_old - exp) > 1e-10);
return exp;
}
int main(int argc, char *argv[]) {
printf("%f\n", taylor_exp(0.0));
printf("%f\n", taylor_exp(1.0));
printf("%f\n", taylor_exp(2.0));
}
Taylor Series
.data
fmt1: .string "%f\n"
threshold: .double 0r1e-10 // A double prec. constant
.text
.balign 4
.global main
main:
stp x29, x30, [sp, -16]!
mov x29, sp
fmov d0, xzr
bl taylor_exp
adrp x0, fmt1
add x0, x0, :lo12:fmt1
bl printf
fmov d0, 1.0
bl taylor_exp
adrp x0, fmt1
add x0, x0, :lo12:fmt1
bl printf
fmov d0,2.0
bl taylor_exp
adrp x0, fmt1
add x0, x0, :lo12:fmt1
bl printf
ldp x29, x30, [sp], 16
ret
Taylor Series
taylor_exp: // d0 is the first argument
n_s = 16
nfact_s = 24
exp_s = 32
exp_old_s = 40
xx_s = 48
stp x29, x30, [sp, -64]! // Standard stack setup/teardown
mov x29, sp
fmov d1, xzr // Convert 0 to a double
str d1, [x29, n_s] // n = 0.0
fmov d1, 1.0
str d1, [x29, nfact_s] // nfact = 1.0
str d1, [x29, xx_s] // xx = 1.0
str d1, [x29, exp_s] // exp = 1.0
exp_do_top:
ldr d1, [x29, n_s] // loading and storing to addrs works just as it does for x
fmov d2, 1.0
fadd d1, d1, d2 // d1 = d1 + 1
str d1, [x29, n_s] // n = d1
// Build up the factorial over each iteration of the loop
ldr d2, [x29, nfact_s]
fmul d2, d2, d1
str d2, [x29, nfact_s] // nfact = nfact * n = n * (n-1)!
ldr d3, [x29, xx_s]
fmul d3, d0, d3
str d3, [x29, xx_s] // xx = x*xx = x^n
Taylor Series
ldr d4, [x29, exp_s]
str d4, [x29, exp_old_s] // exp_old = exp
fdiv d2, d3, d2 // d2 = x^n / nfact
fadd d5, d4, d2 // d5 = exp + x^n/nfact
str d5, [x29, exp_s] // exp = d5
ldr d1, [x29, exp_s]
ldr d2, [x29, exp_old_s]
fsub d1, d1, d2 // d1 = exp - exp_old
fabs d1, d1 // d1 = fabs(exp - exp_old)
adrp x0, threshold // Load the addr of the threshold
add x0, x0, :lo12: threshold
ldr d2, [x0] // load threshold value into d2
fcmp d1, d2 // fabs(exp - exp_old) cmp threshold
b.gt exp_do_top // if it still contributes to the upper 10 digits, keep going
ldr d0, [x29, exp_s] // load ~exp(x) into d0 (return value)
ldp x29, x30, [sp], 64 // stack teardown
ret
Next Day
Work Day
CPSC 355: Tutorial 17
By Joshua Horacsek
CPSC 355: Tutorial 17
- 1,585