GAJIT

Symbolic Optimisation and JIT Compilation of Geometric Algebra in Python with GAALOP and Numba

 

Hugo Hadfield,  Dietmar Hildenbrand, Alex Arsenovic

2019

hh409.user.srcf.net

Existing GA Implementations

GARAMON

Versor

Gaigen

GAALOP

Clifford (Python)

Matlab GA Toolbox

Ganja.js

 

And many others!

Low level implementations are fast to run but slow to write

 

 

High level implementations are fast to write, fun to work with, but typically slow to run

Python

  • Swifty becoming THE STANDARD for scientific computing
  • Clean and easy syntax
  • Huge number of third party libraries
  • Good interoperability with Julia, Fortran, C++ etc.
  • Free and open source

The Python GA ecosystem

  • GitHub group: https://github.com/pygae/
  • Numerical library: https://github.com/pygae/clifford
  • Symbolic library: https://github.com/pygae/galgebra
  • Visualisation: https://github.com/pygae/pyganja

 

Lots of pieces in place for easy collaboration and prototyping

 


# Import Conformal GA (4,1)
from clifford.g3c import *

# Import prebuilt tools for conformal GA
from clifford.tools.g3c import *

# Import pyganja for visualisation
from pyganja import *

object_array = [
random_conformal_point(),
random_point_pair(),
random_circle(),
random_line(),
random_sphere(),
random_plane() 
]

# The simplest way to use pyganja
draw(object_array, 
    static=False , 
    scale=0.1)
from clifford.tools.g3c import *
from clifford.tools.g3c.rotor_parameterisation import *
from pyganja import *

# Create two objects
X1 = random_circle()
X2 = random_circle()

# Find the rotor between them
R = rotor_between_objects(X1, X2)

# Take the log of the rotor
logR = general_logarithm(R)

# Blend from 0 to 1
rotor_list = [general_exp(0.1*alpha*logR) 
              for alpha in range(10)]

# Apply the interpolated rotors
object_list = [R_int*X1*~R_int 
               for R_int in rotor_list]

# The simplest way to use pyganja
draw(object_list , 
    static=False , 
    scale=0.1)

Building Real Applications

  • Real industrial and commerical applications require fast numerics

 

  • Need to exploit modern computer hardware inc. graphics cards and highly parallel CPUs

 

  • How can we make our GA Python code fast enough for industry?
def example_one(X, R):
    Y = 1 + (R * X * ~R)
    Ystar = Y * (e1 ^ e2 ^ e3 ^ einf ^ e0)
    F = Ystar | (e1 ^ e2)
    return F

Fast code for no effort

We would like to accelerate this, compile it perhaps, to avoid run-time type checking overhead. Ideally with no code changes:

@make_this_faster
def example_one(X, R):
    Y = 1 + (R * X * ~R)
    Ystar = Y * (e1 ^ e2 ^ e3 ^ einf ^ e0)
    F = Ystar | (e1 ^ e2)
    return F

Accelerating Python

  • Pypi - An implementation of the Python language with a built in JIT compiler. Does not intergrate well with C/C++ libraries

 

  • Cython - Accelerates Python code via addition of type information and C code generation. Have to modify source code significantly.

 

  • Numba - LLVM backed JIT compiler, designed to work specifically with Numpy. Requires no code modification, has a CUDA interface.

Numba JIT Compiler

  • Requires no source code modification
  • Automatic algorithm parallelisation
  • GPU support
  • Very fast
  • Does not support classes well yet

 

import numba
import random

@numba.njit
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

Time for 100,000,000
Pure Python: 76.67s
Numba JITed: 1.87s

41x faster

Numba Limitations

@numba.njit  # THIS DOES NOT WORK!
def example_one(X, R):
    Y = 1 + (R * X * ~R)
    Ystar = Y * (e1 ^ e2 ^ e3 ^ einf ^ e0)
    F = Ystar | (e1 ^ e2)
    return F

Have to hand optimise to get fast code

Class support is not yet good enough for rich multivector classes

@numba.njit  # Efficient but ugly
def example_one_faster_jit(X, R):
    Y = gp(gp(R, X), rev(R))
    Y[0] += 1
    Ystar = gp(Y, e12345_value)
    F = ip(Ystar, e12_value)
    return F

Embed GAALOPScript in Python

# MAYBE WE CAN DO SOMETHING LIKE THIS ?

@GAALOP 
def example_one(X, R):
    Y = 1 + (R * X * ~R);
    Ystar = Y * (e1 ^ e2 ^ e3 ^ einf ^ e0);
    F = Ystar.(e1 ^ e2);
    ?F;

GAALOPScript has a pretty similar syntax to Clifford Python, lets just use that!

Would also let us use existing GAALOPScript code in Python without having to rewrite the code :)

Communicating with GAALOP

  1. Compile GAALOP in CLI (Command Line Interface) mode
  2. Write Python code to analyse source code transfer GAALOPScript to GAALOP
  3. Receive C code from GAALOP and either wrap it via a ctypes interface or transliterate it back to Python

THE TRANSLITERATED CODE WE GET BACK FROM GAALOP IS JITTABLE WITH NUMBA!

An Example

  • In reality embedding GAALOPScript requires quite a bit of boilerplate code

 

  • 30us for pure python equivalent
  • 3.4us for this

 


from gajit import *
from clifford.tools.g3c import random_conformal_point, random_sphere

# The GAALOPScript function goes here
body = """
R = *(A ^ C ^ einf);
PP = R ^ (*S);
hasIntersection = PP.PP; 
"""

# Create a python function from that 
# GAALOPScript function
raytrace = Algorithm(
    inputs = ['S', 'A', 'C'],
    outputs = ['hasIntersection'],
    intermediates = ['R','PP'],
    body=body,
    function_name="raytracer",
    blade_mask_list = [layout.grade_mask(4),
                   layout.grade_mask(1),
                   layout.grade_mask(1)]
    )
    
# We can now call the algorithm 
# from python, passing in python 
# multivectors and getting a python 
# multivector back
S = random_sphere()
A = random_conformal_point()
C = random_conformal_point()
result = raytrace(S,A,C)

Python to GAALOPScript translation

@gajit([layout.grade_mask(4),layout.grade_mask(1),layout.grade_mask(1)])
def clifford_raytrace_gajit(S,A,C):
    PP = ((e12345 * (A ^ C ^ einf)) ^ (e12345 * S))
    hasIntersection = (PP | PP)
    return hasIntersection

GAALOPScript has a pretty similar syntax to Clifford Python, maybe we can just translate it automatically?

We can pass grade masks on the inputs to ensure only the minimum number of symbols are generated

It Works!

from clifford.g3c import *
from clifford.tools.g3c import *
from gajit import *


@gajit([layout.grade_mask(4),layout.grade_mask(1),layout.grade_mask(1)])
def clifford_raytrace_gajit(S,A,C):
    PP = ((e12345 * (A ^ C ^ einf)) ^ (e12345 * S))
    hasIntersection = (PP | PP)
    return hasIntersection

S = random_sphere()
A = random_conformal_point()
C = random_conformal_point()
hit = clifford_raytrace_gajit(S,A,C)

Not only does this allow us to use GAALOP as a precompiler for Python, it also makes Python a valid front end for GAALOP

Limitations

  • Richness of the GAALOPScript language
  • Nested functions
  • Canonical blade ordering differences
  • JIT time for large functions
  • Currently only CGA
  • Difficult installation

Galgebra

  • Python symbolic GA manipulation via Sympy
  • Very powerful and feature rich GA manipulation with a syntax very close to clifford
  • Gajit now also supports Galgebra as a symbolic backend, very easy to install, and with lots of potential for new features
from clifford.g3c import *
from clifford.tools.g3c import *
from gajit import *

@gajit([layout.grade_mask(4),layout.grade_mask(1),layout.grade_mask(1)],
        galgebra=True)
def clifford_raytrace_gajit(S,A,C):
    PP = ((e12345 * (A ^ C ^ einf)) ^ (e12345 * S))
    hasIntersection = (PP | PP)
    return hasIntersection

S = random_sphere()
A = random_conformal_point()
C = random_conformal_point()
hit = clifford_raytrace_gajit(S,A,C)
  • One keyword argument change
  • Generates smaller and faster code 
  • Would be interesting to play with its differential operators too...

Galgebra + Gajit

Gajit:

Geometric Algebra Just In Time Compiler

@gajit([layout.grade_mask(4),layout.grade_mask(1),layout.grade_mask(1)])
def clifford_raytrace_gajit(S,A,C):
    PP = ((e12345 * (A ^ C ^ einf)) ^ (e12345 * S))
    hasIntersection = (PP | PP)
    return hasIntersection

Requires no changes to source code. Generates Highly efficient implementations.

Can target multiple output languages/frameworks.