Extending Python with Go

By Andrea Stagi, CTO @ Lotrèk

Napoli @ PAN - 15/09/2018

🐍 + 🍕 = ❤️

The problem











S1 RC: Microservice that exports an API containing product images and pharmacies statistics.

S2 WW: The main website, which fetches the information exported by S1 RC.

S3 PN: Admin panel for website and S1 RC

We need to speed up our Python Cron jobs

Switch from Python to $a_performant_language progressively

Extending Python with C


Let's create a newmath module with a sum function

from newmath import sum

print(sum(5, 4))

#define Py_LIMITED_API
#include <Python.h>

static PyObject *sum(PyObject *self, PyObject *args) {
    long long a, b;  /* "LL" parses each argument into a long long */

    if (!PyArg_ParseTuple(args, "LL", &a, &b))
        return NULL;

    return PyLong_FromLongLong(a + b);
}

static PyMethodDef MathMethods[] = {
    {"sum", sum, METH_VARARGS, "Add two numbers."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef newmathmodule = {
   PyModuleDef_HEAD_INIT, "newmath", NULL, -1, MathMethods
};

PyMODINIT_FUNC PyInit_newmath(void) {
    return PyModule_Create(&newmathmodule);
}

📄 newmath.c

Compile time! 📦

This will generate newmath.so

gcc newmath.c -shared -fPIC -o newmath.so \
    `pkg-config --cflags --libs python3`

from newmath import sum

print(sum(5, 4))

Easy to import
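
Because the compiled module imports like any other Python module, calling code can prefer it when it has been built and fall back to pure Python otherwise. This is one way to migrate progressively. A minimal sketch, assuming the newmath extension may or may not be on the import path; native_sum, py_sum and add are hypothetical names used only for illustration:

```python
# Prefer the compiled extension when it is available; otherwise use pure Python.
try:
    from newmath import sum as native_sum  # built from newmath.c (may be absent)
except ImportError:
    native_sum = None


def py_sum(a, b):
    """Pure-Python fallback with the same call signature as newmath.sum."""
    return a + b


# Callers use `add` and never need to know which implementation is active.
add = native_sum if native_sum is not None else py_sum

print(add(5, 4))
```

With this pattern each function can be moved to the native module one at a time, while the rest of the code base keeps calling the same name.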

Why Go and not C?

Go is easier than C

Garbage Collector


Goroutines ❤️

Extending Python with Go

Please, welcome CGO

CGO is an amazing technology which allows Go programs to interoperate with C.

We use the magic C.* namespace to access anything from the C world

package main

// #cgo pkg-config: python3
// #include <Python.h>
// int PyArg_ParseTuple_LL(PyObject *, long long *, long long *);
import "C"

//export sum
func sum(self, args *C.PyObject) *C.PyObject {
    var a, b C.longlong
    if C.PyArg_ParseTuple_LL(args, &a, &b) == 0 {
        return nil
    }
    return C.PyLong_FromLongLong(a + b)
}

// main is required by -buildmode=c-archive but is never called.
func main() {}

📄 newmath.go

What's declared before import "C"?

package main

// #cgo pkg-config: python3
// #include <Python.h>
// int PyArg_ParseTuple_LL(PyObject *, long long *, long long *);

import "C"

📄 newmath.go

What is PyArg_ParseTuple_LL?

This is not declared in Python.h 🤔

// int PyArg_ParseTuple_LL(PyObject *, long long *, long long *);
import "C"

📄 newmath.go

CGO cannot call variadic C functions, so we need to wrap PyArg_ParseTuple in C code

#define Py_LIMITED_API
#include <Python.h>

int PyArg_ParseTuple_LL(
    PyObject * args, 
    long long * a, 
    long long * b
) {
    return PyArg_ParseTuple(args, "LL", a, b);
}

📄 newmath_utils.c

Let's compile this

go build -buildmode=c-archive -o libnewmath.a


libnewmath.h: the generated header, which we need to include in our .c file before compiling. It contains the declarations of our exported functions and the cgo type definitions.

libnewmath.a: our built archive. We need to link it during the final compilation using the -L . -lnewmath flags.

// ...

extern PyObject* sum(PyObject* p0, PyObject* p1);

📄 libnewmath.h

Buildmode c-archive

We need to include libnewmath.h somewhere

#define Py_LIMITED_API
#include <Python.h>
#include "libnewmath.h"

static PyMethodDef NewMathMethods[] = {
    {"sum", sum, METH_VARARGS, "Add two numbers."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef newmathmodule = {
   PyModuleDef_HEAD_INIT, "newmath", NULL, -1, NewMathMethods
};

PyMODINIT_FUNC PyInit_newmath(void) {
    return PyModule_Create(&newmathmodule);
}

📄 _newmath.c

A better and simpler approach

Move all the Py stuff into C and just call the Go function

package main

import "C"

//export sum
func sum(a int, b int) int {
    return (a + b)
}

// main is required by -buildmode=c-archive but is never called.
func main() {}

📄 newmath.go

#define Py_LIMITED_API
#include <Python.h>
#include "libnewmath.h"

PyObject *sum_wrapper(PyObject *obj, PyObject *args) {
    long long a, b;  /* "LL" parses each argument into a long long */

    if (!PyArg_ParseTuple(args, "LL", &a, &b))
        return NULL;

    return PyLong_FromLongLong(sum(a, b));
}

static PyMethodDef NewMathMethods[] = {
    {"sum", sum_wrapper, METH_VARARGS, "Add two numbers."},
    {NULL, NULL, 0, NULL}
};

// ...

📄 _newmath.c

Final step

gcc _newmath.c -shared -fPIC -o newmath.so \
    `pkg-config --cflags --libs python3` -L . -lnewmath

Stop talking!
Let's code!

Pay attention!

CGO is not Go

Runtime overhead

Calling Go from a different runtime spins up the Go runtime

Also vice versa


//export sayHello
func sayHello(message *C.char) *C.char {
    return C.CString(
        fmt.Sprintf("Hello %v", C.GoString(message)))
}

//export sayHello
func sayHello(message string) string {
    return fmt.Sprintf("Hello %v", message)
}

📄 hello.go

// ...
PyObject * _say_hello(PyObject *obj, PyObject *args) {
    PyObject *py_retval;
    char *path;

    if (!PyArg_ParseTuple(args, (char *) "s", &path)) {
        return NULL;
    }
    GoString gostr = {.p = path, .n = strlen(path)};
    GoString retval = sayHello(gostr);
    py_retval = Py_BuildValue((char *) "s", retval.p);
    return py_retval;
}

📄 _hello.c

Runtime error!
cgo result has Go pointer

Boost with export

(Don't try this at 🏡)

Variadic functions

#define Py_LIMITED_API
#include <Python.h>

// ...

int PyArg_ParseTuple_O(PyObject * args, PyObject ** o) {
    return PyArg_ParseTuple(args, "O", o);
}


#define Py_LIMITED_API
#include <Python.h>

// <Pylib>/3.6/include/python3.6/listobject.h

int is_a_list(PyObject * p) {
    return PyList_Check(p);
}

int is_a_long(PyObject * p) {
    return PyLong_Check(p);
}

📄 _macro.c

Running in parallel

Dealing with the GIL

import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1
    print('Done! My final value is {0}'.format(n))

t1 = Thread(target=countdown, args=(COUNT // 2,))
t2 = Thread(target=countdown, args=(COUNT // 2,))

start = time.time()
# Start both threads, then wait for them to finish before stopping the clock.
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('Time taken in seconds -', end - start)

It takes

~ 6.18 seconds
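
To check how much the GIL limits this workload, the same countdown can be timed once sequentially and once in two threads. The sketch below is only an illustration: COUNT is deliberately much smaller than the value on the slide so that it runs quickly, and timed, run_sequential and run_threaded are hypothetical helper names. On a standard CPython build with the GIL the two timings are usually close, but the exact numbers depend on the machine and the interpreter.

```python
import time
from threading import Thread

COUNT = 2_000_000  # far smaller than the slide's value, to keep the sketch fast


def countdown(n):
    """CPU-bound loop: it holds the GIL for almost its entire run."""
    while n > 0:
        n -= 1


def timed(fn):
    """Return the wall-clock seconds taken by fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_sequential():
    countdown(COUNT // 2)
    countdown(COUNT // 2)


def run_threaded():
    threads = [Thread(target=countdown, args=(COUNT // 2,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


sequential = timed(run_sequential)
threaded = timed(run_threaded)
print(f"sequential: {sequential:.3f}s, threaded: {threaded:.3f}s")
```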

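Same code in Go using goroutines


func Countdown() {
    var wg sync.WaitGroup
    for i := 0; i < 2; i++ {
        wg.Add(1) // register the goroutine before starting it
        go func(n uint) {
            defer wg.Done()
            for n > 0 {
                n -= 1
            }
            fmt.Println("Done! My final value is", n)
        }(50000000 / 2)
    }
    wg.Wait() // block until both goroutines have finished
}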

It takes

~ 0.02 seconds

(executed from Python 🐍)

Goroutines are also lighter than threads

Case study

Resize and optimize images

In our project we have a simple cron job, written in Python, that processes images

Many product images arrive every day from different sources

And we process them using Pillow for resizing and pngquant + jpegoptim for optimization

for img in a_folder:
    dest = convert(img)
    optimize(dest)

We had something written in Go...

Long time aGo...

We created Piuma! https://github.com/piumaio


Send Image_url with parameters w=100 and h=100

Resize and optimize the image to 100 x 100 or get from the cache

Serve the resized image


Profile our Python code

Pay attention to PIL.resize and convert
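
A minimal sketch of how such a profile can be collected with the standard-library cProfile module. The resize and convert functions here are hypothetical stand-ins that only simulate work; they are not Pillow calls and not the actual cron job code.

```python
import cProfile
import io
import pstats


def resize(pixels, factor):
    """Hypothetical stand-in for the expensive resize step."""
    return [p for i, p in enumerate(pixels) if i % factor == 0]


def convert(pixels):
    """Hypothetical stand-in for the format conversion step."""
    return bytes(p % 256 for p in pixels)


def process_images(count=20, size=10_000):
    """Simulate the cron job loop over a batch of images."""
    for _ in range(count):
        convert(resize(list(range(size)), 2))


profiler = cProfile.Profile()
profiler.enable()
process_images()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Reading the cumulative column top-down shows which calls dominate the run, which is how hot spots such as a resize or convert step would stand out.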

Rewrite it with Go using Piuma

Profile our Go code creating a different main

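package main

// #include <stdlib.h>
import "C"

import (
    "log"
    "os"
    "runtime/pprof"
    "unsafe"
)

func main() {
    f, err := os.Create("./piumago.profile")
    if err != nil {
        log.Fatal(err) // error handling is elided on the slide
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()
    cs := C.CString("../images")
    defer C.free(unsafe.Pointer(cs))
    OptimizeFromDirWrapper(cs, 100, 50)
}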

jpeg.Decode is really slow!

Alternative JPEG library, built on libjpeg-turbo


Profile it again

And now...

the final demo!

📋 slides.com/andreastagi/pygo
💻 github.com/astagi/pygoexamples

🌈 github.com/astagi/pypiuma

📚 Part 1 is on Medium:

