Python in Space Science

Daniel Vagg

System Architect @ Parameter Space

 

MSc Physics (Space Science and Technology)

 

 

Lightning talk overview

 

  • Quick intro/Parameter Space
     
  • Gaia and the need for a platform
  • The GAVIP platform
  • Some interesting parts
     
  • What we're doing now
     
  • Presentations rarely end on time

Time per section: 2 min, 3 min, 2 min, 2 min (9 min TOTAL)

Quick Intro

  • I'm Dan
     
  • I design platforms for space science
  • I send things to ~space (stratosphere)

Parameter Space

  • Spin-out company from UCD School of Physics
     
  • Started with a contract to build a platform for Gaia
    • Gaia: ESA satellite observing our galaxy
    • Platform: Allow scientists to analyse data without downloading it
       
  • Working on similar systems now

Gaia

  • Scans 1 billion stars in our galaxy using 2 telescopes
    • Determine their position/motion very accurately
  • Observations from either side of the Sun help determine distance (parallax)
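As a rough illustration of the distance point above, a parallax converts to a distance by simple inversion. This is a minimal sketch: the function name is made up for this example, and naive inversion is only reasonable when the parallax is positive and its measurement error is small.

```python
def parallax_to_distance_pc(parallax_mas):
    """Convert a parallax in milliarcseconds to a distance in parsecs.

    Uses d [pc] = 1 / p [arcsec]. Hypothetical helper for illustration only.
    """
    if parallax_mas <= 0:
        raise ValueError("naive inversion needs a positive parallax")
    return 1000.0 / parallax_mas


# A star with a parallax of 100 mas sits 10 pc away; 1 mas means 1000 pc.
print(parallax_to_distance_pc(100.0))
print(parallax_to_distance_pc(1.0))
```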

A lot of data (> 1 PB)

Scientists want to analyse it

Move code to data

http://www.esa.int/Our_Activities/Space_Science/Gaia

Why build a platform (GAVIP)?

Platform objectives

  1. Enable user-contributed code to run close to the Gaia archive
    Not so bad..
     
  2. Support the reuse of these codes by others
    A different beast
     
  3. Support the sharing of results

The GAVIP platform (almost entirely Python):

AVI: Added Value Interface (~someone's code)

GAVIP: Gaia AVI Platform

Some interesting parts

  • Pipelines defined by code
    • Luigi (SciLuigi)
       
  • Code isolation + integration
    • Docker containers (Docker-py)
    • Built-in Django framework
       
  • Command line client (auto generated)
    • Django Rest Framework
      • + Django Rest Swagger
    • Click

 

Some interesting parts

[Luigi] Define pipelines in Python

https://luigi.readthedocs.io/en/stable/tasks.html

Some interesting parts

[Django/Docker/Docker-py] Isolate and integrate AVIs

  • Anaconda suite
  • AstroPy
  • Luigi
  • Celery
  • Django
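A rough sketch of how a platform might start one isolated AVI container with the Docker SDK for Python (docker-py). Everything specific here is an assumption for illustration: the container naming scheme, the mount point, the memory limit and the environment variable are not GAVIP's actual settings.

```python
def avi_run_options(avi_name, data_dir):
    """Build keyword arguments for docker-py's containers.run().

    All values are illustrative defaults, not the platform's real ones.
    """
    return {
        "name": f"avi-{avi_name}",
        "detach": True,            # return immediately, container keeps running
        "mem_limit": "2g",         # cap memory so one AVI cannot starve others
        "volumes": {data_dir: {"bind": "/data/input", "mode": "ro"}},
        "environment": {"AVI_NAME": avi_name},
    }


def start_avi(image, avi_name, data_dir):
    """Start the AVI image in its own container and return the handle."""
    # Imported here so the helper above can be used without docker-py
    # installed or a Docker daemon running.
    import docker

    client = docker.from_env()
    return client.containers.run(image, **avi_run_options(avi_name, data_dir))


if __name__ == "__main__":
    print(avi_run_options("demo", "/srv/archive"))
```

Mounting the data read-only keeps user-contributed code from modifying the shared archive, while the container boundary keeps each AVI's dependencies separate.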

Some interesting parts

[DRF+DRS/Click] Generate a CLI using the generated API docs

  • Django Rest Framework
    • Simplified making a REST API to the platform
  • Django Rest Swagger
    • Automatically generated Swagger docs
       
  • Click is a neat library for making CLI tools in Python

     

Using Click + requests, we pulled the Swagger JSON after login and generated almost all of the CLI functions from it.
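The idea can be sketched roughly as follows, assuming a Swagger 2.0 style document with a `paths` section. The function names, the sample spec, the token header format and the URLs are all hypothetical; the real client may differ considerably.

```python
import click
import requests

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

# Tiny stand-in for the JSON that a Swagger endpoint would return.
SAMPLE_SPEC = {
    "paths": {
        "/jobs/": {
            "get": {"operationId": "jobs_list", "summary": "List jobs"},
            "post": {"operationId": "jobs_create", "summary": "Create a job"},
        }
    }
}


def fetch_spec(url, token=None):
    """Download the Swagger JSON; the token header format is an assumption."""
    headers = {"Authorization": f"Token {token}"} if token else {}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()


def make_command(base_url, path, method, operation):
    """Turn one API operation into one Click command."""
    fallback = f"{method}-{path.strip('/')}"
    name = operation.get("operationId", fallback).replace("_", "-")

    def callback():
        response = requests.request(method.upper(), base_url + path, timeout=10)
        click.echo(response.text)

    return click.Command(name, callback=callback, help=operation.get("summary", ""))


def build_cli(spec, base_url):
    """Create a Click group with one sub-command per documented operation."""
    cli = click.Group(name="platform", help="CLI generated from Swagger docs")
    for path, methods in spec.get("paths", {}).items():
        for method, operation in methods.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            cli.add_command(make_command(base_url, path, method, operation))
    return cli


if __name__ == "__main__":
    build_cli(SAMPLE_SPEC, "https://example.invalid/api")()
```

Because the commands come from the API description, adding an endpoint on the server makes a new sub-command appear without touching the client code.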

Also: Asciinema is great!

What we're working on now

EO4Atlantic: In early prototyping stages

  • A data analysis platform for the latest Earth Observation data (Atlantic region)
  • 2 Petabytes of data per month!

Interesting challenges:

  • How to support compute requirements
  • Moving around data
  • Avoiding duplication in analysis/results
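To put the data-movement challenge in numbers, here is a back-of-the-envelope calculation of the sustained rate implied by 2 PB per month. It assumes decimal petabytes (10^15 bytes) and a 30-day month; both are assumptions made for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_bytes_per_s(bytes_per_month, days_per_month=30):
    """Average transfer rate needed to keep up with a monthly data volume."""
    return bytes_per_month / (days_per_month * SECONDS_PER_DAY)


rate = sustained_rate_bytes_per_s(2e15)  # 2 PB per month
print(f"{rate / 1e6:.0f} MB/s")          # roughly 772 MB/s
print(f"{rate * 8 / 1e9:.2f} Gbit/s")    # roughly 6.17 Gbit/s
```

That is a continuous, round-the-clock rate just to ingest the data once, which is why copying it around for each analysis is unattractive.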

EO4Atlantic

The end

Fancy a chat about anything -> I'm around
 

Daniel Vagg

dan@parameterspace.ie

daniel.vagg@parameterspace.ie
