Jaime Arias, Pierre-Antoine Bouttier, Marcelo Forets
General principle
• As in a marathon, an intensive effort...
• ...to try to solve a problem...
• Involving several people (and a competition).
In real life
• A team (and no competition)...
• ...to try to solve a problem (generally in computer science)...
• ... with a limited amount of time.
Originally, the exercise was proposed by GENCI.
Details
• Around each national or regional HPC center
• A team of students/researchers/engineers
• 2 days
• To transform/optimize a real numerical code to run it efficiently on HPC clusters
Jaime Arias (Research engineer, Mistis, Inria)
Marcelo Forets (Post-doc researcher, Tempo, Verimag)
Pierre-Antoine Bouttier (Research engineer, GriCAD)
• A proposal (among others) from L. Simula (Professor at ENS Lyon)
• A code that helps study the mechanisms that determine an optimal income tax when two countries play a Nash game.
Etienne Lehmann, Laurent Simula, Alain Trannoy; Tax me if you can! Optimal Nonlinear Income Tax Between Competing Governments, The Quarterly Journal of Economics, Volume 129, Issue 4
The code was written in the Mathematica language
...We were (and, in fact, still are) not specialists in these scientific research fields!
• To rewrite the code in a more "HPC friendly" language...
• ...with the further aim of making it easier to develop and run.
• To run it on a GriCAD HPC cluster and to see what an incredible job we did of greatly improving its performance (principally run time).
Obstacles: time, a scientific context unknown to us, moving from symbolic to numerical computing.
• Translating the original code into Python
• Testing and refactoring the new code
• Optimizing the performance (CPU time)
• Crossing our fingers (throughout the whole process, in fact)
We have produced a Python code which:
• Gives numerical results close to those of the original code
• Runs on the Froggy machine (a GriCAD HPC cluster)
• Exploits multiple cores (on a single node)
• Runs 20 times faster than the original code
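One way to check that a port "gives close results" is to compare a few output quantities against reference values within a tolerance. The sketch below is only an illustration: the quantity names, the numbers, and the helper `compare_results` are all hypothetical and are not taken from the hackathon repository.

```python
import math

# Hypothetical reference values, as if exported from the original Mathematica run.
reference = {"tax_low": 0.2514, "tax_high": 0.4032}

# Hypothetical values, as if produced by the Python port.
candidate = {"tax_low": 0.25141, "tax_high": 0.40318}


def compare_results(reference, candidate, rel_tol=1e-3):
    """Return the names of the quantities that differ by more than rel_tol."""
    return [
        name
        for name, ref_value in reference.items()
        if not math.isclose(candidate[name], ref_value, rel_tol=rel_tol)
    ]


# An empty list means every quantity agrees within the chosen tolerance.
mismatches = compare_results(reference, candidate)
print("Mismatching quantities:", mismatches)
```

The tolerance is a judgment call: moving from symbolic to numerical computing usually changes the last digits, so exact equality is rarely the right criterion.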
git clone https://gricad-gitlab.univ-grenoble-alpes.fr/bouttiep/hackathon2017.git
Now, we focus on some specific aspects of this exercise:
• What are the collaborative tools that we have used?
• What was our strategy about testing and optimizing our code?
• What have we learnt?
Some desirable features:
Real-time collaborative LaTeX editing
Jupyter notebooks with chat embedded
(Collaborative Calculation in the Cloud)
Linux terminal
(how we actually launched our app)
Joblib: running Python functions as Python jobs on several cores
Aim: to provide tools to easily achieve better performance and productivity when working with long running jobs
Easy to use! pip installable; docs & examples easy to google
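As a rough illustration of the Joblib pattern, here is a minimal sketch that runs the same function on several inputs, first serially and then on two worker processes. The function `solve_case` and the list `tax_rates` are hypothetical stand-ins, not the hackathon code; only `Parallel` and `delayed` come from Joblib. If Joblib is not installed, the sketch falls back to the serial results.

```python
import math

try:
    from joblib import Parallel, delayed
except ImportError:  # joblib is a third-party package (pip install joblib)
    Parallel = None


def solve_case(tax_rate):
    """Hypothetical stand-in for one independent, CPU-bound computation."""
    return sum(math.exp(-tax_rate * k) for k in range(1, 20001))


# Hypothetical inputs: each one can be processed independently of the others.
tax_rates = [0.1, 0.2, 0.3, 0.4]

# Serial version: one call after another on a single core.
serial_results = [solve_case(rate) for rate in tax_rates]

if Parallel is not None:
    # Parallel version: the same calls, spread over 2 worker processes.
    parallel_results = Parallel(n_jobs=2)(
        delayed(solve_case)(rate) for rate in tax_rates
    )
else:
    # Fallback when joblib is unavailable: reuse the serial results.
    parallel_results = list(serial_results)

print(parallel_results)
```

The pattern pays off only when each call is expensive enough to outweigh the cost of starting workers and transferring data, and when the calls do not depend on each other.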
Results on an 8-core node (Froggy)
Parallel code
Serial code
Mathematica code:
Python code:
| # cores | Mathematica | Python |
|---|---|---|
| 1 | ~107 s | ~14 s |
| 4 | - | ~6 s |
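The speedups implied by these timings can be worked out directly. The short sketch below uses only the approximate values from the table, so the results are themselves approximate; the variable names are ours.

```python
# Approximate run times in seconds, taken from the table above.
mathematica_1_core = 107.0
python_1_core = 14.0
python_4_cores = 6.0
n_cores = 4

# Gain from the port alone (serial Python vs. serial Mathematica).
port_speedup = mathematica_1_core / python_1_core

# Gain from parallelism alone (4-core Python vs. 1-core Python).
parallel_speedup = python_1_core / python_4_cores

# Combined gain (4-core Python vs. serial Mathematica).
total_speedup = mathematica_1_core / python_4_cores

# Fraction of the ideal 4x speedup actually obtained.
parallel_efficiency = parallel_speedup / n_cores

print(f"Port speedup:        {port_speedup:.1f}x")
print(f"Parallel speedup:    {parallel_speedup:.1f}x")
print(f"Total speedup:       {total_speedup:.1f}x")
print(f"Parallel efficiency: {parallel_efficiency:.0%}")
```

With these rounded timings, the combined gain comes out at about 18x, the same order of magnitude as the factor of 20 quoted earlier, and most of it comes from the port itself rather than from the extra cores.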
Short answer: a lot of things.
• Collaborative tools and real-time uses (CoCalc, git, Python notebooks, Markdown/LaTeX editing)
• Symbolic to numerical computing
• Hardware and software architecture of an HPC cluster
Short answer: how people work
• We did not know each other
• We do not have the same scientific or technical interests
• We felt that our work was useful
• We had a good time
It may be a good idea to repeat and promote this kind of exercise
• Locally (in the Grenoble-Alpes university environment)
• Regularly (once a year?)
• Open to all scientific and technical communities