OpenHPC hands-on workshop
Andrés Díaz-Gil,
Head of the HPC and IT department at:
Iñigo Aldázabal,
Head of the HPC and IT department at:
Overview
- Motivation
- OpenHPC overview
- OpenHPC Demo
Motivation
Once upon a time...
Spanish public research institutions used to invest in HPC hardware
But not so much in personnel to take care of it
Now the situation is even worse:
No people and fewer machine updates
Many overwhelmed sysadmins
If you are in this situation and do not want to end up like this:
Pay attention
HPC Resources
Not having a "big" cluster, but having "big" problems
The main resource of the IFT is the cluster HYDRA:
- Made of 100 compute nodes
- Infiniband network
- LUSTRE filesystem
We met all the conditions described in the motivation:
- No full-time dedicated HPC sysadmins
- Proprietary "cluster suite"
- Heterogeneous cluster
- New hardware needed
"Do not touch it if it works" policy
OpenHPC Overview
What it is NOT
- It is not a proprietary software stack
What is it?
- OpenHPC is a Linux Foundation Collaborative Project
- Basically a repository that:
- Provides a reference collection of open-source HPC software components
- Comes with awesome documentation and an installation guide
OpenHPC - S/W components
*Source: Karl W. Schulz, SC16 Community BOF
Currently at version 1.3
Supported base OS (BOS): CentOS 7.4 and SLES12SP?
OHPC Main Advantages
- All open-source, modern software that is well known in the community
- Installed just by enabling a repository. No compilation. Totally reversible (packages carry the ohpc suffix)
- Components can be added, skipped, or replaced by ones of your choice
- Awesome step-by-step Installation and configuration guide (Recipes) and automation scripts:
One recipe for each combination of OS flavor, resource manager (SLURM, PBSPro), and CPU architecture, each with its own automation script.
input.local + recipe.sh == installed system
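To make the "input.local + recipe.sh" idea concrete, here is a minimal Python sketch of how a settings file of that kind could drive an automated install. It assumes a simplified file of plain KEY=VALUE lines; the variable names (`sms_name`, `sms_ip`, `num_computes`, `compute_prefix`) and the example values are illustrative assumptions, not copied from a specific recipe, and a real input.local may use shell features this parser does not handle.

```python
import shlex


def parse_settings(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment, ignore it
        # shlex.split removes shell-style quoting around the value
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def compute_hostnames(settings):
    """Derive compute node names from the (assumed) prefix and node count."""
    count = int(settings["num_computes"])
    prefix = settings["compute_prefix"]
    return [f"{prefix}{i}" for i in range(1, count + 1)]


# Hypothetical site settings, standing in for an input.local-style file
EXAMPLE = '''
# master (SMS) node
sms_name="master"
sms_ip=192.168.1.1
# compute nodes
num_computes=4
compute_prefix=c
'''

settings = parse_settings(EXAMPLE)
print(settings["sms_name"], settings["sms_ip"])
print(compute_hostnames(settings))
```

The point of the split is that all site-specific values live in one small file, while the script that consumes them stays generic.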
OpenHPC Demo
Aka: "How to get an HPC cluster working from scratch in 15 minutes"
OpenHPC Typical Architecture
SLURM Manager
Warewulf: Bootstrap+VNFS
DHCP, TFTP
NFS (/home, /opt/ohpc/pub)
Diskless clients
Stateless Cluster
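As an illustration of the NFS part of this architecture, the following Python sketch renders /etc/exports-style lines for the two shared directories named above. The export options and the choice of read-only for /opt/ohpc/pub are assumptions made for the example; the slides do not state which options the recipe actually uses.

```python
def export_line(path, read_only, clients="*"):
    """Render one /etc/exports-style entry: path, client spec, options."""
    mode = "ro" if read_only else "rw"
    return f"{path} {clients}({mode},no_subtree_check)"


# Directories the master shares with the diskless compute nodes.
# Read-only for the software tree is an assumption of this sketch.
shares = [
    ("/home", False),
    ("/opt/ohpc/pub", True),
]

lines = [export_line(path, read_only) for path, read_only in shares]
print("\n".join(lines))
```

In a stateless setup like this, the compute nodes keep no local copy of user data or shared software, so both trees have to be reachable over the network from the master.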
Demo Part 1
Base OS -> Complete Master
Demo Part 2
Build&Deploy Compute Image
Demo Part 3
Resource Manager: Startup&Run
OHPC Pros
- All modern open-source software
- Provided with templates and scripts
- Awesome Installation Guides
- Fast and easy to install
- Stateless cluster (optional): lower energy consumption, fewer hardware failures
Conclusions
OHPC Cons
- Actually none
Conclusions
Some IMOs
- Integration with a preexisting LUSTRE may be a bit more involved
- I would like NIS (or the like) to be included in the guides/scripts
By adgdt