OpenHPC hands-on workshop

Andrés Díaz-Gil,

Head of the HPC and IT department at:

Iñigo Aldázabal,

Head of the HPC and IT department at:

Overview

  • Motivation
  • OpenHPC overview
  • OpenHPC Demo

Motivation

Once uppon a time... 

Spanish public research institutions used to invest in HPC hardware

But not so much in personnel to take care

Now situation is even worse:

No people and fewer machine updates


Many overwhelmed sysadmins

If you are in a situation and do not want to end like this:

Pay attention

HPC Resources

Not having a "big" cluster, but having "big" problems

The main resource of the IFT is the cluster HYDRA:

  • Made of 100 compute nodes
  • Infiniband network
  • LUSTRE filesystem

We met all conditions in motivation:

  • Non full-time dedicated HPC sysadmins
  • Propietary "cluster suite"
  • Heterogeneous cluster
  • New hardware needed

Do not touch if working policy

OpenHPC Overview

What is NOT

  • Is not a propietary stack of software

What is?

  • OpenHPC is a Linux Foundation Collaborative Project
  • Basically a repository that:
    • Provides a reference collection of open-source HPC software components
    • Awsome documentation and installation guide

OpenHPC - S/W components

*Source: Karl W. Schulz, SC16 Community BOF

Currently at version 1.3

BOS Supported: CentOS 7.4 and SLES12SP?

OHPC Main Advantages

  • All open-source, modern and community well-known software
  • Just by enabling a repository. No compilation. Totally reversible (ohpc suffix)
  • Components can be added/replaced/skipped by the ones of your choice
  • Awesome step-by-step Installation and configuration guide (Recipes) and automation scripts:

One Recipe for each OS flavor and Resource Manager (SLRUM, PBSPro) and CPU. Each one having an automation script.

input.local + recipe.sh == installed system

OpenHPC Demo

Aka: "How to have an HPC cluster working in 15min from scratch"

OpenHPC Typical Architecture

SLURM Manager

Warewulf: Bootstrap+VNFS

DHCP, TFTP

NFS (/home, /opt/ohpc/pub)

 

Diskless clients

Stateless Cluster

Demo Part 1

Base Os -> Complete Master

Demo Part 2

Build&Deploy Compute Image

Demo Part 3

Resource Manager: Startup&Run

OHPC Pros

  • All modern open-source software
  • Provided with templates and scripts
  • Awesome Installation Guides
  • Fast and easy to install
  • Stateless Cluster (Optional): Less energy consumption, Less hardware failures

Conclusions

OHPC Cons

  • Actually none

Conclusions

Some IMOs

  • Integration with a preexisting LUSTRE may be a bit more involved
  • I would like NIS (or the like) to be included in the guides/scripts

Copy of deck

By adgdt

Copy of deck

  • 332