Ways to Improve Program Performance

Outline

  • Compiler Optimization
  • Library Optimization
  • Instruction Set Optimization
  • Nodes Optimization
  • Thread Affinity
  • Tool: VTune

Compiler Optimization

Common Compiler Flags

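  • All of these levels optimize for speed; they differ in how aggressive they are
  • -O1
    • skips optimizations that increase code size for only a small speed benefit
  • -O2 (default for Intel compilers; GCC defaults to -O0)
  • -O3
    • enables more aggressive optimizations that may not improve performance for some programs
  • -Ofast
    • -O3 plus relaxed floating-point rules; numerical results may change, so it is not safe for all programs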
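Why -Ofast can be unsafe: floating-point addition is not associative, and relaxed floating-point rules allow the compiler to reorder sums. The sketch below is Python, which is not affected by these flags; it only shows the arithmetic effect of a reordering. The values a, b, c are chosen for illustration.

```python
# Same three numbers, two summation orders.
a, b, c = 1e16, -1e16, 1.0

left_to_right = (a + b) + c  # a and b cancel first, then 1.0 is added
reordered = a + (b + c)      # 1.0 is lost when added to -1e16, then a cancels it

print("left to right:", left_to_right)
print("reordered:    ", reordered)
print("same result?  ", left_to_right == reordered)
```

If a program depends on a specific summation order (compensated summation, for example), compare results against an -O2 build before trusting -Ofast.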

Library Optimization

e.g. MPI and numerical libraries

MPI

  • Message Passing Interface
  • Standardized interface for communication between processes on parallel computers
  • Implementations
    • Intel MPI
    • Open MPI
    • MPICH
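The message-passing idea can be sketched without MPI itself. The example below is a stand-in, not MPI: it uses Python threads and a queue so it runs anywhere, and the names worker and parallel_sum are hypothetical. Each "rank" works on its own slice of the data and sends one message with its partial result to the root, which is roughly what a reduce operation does.

```python
import queue
import threading


def worker(rank, numbers, root_inbox):
    """One 'rank': sum the local slice and send the partial sum to the root."""
    root_inbox.put((rank, sum(numbers)))


def parallel_sum(data, n_ranks):
    """Split data across n_ranks workers and combine their messages."""
    root_inbox = queue.Queue()
    # Round-robin split: rank r gets elements r, r + n_ranks, r + 2 * n_ranks, ...
    chunks = [data[r::n_ranks] for r in range(n_ranks)]
    threads = [
        threading.Thread(target=worker, args=(r, chunks[r], root_inbox))
        for r in range(n_ranks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The root receives exactly one message per rank.
    partial_sums = dict(root_inbox.get() for _ in range(n_ranks))
    return sum(partial_sums.values())


print(parallel_sum(list(range(100)), 4))
```

In a real MPI program the ranks are separate processes, possibly on different nodes, and the library moves the messages over the network.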

Numerical Libraries

  • FFTW
    • for computing discrete Fourier transforms
  • MKL (Intel Math Kernel Library)
    • a library of optimized math routines
    • linear algebra, fast Fourier transforms, vector math...
  • Example (GROMACS CMake option)
    • -DGMX_FFT_LIBRARY=xxx
    • selects the library used for FFT support, such as FFTW3 or MKL
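For reference, this is the computation those FFT libraries accelerate. The sketch below is a direct O(n²) discrete Fourier transform in pure Python, written for clarity only; naive_dft is a hypothetical name, and FFTW and MKL use much faster O(n log n) algorithms tuned for the hardware.

```python
import cmath


def naive_dft(x):
    """Direct DFT: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency bin.
for k, value in enumerate(naive_dft([1.0, 1.0, 1.0, 1.0])):
    print(k, round(abs(value), 6))
```

The nested loops are the reason the choice of FFT library matters for large transforms.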

Instruction Set Optimization

e.g. SIMD

SIMD

  • Single Instruction Multiple Data
  • The same instruction is applied to many data streams at once
    • e.g. add 64 pairs of numbers by sending 64 data streams to 64 ALUs, which form 64 sums, ideally within a single clock cycle
  • Instruction Sets
    • SSE (128-bit registers)
    • AVX, AVX2 (256-bit registers)
    • AVX-512 (512-bit registers)
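A rough way to compare these instruction sets is to count how many elements fit in one register. The sketch below is plain arithmetic, not real SIMD code: the function names are hypothetical, and it assumes 32-bit floats by default and ignores memory bandwidth, latency, and alignment.

```python
import math

# Register widths in bits for the instruction sets listed above.
REGISTER_BITS = {"SSE": 128, "AVX (256-bit)": 256, "AVX-512": 512}


def lanes(register_bits, element_bits):
    """How many elements one vector register holds."""
    return register_bits // element_bits


def vector_instructions(n_elements, register_bits, element_bits=32):
    """Vector additions needed to process n_elements, one register at a time."""
    return math.ceil(n_elements / lanes(register_bits, element_bits))


for name, bits in REGISTER_BITS.items():
    print(name, "lanes:", lanes(bits, 32), "adds for 64 floats:", vector_instructions(64, bits))
```

Wider registers lower the instruction count, but the real speedup also depends on whether the compiler can vectorize the loop at all.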

Nodes Optimization

The more, the better?

Scaling is limited

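  • Just like teamwork: each extra member (node) adds coordination (communication) overhead
  • Beyond some node count, adding more nodes no longer shortens the run time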
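One standard way to put numbers on this limit is Amdahl's law, which the slides do not name; it is used here as an illustrative model. If a fraction p of the run time can be parallelized, the speedup on n nodes is 1 / ((1 - p) + p / n). The value p = 0.95 below is an assumed example, not a measurement.

```python
def amdahl_speedup(parallel_fraction, n_nodes):
    """Ideal speedup when only parallel_fraction of the work scales with nodes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_nodes)


p = 0.95  # assumed: 95% of the run time is parallelizable
for n in (1, 2, 4, 16, 64, 256, 1024):
    print(f"{n:5d} nodes -> speedup {amdahl_speedup(p, n):6.2f}")

# With p = 0.95 the speedup can never reach 1 / (1 - p) = 20,
# no matter how many nodes are added.
```

Real programs usually scale worse than this model, because communication cost tends to grow with the node count.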

Thread Affinity

Thread Affinity

  • pin specific threads to a particular processor or core
  • prevents the OS scheduler from migrating them, so cached data stays close to the thread
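A minimal sketch of pinning, assuming a Linux system: Python exposes the scheduler calls as os.sched_getaffinity and os.sched_setaffinity, which are not available on macOS or Windows. The helper name pin_to_one_cpu is hypothetical, and it returns None where pinning is unavailable or refused.

```python
import os


def pin_to_one_cpu():
    """Pin the caller to a single CPU it is already allowed to use.

    Returns the chosen CPU id, or None where the call is unavailable or refused.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without these calls
    try:
        allowed = os.sched_getaffinity(0)  # 0 means "the caller"
        cpu = min(allowed)
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return None
    return cpu


# Remember the original CPU set so the pin can be undone afterwards.
original = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else None

cpu = pin_to_one_cpu()
if cpu is None:
    print("CPU affinity is not available here")
else:
    print("pinned to CPU", cpu, "->", os.sched_getaffinity(0))
    os.sched_setaffinity(0, original)  # undo the pin
```

HPC codes usually set affinity through the MPI launcher or the OpenMP runtime instead of calling the OS directly; this only shows the underlying mechanism.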

Tool: VTune

VTune

Examples

