Extensible compilers, reusable optimizations and LLVM

What is MLIR?

MLIR

  • extensible IR
  • based on a graph-like data structure
    • nodes = operations
    • edges = values
  • Each value is the result of exactly one operation, or is a block argument
  • Operations are contained in Blocks
  • Blocks are contained in Regions
  • Operations may contain Regions
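
The nesting rules above can be pictured with a few plain C++ structs. This is only a simplified sketch, not MLIR's actual classes: `Op`, `Block`, `Region`, `countOps`, and `buildExample` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the nesting; NOT the real MLIR classes.
struct Region;

struct Op {
  std::string name;                              // e.g. "addi"
  std::vector<std::unique_ptr<Region>> regions;  // operations may contain regions
};

struct Block {
  std::vector<std::string> arguments;            // block arguments are values too
  std::vector<std::unique_ptr<Op>> ops;          // operations are contained in blocks
};

struct Region {
  std::vector<std::unique_ptr<Block>> blocks;    // blocks are contained in regions
};

// Count every operation nested under `op`, including `op` itself.
inline int countOps(const Op &op) {
  int total = 1;
  for (const auto &region : op.regions)
    for (const auto &block : region->blocks)
      for (const auto &nested : block->ops)
        total += countOps(*nested);
  return total;
}

// Build a function-like op holding one region, one block, and two ops.
inline std::unique_ptr<Op> buildExample() {
  auto block = std::make_unique<Block>();
  block->arguments.push_back("%c");

  auto add = std::make_unique<Op>();
  add->name = "addi";
  block->ops.push_back(std::move(add));

  auto ret = std::make_unique<Op>();
  ret->name = "return";
  block->ops.push_back(std::move(ret));

  auto region = std::make_unique<Region>();
  region->blocks.push_back(std::move(block));

  auto func = std::make_unique<Op>();
  func->name = "func";
  func->regions.push_back(std::move(region));
  return func;
}
```

Walking the structure recursively, as `countOps` does, is the same shape of traversal a pass would use to visit nested operations.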

Operations

%a = "foo"() : () -> (i32)
%bs:2 = "bar"(%a) : (i32) -> (f32, i32)
%c1, %c2 = "foo_bar"(%bs#0) : (f32) -> (f32, i32)

Block

^bb0(%c: i64):
    %0 = addi %c, %c : i64
    return %0 : i64

Terminators

 "cond_br"(%cond)[^bb1, ^bb2(%v : index)] : (i1) -> ()

Functions

func @identity(%x: i64) -> i64 {
  return %x : i64
}

func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1):
  cond_br %cond, ^bb1, ^bb2
^bb1:
  br ^bb3(%a: i64)
^bb2:
  %b = addi %a, %a : i64
  br ^bb3(%b: i64)
^bb3(%c: i64):
  br ^bb4(%c, %a : i64, i64)
^bb4(%d : i64, %e : i64):
  %0 = addi %d, %e : i64
  return %0 : i64
}

Type System

  • Standard types modeled on LLVM's (e.g. `i64`, `f32`)
  • memref type
    • `memref<16x32xf32>`, `memref<?xi32>`

Attributes

#col_major = affine_map<(d0, d1) -> (d1, d0)>
memref<16x32xf32, #col_major>
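
To make the effect of an index-reversing layout map like `#col_major` concrete, here is a small sketch of the arithmetic. The helper names (`applyColMajor`, `linearize`, `colMajorOffset`) are hypothetical, and it assumes the mapped indices are then linearized row-major in the permuted shape; it is an illustration, not MLIR's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Apply a dimension-reversing map, e.g. (d0, d1) -> (d1, d0).
inline std::vector<std::int64_t> applyColMajor(std::vector<std::int64_t> idx) {
  std::reverse(idx.begin(), idx.end());
  return idx;
}

// Row-major linearization of `idx` within `shape`.
inline std::int64_t linearize(const std::vector<std::int64_t> &idx,
                              const std::vector<std::int64_t> &shape) {
  std::int64_t offset = 0;
  for (std::size_t d = 0; d < idx.size(); ++d)
    offset = offset * shape[d] + idx[d];
  return offset;
}

// Assumed element offset of logical index `idx` in a memref of logical
// `shape` whose layout map reverses the dimensions.
inline std::int64_t colMajorOffset(const std::vector<std::int64_t> &idx,
                                   const std::vector<std::int64_t> &shape) {
  return linearize(applyColMajor(idx), applyColMajor(shape));
}
```

For a 16x32 memref, element (3, 5) sits at offset 3*32 + 5 = 101 with the default row-major layout, and at 5*16 + 3 = 83 once the dimensions are reversed.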

Dialects

  • Can define new operations and types
  • MLIR ships with multiple dialects as part of the library
  • Standard, LLVM, Affine, Parallel, GPU, etc.

Interfaces

  • Dialects have many different operations with different semantics
  • Downside: transformations and analyses would either have to handle the semantics of every operation or treat operations conservatively
  • Interfaces provide a generic way of interacting with the IR
  • Analyses and transformations operate on interfaces instead of specific operations
  • Operations declare their own interfaces
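
The idea can be sketched in plain C++ with an abstract interface class. Everything here is hypothetical (`Op`, `NoSideEffectInterface`, `AddOp`, `StoreOp`, `OpaqueOp`, `isTriviallyRemovable`); MLIR's real interface mechanism is different, but the division of labour is the same: the operation declares the interface, and the analysis only talks to the interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical mini-IR; not the MLIR API.
struct Op {
  virtual ~Op() = default;
  virtual std::string name() const = 0;
};

// An interface that an operation may opt into.
struct NoSideEffectInterface {
  virtual ~NoSideEffectInterface() = default;
  virtual bool hasNoSideEffect() const = 0;
};

// Operations from different dialects declare the interface themselves.
struct AddOp : Op, NoSideEffectInterface {
  std::string name() const override { return "addi"; }
  bool hasNoSideEffect() const override { return true; }
};

struct StoreOp : Op, NoSideEffectInterface {
  std::string name() const override { return "affine.store"; }
  bool hasNoSideEffect() const override { return false; }
};

// An operation that does not implement the interface.
struct OpaqueOp : Op {
  std::string name() const override { return "foo_bar"; }
};

// Generic analysis: it never names a concrete operation. Operations that do
// not implement the interface are handled conservatively.
inline bool isTriviallyRemovable(const Op &op) {
  if (const auto *iface = dynamic_cast<const NoSideEffectInterface *>(&op))
    return iface->hasNoSideEffect();
  return false;
}
```

Adding a new dialect then means implementing the interface on its operations, with no change to `isTriviallyRemovable`.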

Dialect Interfaces

  • Interface over a collection of operations
  • Tied to a dialect
  • ex: InlinerInterface

Inliner Interface

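// Collects the DialectInlinerInterface implementations registered by dialects.
class InlinerInterface : public
    DialectInterfaceCollection<DialectInlinerInterface> {
public:
  // Forward the query to the dialect that owns the operation containing
  // `dest`; if that dialect registered no handler, refuse to inline.
  virtual bool isLegalToInline(Region *dest, Region *src,
                               BlockAndValueMapping &valueMapping) const {
    auto *handler = getInterfaceFor(dest->getContainingOp());
    return handler ? handler->isLegalToInline(dest, src, valueMapping) : false;
  }
};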

Operation/Attribute/Type Interface

  • Registered at the level of a specific attribute, operation, or type
  • Can be defined with declarative syntax:

def ExampleOpInterface : OpInterface<"ExampleOpInterface"> {
  let description = [{
    This is an example interface definition.
  }];

  let methods = [
    InterfaceMethod<
      "Get the number of inputs for the current operation.",
      "unsigned", "getNumInputs"
    >,
  ];
}

Affine Dialect

func @calc(%arg0: memref<?xf32>, %arg1: memref<?xf32>, 
           %arg2: memref<?xf32>, %len: index) {
  %c1 = constant 1 : index
  %1 = alloc(%len) : memref<?xf32>
  affine.for %arg4 = 1 to 10 {
    %7 = affine.load %arg0[%arg4] : memref<?xf32>
    %8 = affine.load %arg1[%arg4] : memref<?xf32>
    %9 = addf %7, %8 : f32
    affine.store %9, %1[%arg4] : memref<?xf32>
  }
  affine.for %arg4 = 1 to 10 {
    %7 = affine.load %1[%arg4] : memref<?xf32>
    %8 = affine.load %arg1[%arg4] : memref<?xf32>
    %9 = mulf %7, %8 : f32
    affine.store %9, %arg2[%arg4] : memref<?xf32>
  }
  return
}

Affine Dialect

The Good!

  • Dialect for polyhedral optimizations
  • Provides array, loop and condition operations.
  • Comes with optimizations for loop fusion, tiling, unrolling, vectorization, etc.
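
As a hand-written illustration of what loop fusion buys, the sketch below mirrors a two-loop computation (add, then multiply through a temporary array) in plain C++ and shows a fused equivalent. It is not the output of MLIR's fusion pass, and `calcUnfused` and `calcFused` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: two loops that communicate through a temporary array.
inline std::vector<float> calcUnfused(const std::vector<float> &a,
                                      const std::vector<float> &b) {
  std::vector<float> tmp(a.size()), out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    tmp[i] = a[i] + b[i];
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = tmp[i] * b[i];
  return out;
}

// Fused: one loop, and the temporary array disappears entirely.
inline std::vector<float> calcFused(const std::vector<float> &a,
                                    const std::vector<float> &b) {
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = (a[i] + b[i]) * b[i];
  return out;
}
```

Both versions compute the same values; the fused one makes a single pass over the data and needs no intermediate buffer.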

Affine Constraints

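  • Array indices must be affine expressions of loop induction variables and symbols
  • No partial indexing for arrays: every access supplies an index for each dimension
  • Every affine loop must run to completion (no early exits)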
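
One way to read the index constraint is that a subscript must be an affine expression: sums of loop variables and constants, with multiplication allowed only when one factor is constant. The checker below is a minimal sketch under that assumption; `Expr`, `leaf`, `node`, `isConstant`, and `isAffine` are hypothetical names, and symbols, `floordiv`, and `mod` are left out.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical index-expression tree; not MLIR's AffineExpr.
struct Expr {
  enum Kind { Const, LoopVar, Add, Mul };
  Kind kind;
  std::shared_ptr<Expr> lhs, rhs;  // used by Add and Mul only
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr leaf(Expr::Kind k) {
  return std::make_shared<Expr>(Expr{k, nullptr, nullptr});
}
inline ExprPtr node(Expr::Kind k, ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(Expr{k, std::move(l), std::move(r)});
}

// True if the expression contains no loop variable.
inline bool isConstant(const Expr &e) {
  switch (e.kind) {
  case Expr::Const:
    return true;
  case Expr::LoopVar:
    return false;
  default:
    return isConstant(*e.lhs) && isConstant(*e.rhs);
  }
}

// True if the expression is affine in the loop variables.
inline bool isAffine(const Expr &e) {
  switch (e.kind) {
  case Expr::Const:
  case Expr::LoopVar:
    return true;
  case Expr::Add:
    return isAffine(*e.lhs) && isAffine(*e.rhs);
  case Expr::Mul:
    // A product stays affine only if at least one factor is constant.
    return isAffine(*e.lhs) && isAffine(*e.rhs) &&
           (isConstant(*e.lhs) || isConstant(*e.rhs));
  }
  return false;
}
```

Under this check `2*i + 1` is accepted while `i*i` is rejected, so an access like `a(i*i)` could not be promoted to an affine load.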

Affine Dialect

The Bad!

  • Depends heavily on the memref type.

    • Array values used in affine operations must be of memref type.

What is FORTRAN?

A VERY OLD LANGUAGE FOR WORKING WITH ARRAYS

subroutine f1dc(a1,a2,ret)
  integer a1(0:4), a2(0:4), ret(0:4)
  integer t1(0:4)

  do i = 0,4
     t1(i) = a1(i) + a1(i)
  end do
  do i = 0,4
     ret(i) = t1(i) * a2(i)
  end do
end subroutine f1dc

F18

  • New Fortran frontend in the LLVM project
  • Uses MLIR for code generation
  • Defines a new FIR dialect with Fortran-specific operations

func @_QPf1dc(%arg0: !fir.ref<!fir.array<5xi32>>, ...) {
  %c0 = constant 0 : index
  %0 = fir.alloca i32 {name = "i"}
  %1 = fir.alloca !fir.array<5xi32> {name = "t1"}
  %4 = fir.do_loop %arg3 = %2 to %3 step %c1 -> (index) {
    fir.store %10 to %0 : !fir.ref<i32>
    ...
    %15 = fir.coordinate_of %1, %14 :
       (!fir.ref<!fir.array<5xi32>>, i64) -> !fir.ref<i32>
    ...
    %27 = fir.load %26 : !fir.ref<i32>
    %28 = addi %21, %27 : i32
    fir.store %28 to %15 : !fir.ref<i32>
    %29 = addi %arg3, %c1 : index
    fir.result %29 : index
  }
  %8 = fir.do_loop %arg3 = %6 to %7 -> (index) {
    ...
  }
  ...
  return
}

Affine Promotion->Demotion

  • Analyze which loops adhere to the affine constraints
  • Convert index calculations and memory operations to their affine counterparts
  • Convert conditionals nested in loops

func @_QPf1dc(%arg0: !fir.ref<!fir.array<5xi32>>, ...) {
  %0 = fir.alloca i32 {name = "i"}
  %1 = fir.alloca !fir.array<5xi32> {name = "t1"}
  %2 = fir.convert %arg0 : 
    (!fir.ref<!fir.array<5xi32>>) -> memref<?xi32>
  %3 = fir.convert %1 : 
    (!fir.ref<!fir.array<5xi32>>) -> memref<?xi32>
  affine.for %arg3 = 0 to 5 {
    %6 = affine.load %2[%arg3] : memref<?xi32>
    %7 = affine.load %2[%arg3] : memref<?xi32>
    %8 = addi %6, %7 : i32
    affine.store %8, %3[%arg3] : memref<?xi32>
  }
  %4 = fir.convert %arg1 : 
    (!fir.ref<!fir.array<5xi32>>) -> memref<?xi32>
  %5 = fir.convert %arg2 : 
    (!fir.ref<!fir.array<5xi32>>) -> memref<?xi32>
  affine.for %arg3 = 0 to 5 {
    %6 = affine.load %3[%arg3] : memref<?xi32>
    %7 = affine.load %4[%arg3] : memref<?xi32>
    %8 = muli %6, %7 : i32
    affine.store %8, %5[%arg3] : memref<?xi32>
  }
  return
}

Affine Dialect

The Ugly!

  • memref lowers to a structure holding the memory region plus other fields
  • Affine promotion adds dummy convert operations that turn FIR memory types into memref
  • Affine demotion removes these convert operations and restores the original types
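
As a rough picture of that structure, the sketch below follows the descriptor layout MLIR's LLVM lowering uses for ranked memrefs: allocated pointer, aligned pointer, offset, sizes, strides. The names `MemRefDescriptor`, `wrap1D`, and `at` are hypothetical, and 64-bit indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of a ranked memref descriptor (assumes 64-bit index values).
template <typename T, std::size_t Rank>
struct MemRefDescriptor {
  T *allocated;                // pointer returned by the allocator
  T *aligned;                  // aligned pointer used for element accesses
  std::int64_t offset;         // element offset from `aligned`
  std::int64_t sizes[Rank];    // extent of each dimension
  std::int64_t strides[Rank];  // element stride of each dimension
};

// Wrap a raw contiguous 1-D buffer, which is all a plain array reference
// carries, in a descriptor.
template <typename T>
MemRefDescriptor<T, 1> wrap1D(T *data, std::int64_t n) {
  return {data, data, 0, {n}, {1}};
}

// Element access through the descriptor.
template <typename T>
T &at(const MemRefDescriptor<T, 1> &m, std::int64_t i) {
  return m.aligned[m.offset + i * m.strides[0]];
}
```

A bare reference to an array is a single pointer, while the descriptor carries five fields even for rank 1, which is one reason to drop the memref types again before lowering.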

Conclusion

  • MLIR: a framework to help in compiler construction
  • Allows combining multiple dialects together
  • Provides reusable optimizations, transformations, and analyses.

By Rajan Walia