Intro to

x86-64 ASM & SIMD

Assembly Basics

  • Registers
  • Move Semantics & Addressing
  • Compare Semantics
  • Loop/Jump Semantics

Integer and SIMD Registers

Move Semantics

- Instructions -

  • MOV = Move
  • MOVD = Move Doubleword
  • MOVQ = Move Quadword

Move Semantics

- MOV Variants -

  • MOV r64, r/m64
  • MOV r/m64, r64
  • MOV r64, imm*

Move Semantics
- Very Basic Addressing -

import std.stdio;

struct Test
{
    ulong l1 = 64;
    ulong l2 = 0;
}

void main()
{
    Test t;
    ulong* l1Ptr = &t.l1;

    writeln("l2 before:", t.l2);
    asm pure nothrow @nogc
    {
        mov RAX, l1Ptr;
        mov RBX, [RAX];
        mov [RAX+8], RBX;
    }
    writeln("l2 after:", t.l2);
}

Compare Semantics

- Instructions -

CMP = Compare

  • CMP r/m64, imm32
  • CMP r/m64, r64

Jump Semantics

- Instructions -

  • JMP = Direct jump to address
  • Jcc = Conditional jump
    • JE, JNE = Jump if =,!=
    • JA, JAE or JG, JGE = Jump if >,>=
    • JB, JBE or JL, JLE = Jump if <,<=

Loop Example

import std.stdio;

void main()
{
    ulong t = 0;
    ulong* tPtr = &t;

    writeln("t before:", t);
    asm pure nothrow @nogc
    {
        mov RBX, tPtr;
        mov RCX, [RBX];
        mov RDX, 0;
    loop:
        add RCX, 10;
        inc RDX;
        cmp RDX, 2;
        jle loop;

        mov [RBX], RCX;
    }
    writeln("t after:", t);
}

Single Instruction, Multiple Data (SIMD)

  • Instruction Sets
  • Mnemonics
  • Move Semantics
  • Action Instructions

SIMD Instruction Sets

  • MMX
  • SSE
  • SSE2
  • SSE3
  • SSSE3
  • SSE4
  • AVX
  • AVX2
  • AVX-512

Mnemonics (FP)

  • Single = S
  • Double = D
  • Scalar = S
  • Packed = P

Move Semantics

- Instructions -

  • MOVA*
  • MOVU*
  • MOVL*
  • MOVH*
  • MOVDDUP (SSE3)
  • SHUFP*
  • MOVLHPS

Action Semantics

- Instructions -

  • ADD*
  • SUB*
  • MUL*
  • DIV*
  • SQRT*
  • MAX*
  • MIN*

Examples

Confidence Adjustment

import std.stdio;
void main()
{
    double[2] values = [15.4, 36.7];
    const double[2] newValues = [40.0, 20.4];
    const double[2] confidences = [1.0, 0.5];

    double* valuesPtr = values.ptr;
    const(double)* newValuesPtr = newValues.ptr;
    const(double)* confidencesPtr = confidences.ptr;

    asm pure nothrow @nogc
    {
        mov RSI, valuesPtr;
        mov RDI, newValuesPtr;
        mov RAX, confidencesPtr;

        //Load confidences = c
        movupd XMM1, [RAX];
        //Load values = v0
        movupd XMM2, [RSI];
        //Load new values = v1
        movupd XMM0, [RDI];

        //v1 = (v0 * c) + v1 - (v1 * c) = (v0 * c) + (v1 * (1 - c))
        //v0 * c
        mulpd XMM2, XMM1;
        //(v0 * c) + v1
        addpd XMM2, XMM0;
        //v1 * c
        mulpd XMM0, XMM1;
        //(v0 * c) + v1 - (v1 * c)
        subpd XMM2, XMM0;

        movupd [RSI], XMM2;
    }
    writeln(values);
}

0-branching Div-by-zero Clear

Questions?

Intro to Basic x86-64 Assembly and SIMD

By Jonathan Crapuchettes

Intro to Basic x86-64 Assembly and SIMD

  • 1,929