Intro to x86-64 ASM & SIMD
Assembly Basics
- Registers
- Move Semantics & Addressing
- Compare Semantics
- Loop/Jump Semantics
Integer and SIMD Registers
Move Semantics
- Instructions -
- MOV = Move
- MOVD = Move Doubleword
- MOVQ = Move Quadword
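As a small illustration of MOVD and MOVQ, the sketch below copies the raw bits of a float (32 bits) and a double (64 bits) through an XMM register into an integer. The helper names floatBits and doubleBits are hypothetical, and the code assumes DMD-style inline assembly on x86-64, with the same pointer-in-register addressing used elsewhere in this deck.

```d
import std.stdio;

// Hypothetical helper (not from the slides): raw bit pattern of a float.
uint floatBits(float f)
{
    float src = f;
    uint dst = 0;
    float* srcPtr = &src;
    uint* dstPtr = &dst;
    asm pure nothrow @nogc
    {
        mov RAX, srcPtr;
        mov RCX, dstPtr;
        movd XMM0, [RAX];   // load one doubleword (32 bits) into XMM0
        movd [RCX], XMM0;   // store that doubleword unchanged
    }
    return dst;
}

// Hypothetical helper (not from the slides): raw bit pattern of a double.
ulong doubleBits(double d)
{
    double src = d;
    ulong dst = 0;
    double* srcPtr = &src;
    ulong* dstPtr = &dst;
    asm pure nothrow @nogc
    {
        mov RAX, srcPtr;
        mov RCX, dstPtr;
        movq XMM0, [RAX];   // load one quadword (64 bits) into XMM0
        movq [RCX], XMM0;   // store that quadword unchanged
    }
    return dst;
}

void main()
{
    writefln("%08X", floatBits(1.0f));
    writefln("%016X", doubleBits(1.0));
}
```

Neither instruction converts the value. The bits arrive in the integer exactly as they were stored in the floating-point variable.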
Move Semantics
- MOV Variants -
- MOV r64, r/m64
- MOV r/m64, r64
- MOV r64, imm*
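The three MOV forms above can be seen side by side in a short sketch. The helper movForms and its scratch array are hypothetical, and the code assumes DMD-style inline assembly on x86-64.

```d
import std.stdio;

// Hypothetical helper (not from the slides): uses each MOV form once
// and returns input + 5.
ulong movForms(ulong input)
{
    ulong[2] slots = [input, 0];
    ulong* p = slots.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, p;         // MOV r64, r/m64: load the pointer stored in p
        mov RCX, [RAX];     // MOV r64, r/m64: load slots[0] from memory
        mov RDX, 5;         // MOV r64, imm:   load a constant
        add RCX, RDX;       // RCX = slots[0] + 5
        mov [RAX+8], RCX;   // MOV r/m64, r64: store into slots[1]
    }
    return slots[1];
}

void main()
{
    writeln(movForms(37));
}
```

Note that there is no memory-to-memory form. A value moving between two memory locations always passes through a register.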
Move Semantics
- Very Basic Addressing -
import std.stdio;

struct Test
{
    ulong l1 = 64;
    ulong l2 = 0;
}

void main()
{
    Test t;
    ulong* l1Ptr = &t.l1;
    writeln("l2 before:", t.l2);
    asm pure nothrow @nogc
    {
        mov RAX, l1Ptr;      // RAX = address of t.l1
        mov RBX, [RAX];      // RBX = value stored at that address (t.l1)
        mov [RAX+8], RBX;    // store into t.l2, which sits 8 bytes after t.l1
    }
    writeln("l2 after:", t.l2);
}
Compare Semantics
- Instructions -
- CMP = Compare (subtracts the second operand from the first and sets the flags; neither operand is changed)
- CMP r/m64, imm32
- CMP r/m64, r64
Jump Semantics
- Instructions -
- JMP = Unconditional jump to address
- Jcc = Conditional jump
- JE, JNE = Jump if =,!=
- JA, JAE (unsigned) or JG, JGE (signed) = Jump if >,>=
- JB, JBE (unsigned) or JL, JLE (signed) = Jump if <,<=
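The "above/below" jumps treat the compared operands as unsigned and the "greater/less" jumps treat them as signed, so the same CMP can lead to different branches. The sketch below shows this for the bit pattern of all ones, which is the largest unsigned value but -1 when read as signed. The helpers jaTaken and jgTaken are hypothetical, and the code assumes DMD-style inline assembly on x86-64.

```d
import std.stdio;

// Hypothetical helper (not from the slides): 1 if JA is taken after CMP a, b.
ulong jaTaken(ulong a, ulong b)
{
    ulong[3] io = [a, b, 0];
    ulong* p = io.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, p;
        mov RCX, [RAX];       // a
        mov RDX, [RAX+8];     // b
        mov RBX, 1;           // assume the jump is taken
        cmp RCX, RDX;         // set flags from a - b
        ja aboveDone;         // unsigned comparison: taken when a > b
        mov RBX, 0;           // fall-through: the jump was not taken
    aboveDone:
        mov [RAX+16], RBX;
    }
    return io[2];
}

// Hypothetical helper (not from the slides): 1 if JG is taken after CMP a, b.
ulong jgTaken(ulong a, ulong b)
{
    ulong[3] io = [a, b, 0];
    ulong* p = io.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, p;
        mov RCX, [RAX];       // a
        mov RDX, [RAX+8];     // b
        mov RBX, 1;           // assume the jump is taken
        cmp RCX, RDX;         // set flags from a - b
        jg greaterDone;       // signed comparison: taken when a > b
        mov RBX, 0;           // fall-through: the jump was not taken
    greaterDone:
        mov [RAX+16], RBX;
    }
    return io[2];
}

void main()
{
    // ulong.max has every bit set: huge as unsigned, -1 as signed.
    writeln("ja taken: ", jaTaken(ulong.max, 1));
    writeln("jg taken: ", jgTaken(ulong.max, 1));
}
```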
Loop Example
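import std.stdio;

void main()
{
    ulong t = 0;
    ulong* tPtr = &t;
    writeln("t before:", t);
    asm pure nothrow @nogc
    {
        mov RBX, tPtr;       // RBX = address of t
        mov RCX, [RBX];      // RCX = running total, starts at t
        mov RDX, 0;          // RDX = loop counter
    loop:
        add RCX, 10;         // total += 10
        inc RDX;             // counter += 1
        cmp RDX, 2;          // compare counter with 2
        jle loop;            // repeat while counter <= 2 (three passes in total)
        mov [RBX], RCX;      // write the total back to t (t is now 30)
    }
    writeln("t after:", t);
}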
Single Instruction, Multiple Data (SIMD)
- Instruction Sets
- Mnemonics
- Move Semantics
- Action Instructions
SIMD Instruction Sets
- MMX
- SSE
- SSE2
- SSE3
- SSSE3
- SSE4
- AVX
- AVX2
- AVX-512
Mnemonics (FP)
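- Single = S (32-bit float)
- Double = D (64-bit float)
- Scalar = S (lowest element only)
- Packed = P (every element in the register)
- Suffix order: P or S first, then S or D (ADDPD = Add Packed Double)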
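The difference between the scalar and packed suffixes is easiest to see by running the same addition both ways. In the sketch below, ADDSD (Scalar Double) changes only the low element while ADDPD (Packed Double) changes both. The helper names are hypothetical, and the code assumes DMD-style inline assembly on x86-64 with SSE2.

```d
import std.stdio;

// Hypothetical helper (not from the slides): adds only the low elements.
double[2] addScalarDouble(double[2] a, double[2] b)
{
    double* aPtr = a.ptr;
    double* bPtr = b.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, aPtr;
        mov RCX, bPtr;
        movupd XMM0, [RAX];
        movupd XMM1, [RCX];
        addsd XMM0, XMM1;     // Scalar Double: low element only, high element kept
        movupd [RAX], XMM0;
    }
    return a;
}

// Hypothetical helper (not from the slides): adds both elements.
double[2] addPackedDouble(double[2] a, double[2] b)
{
    double* aPtr = a.ptr;
    double* bPtr = b.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, aPtr;
        mov RCX, bPtr;
        movupd XMM0, [RAX];
        movupd XMM1, [RCX];
        addpd XMM0, XMM1;     // Packed Double: both elements added at once
        movupd [RAX], XMM0;
    }
    return a;
}

void main()
{
    writeln(addScalarDouble([1.0, 2.0], [10.0, 20.0]));
    writeln(addPackedDouble([1.0, 2.0], [10.0, 20.0]));
}
```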
Move Semantics
- Instructions -
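- MOVA* = Move Aligned
- MOVU* = Move Unaligned
- MOVL* = Move Low half
- MOVH* = Move High half
- MOVDDUP (SSE3) = Move and Duplicate Double
- SHUFP* = Shuffle Packed
- MOVLHPS = Move Low to High Packed Single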
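Two of the instructions above rearrange elements rather than simply copying them. The sketch below uses MOVDDUP to broadcast the low double into both halves and SHUFPD to swap the two halves. The helper names are hypothetical, and the code assumes DMD-style inline assembly on x86-64 with SSE3 available.

```d
import std.stdio;

// Hypothetical helper (not from the slides): [x, y] -> [x, x].
double[2] broadcastLow(double[2] v)
{
    double* vPtr = v.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, vPtr;
        movupd XMM0, [RAX];     // unaligned load of both doubles
        movddup XMM0, XMM0;     // copy the low double into the high half
        movupd [RAX], XMM0;
    }
    return v;
}

// Hypothetical helper (not from the slides): [x, y] -> [y, x].
double[2] swapHalves(double[2] v)
{
    double* vPtr = v.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, vPtr;
        movupd XMM0, [RAX];
        // Immediate bit 0 = 1 picks the high double of the first operand for
        // the low result; bit 1 = 0 picks the low double of the second operand
        // for the high result. With both operands equal this is a swap.
        shufpd XMM0, XMM0, 1;
        movupd [RAX], XMM0;
    }
    return v;
}

void main()
{
    writeln(broadcastLow([1.0, 2.0]));
    writeln(swapHalves([1.0, 2.0]));
}
```

MOVUPD is used for the loads and stores because a stack array is not guaranteed to be 16-byte aligned, which the MOVA* forms require.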
Action Semantics
- Instructions -
- ADD*
- SUB*
- MUL*
- DIV*
- SQRT*
- MAX*
- MIN*
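Several of these instructions chain naturally. As a minimal sketch, the code below takes the square root of two doubles with SQRTPD and then clamps each result to a range with MAXPD and MINPD. The helper sqrtThenClamp and its parameters are hypothetical, and the code assumes DMD-style inline assembly on x86-64 with SSE2.

```d
import std.stdio;

// Hypothetical helper (not from the slides):
// per element, result = min(max(sqrt(v), lo), hi).
double[2] sqrtThenClamp(double[2] v, double[2] lo, double[2] hi)
{
    double* vPtr = v.ptr;
    double* loPtr = lo.ptr;
    double* hiPtr = hi.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, vPtr;
        mov RCX, loPtr;
        mov RDX, hiPtr;
        movupd XMM0, [RAX];
        movupd XMM1, [RCX];
        movupd XMM2, [RDX];
        sqrtpd XMM0, XMM0;    // square root of both elements
        maxpd XMM0, XMM1;     // raise anything below lo up to lo
        minpd XMM0, XMM2;     // lower anything above hi down to hi
        movupd [RAX], XMM0;
    }
    return v;
}

void main()
{
    // sqrt gives [2, 10]; clamping to the range 3..5 follows.
    writeln(sqrtThenClamp([4.0, 100.0], [3.0, 3.0], [5.0, 5.0]));
}
```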
Examples
Confidence Adjustment
import std.stdio;

void main()
{
    double[2] values = [15.4, 36.7];
    const double[2] newValues = [40.0, 20.4];
    const double[2] confidences = [1.0, 0.5];
    double* valuesPtr = values.ptr;
    const(double)* newValuesPtr = newValues.ptr;
    const(double)* confidencesPtr = confidences.ptr;
    asm pure nothrow @nogc
    {
        mov RSI, valuesPtr;
        mov RDI, newValuesPtr;
        mov RAX, confidencesPtr;
        // Load confidences = c
        movupd XMM1, [RAX];
        // Load values = v0
        movupd XMM2, [RSI];
        // Load new values = v1
        movupd XMM0, [RDI];
        // result = (v0 * c) + v1 - (v1 * c) = (v0 * c) + (v1 * (1 - c))
        // v0 * c
        mulpd XMM2, XMM1;
        // (v0 * c) + v1
        addpd XMM2, XMM0;
        // v1 * c
        mulpd XMM0, XMM1;
        // (v0 * c) + v1 - (v1 * c)
        subpd XMM2, XMM0;
        // Store the result back into values
        movupd [RSI], XMM2;
    }
    writeln(values);
}
0-branching Div-by-zero Clear
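The code for this slide is not included here. One common way to get the effect named in the title is to divide unconditionally and then use a comparison mask to zero every lane whose divisor was zero, so no jump is needed. The sketch below is an assumption about the general technique, not the author's code. The helper safeDivide is hypothetical, CMPPD, ANDPD and XORPD are not listed on the earlier slides, and the code assumes DMD-style inline assembly on x86-64 with SSE2.

```d
import std.stdio;

// Hypothetical helper (not from the slides): per element, num / den,
// with the result forced to 0.0 wherever den is 0.0.
double[2] safeDivide(double[2] num, double[2] den)
{
    double* numPtr = num.ptr;
    double* denPtr = den.ptr;
    asm pure nothrow @nogc
    {
        mov RAX, numPtr;
        mov RCX, denPtr;
        movupd XMM0, [RAX];     // numerators
        movupd XMM1, [RCX];     // denominators
        xorpd XMM2, XMM2;       // XMM2 = [0.0, 0.0]
        cmppd XMM2, XMM1, 4;    // predicate 4 = not-equal: all ones where den != 0
        divpd XMM0, XMM1;       // divide every lane, including the zero ones
        andpd XMM0, XMM2;       // keep valid lanes, clear the others to 0.0
        movupd [RAX], XMM0;
    }
    return num;
}

void main()
{
    writeln(safeDivide([10.0, 7.0], [2.0, 0.0]));
}
```

The divide still runs on the zero lanes and produces infinity or NaN there; the final AND discards those bits. This relies on floating-point exceptions being masked, which is the usual default.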
Questions?
Intro to Basic x86-64 Assembly and SIMD
By Jonathan Crapuchettes