AArch64

ARM's 64-bit Architecture


by David Thomas — JANUARY 2014


(UNFINISHED)

Expected Audience

Engineers familiar with ARM

Introduction

ARM's new 64-bit architecture

(It's not that new: its existence has been public since ~2011)


Already available in Apple hardware
iPhone 5s, iPad Mini Retina, iPad Air

Soon to be available in Android hardware

The 64-bit ARM architecture is called AArch64, not ARM64
(Apple's tools go and call it 'arm64' anyway.)

The 32-bit ARM+Thumb architecture is retconned as AArch32
AArch32 + AArch64 = ARMv8

Register Set

  • 31 x 64-bit wide general-purpose registers
    • Accessible as x0..x30
  • Special registers
    • x30 - link register
    • Register number 31 - not a general register; depending on the instruction it reads as zero (xzr) or names the stack pointer (sp)
  • Registers are addressable as both 32-bit (w0..w30) and 64-bit (x0..x30)
    • 32-bit ops use the bottom half of the 64-bit registers
    • Writing a 32-bit register zeroes the top half
    • There are widening and sign-extending load ops etc.
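
A rough way to picture the 32-bit and 64-bit views of one register (named wN and xN in assembly), sketched in Python so it can be run anywhere. The helper names read_w and write_w are made up for this illustration; the behaviour modelled is that a 32-bit read sees the bottom half and a 32-bit write clears the top half.

```python
MASK32 = 0xFFFFFFFF

def read_w(x):
    """Value seen through the 32-bit view: the bottom half of xN."""
    return x & MASK32

def write_w(x_old, value):
    """New xN after a write to wN.

    x_old is deliberately unused: the old upper 32 bits are
    zeroed, not preserved.
    """
    return value & MASK32

x0 = 0x1122334455667788
print(hex(read_w(x0)))        # bottom half only
x0 = write_w(x0, 0xDEADBEEF)
print(hex(x0))                # top half is now zero
```

The point to take away: unlike a partial-register write on some other architectures, nothing from the old upper half survives a 32-bit write.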

Instruction Set

Substantial changes from ARM

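  • Conditional execution of arbitrary instructions is gone :-[
    • But it's for the best: branch predictors have largely obsoleted it
    • Conditional branches, selects (CSEL) and compares (CCMP) remain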
  • No multi-reg load/store ops
    • But you do get paired load/stores
  • New fused test-and-branch ops
    • CBZ, CBNZ, TBZ, TBNZ
  • Divide instructions
    • SDIV, UDIV
  • No equivalent of Thumb

Procedure Calling Standard

ARM define the AArch64 PCS, [link].

Of course, Apple have to have their own variant on it, [link].

Common Patterns

Forming Large Constants

  • (Like ARM) form constants by moving pieces into a register
    • e.g. To move 0xA1B2C3D4 into xN
movz xN, #0xA1B2, lsl #16 ; move and zero the other register bits (xN = 0x00000000A1B20000)
movk xN, #0xC3D4 ; move and keep the other register bits (xN = 0x00000000A1B2C3D4)

  • (Like ARM) only worthwhile for certain constants
    • Load from memory for complex constants
  • (Like ARM) can invert the sense of the move (MOVN) for more range

movn xN, #0xA1B2, lsl #16 ; move NOT (xN = 0xFFFFFFFF5E4DFFFF)
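
To check the values in the comments above, here is a small Python model of the three move-wide instructions. It is only a sketch: the function names mirror the mnemonics but are otherwise made up, and registers are modelled as plain 64-bit integers.

```python
MASK64 = (1 << 64) - 1

def movz(imm16, shift=0):
    """Move imm16 into place and zero every other bit."""
    return (imm16 << shift) & MASK64

def movk(reg, imm16, shift=0):
    """Replace one 16-bit field of reg, keeping the other bits."""
    cleared = reg & ~(0xFFFF << shift) & MASK64
    return cleared | (imm16 << shift)

def movn(imm16, shift=0):
    """Move the bitwise NOT of the shifted immediate."""
    return ~(imm16 << shift) & MASK64

xn = movz(0xA1B2, 16)       # movz xN, #0xA1B2, lsl #16
xn = movk(xn, 0xC3D4)       # movk xN, #0xC3D4
print(f"{xn:#018x}")        # 0x00000000a1b2c3d4

print(f"{movn(0xA1B2, 16):#018x}")  # 0xffffffff5e4dffff
```

A full 64-bit constant needs up to four of these (one movz plus three movk), which is why loading from memory wins for complex constants.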

Common Patterns

Function entry & exit

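Makes use of the load-store pair instructions:
stp x29, x30, [sp, #-16]! ; push frame pointer and link register (pre-indexed)
mov x29, sp               ; set up this function's frame pointer
<function body>
ldp x29, x30, [sp], #16   ; pop them again (post-indexed)
ret                       ; return to the address in x30
Register x29 here is the frame pointer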
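
The pre-indexed store and post-indexed load are what make this pair behave like push and pop. A minimal Python sketch of that, assuming a dictionary for memory and made-up helper names (stp_pre, ldp_post, call_frame_demo) and register values:

```python
def stp_pre(regs, mem, ra, rb, offset):
    """stp ra, rb, [sp, #offset]! : adjust sp first, then store the pair."""
    regs["sp"] += offset
    mem[regs["sp"]] = regs[ra]
    mem[regs["sp"] + 8] = regs[rb]

def ldp_post(regs, mem, ra, rb, offset):
    """ldp ra, rb, [sp], #offset : load the pair, then adjust sp."""
    regs[ra] = mem[regs["sp"]]
    regs[rb] = mem[regs["sp"] + 8]
    regs["sp"] += offset

def call_frame_demo():
    # Illustrative starting values only
    regs = {"sp": 0x1000, "x29": 0xAAAA, "x30": 0xBBBB}
    mem = {}
    stp_pre(regs, mem, "x29", "x30", -16)   # entry
    regs["x29"] = regs["sp"]                # mov x29, sp
    frame_pointer = regs["x29"]
    regs["x30"] = 0xCCCC                    # body clobbers x30, e.g. via bl
    ldp_post(regs, mem, "x29", "x30", 16)   # exit
    return regs, frame_pointer

regs, fp = call_frame_demo()
print(hex(fp), {name: hex(value) for name, value in regs.items()})
```

After the epilogue, sp, x29 and x30 are back to their entry values, whatever the body did to them.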

Common Patterns

Load 32-bit values as 64-bit

ldrsw x2, [x8] ; load x2 with the Signed Word at the address in x8, sign-extended to 64 bits
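
What the sign extension does to the loaded word, as a short Python sketch. The function names are invented for illustration; the zero-extending variant is included only for contrast with a plain 32-bit load.

```python
def sign_extend_32_to_64(word):
    """Model of the value ldrsw leaves in a 64-bit register."""
    word &= 0xFFFFFFFF
    if word & 0x80000000:            # bit 31 set: negative as a 32-bit int
        word |= 0xFFFFFFFF00000000   # copy the sign into the top half
    return word

def zero_extend_32_to_64(word):
    """Model of a plain 32-bit load: the top half becomes zero."""
    return word & 0xFFFFFFFF

minus_two = 0xFFFFFFFE               # -2 as a 32-bit word
print(hex(sign_extend_32_to_64(minus_two)))   # still -2 as a 64-bit value
print(hex(zero_extend_32_to_64(minus_two)))   # a large positive value
```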

Common Patterns

Sequence of compares

Sequences of conditional comparisons can be chained

cmp  x8, #0x0            ; flags = (x8 cmp 0x0)
ccmp x1, #0x0, #0x4, ne  ; if (ne) flags = (x1 cmp 0x0) else flags = 0x4
ccmp x20, #0x0, #0x4, ne ; if (ne) flags = (x20 cmp 0x0) else flags = 0x4
b.eq 0x1879c             ; if (eq) branch ...

Somewhat baroque...
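
It may help to trace just the Z flag through the chain. The Python sketch below does that, assuming only Z matters (true for the ne and eq conditions used here); the function names are made up. In the 4-bit flags immediate, 0x4 is the Z bit.

```python
def cmp_zero(x):
    """Z flag after `cmp x, #0`."""
    return x == 0

def ccmp_zero(z_in, x, nzcv=0x4):
    """Z flag after `ccmp x, #0, #nzcv, ne`."""
    if not z_in:               # condition ne holds: do the real compare
        return x == 0
    return bool(nzcv & 0x4)    # otherwise load the flags from the immediate

def branch_taken(x8, x1, x20):
    z = cmp_zero(x8)
    z = ccmp_zero(z, x1)
    z = ccmp_zero(z, x20)
    return z                   # b.eq is taken when Z is set

print(branch_taken(1, 2, 3))   # no register is zero
print(branch_taken(1, 0, 3))   # x1 is zero
```

So the sequence branches if any of x8, x1 or x20 is zero, i.e. a short-circuit OR with a single branch at the end.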

Alternative Syntaxes

The implicit xzr zero register can appear in some instructions' disassembly:
orr x9, xzr, #0x01

Were we writing this by hand we'd use:


mov x9, #0x01 

Ambiguous: could also be assembled as:

movz x9, #0x01 

Aliases:
movz x2, #0xe0 
...ought to be disassembled as
mov x2, #0xe0 
