CPSC 355: Tutorial 4

Control Flow

PhD Student

Fall 2017

Outline

  • More ARMv8 instructions
  • Control flow (if else, while loops)
  • Macro preprocessor

ARMv8

Registers keep track of results and store data:

  • 31 general purpose 64 bit registers x0-x30 (use w0-w30 for the lower 32 bits of a register).
  • 64 bit stack pointer sp (and wsp)
  • 64 bit zero register xzr (and wzr)
  • 64 bit pc (program counter), which holds the address of the current instruction
  • x0-x7 are used to pass variables to a function and return results
  • x9-x15 are temporary registers
  • x19 - x28 are callee-saved registers
  • x29 is the frame pointer, x30 is the link register (keeps track of return addresses)

Only rely on x19-x28 being preserved across a function call; the called function may overwrite the rest.

ARMv8

LDR: Load register

LDR     xd, =label            // Load address of label (pseudo-instruction)

         xd = address of label;

LDR     xd, [xn]               // Read value at memory

         xd = *xn;  

 

ARMv8

STR: Store register

 

STR     xn, [xd]               // Write value to memory

         *xd =  xn;  
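
In C terms, LDR and STR behave like reading and writing through a pointer. The sketch below is only an analogy for the two memory forms shown above; the function names load64 and store64 are hypothetical and are not part of the tutorial or of any library.

```c
#include <stdint.h>

/* Analogy for:  ldr xd, [xn]
 * Read the 64-bit value stored at the address held in xn. */
int64_t load64(const int64_t *xn) {
    return *xn;
}

/* Analogy for:  str xn, [xd]
 * Write the 64-bit value xn to the address held in xd. */
void store64(int64_t xn, int64_t *xd) {
    *xd = xn;
}
```

For example, calling store64(42, &cell) and then load64(&cell) reads back the value that was just written.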

 

ARMv8

CMP: Compares two operands, sets flags depending on the result

cmp     xn, xm

         sets the condition flags (N, Z, C, V) based on the comparison between xn and xm

 

ARMv8

B.cc: Conditional branch, taken or not depending on the condition flags

b.cc     label

          jumps to label if the currently set condition flags satisfy the condition cc

Name Meaning
b.eq branch if equal
b.ne branch if not equal
b.gt branch greater than
b.ge branch greater than or equal
b.lt branch less than
b.le branch less than or equal

Use cmp before this instruction to set flags
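
The table above can be read as a set of signed comparisons between the two operands of the preceding cmp. The following C sketch models that reading; branch_taken is a hypothetical helper written for illustration, and it assumes both operands are signed 64-bit values.

```c
#include <string.h>

/* Hypothetical model of "cmp xn, xm" followed by "b.<cc> label".
 * Returns 1 if the branch would be taken, 0 if execution falls through.
 * The comparisons are signed, matching gt/ge/lt/le. */
int branch_taken(const char *cc, long xn, long xm) {
    if (strcmp(cc, "eq") == 0) return xn == xm;
    if (strcmp(cc, "ne") == 0) return xn != xm;
    if (strcmp(cc, "gt") == 0) return xn >  xm;
    if (strcmp(cc, "ge") == 0) return xn >= xm;
    if (strcmp(cc, "lt") == 0) return xn <  xm;
    if (strcmp(cc, "le") == 0) return xn <= xm;
    return 0; /* condition not in the table: treated as never taken */
}
```

With this model, branch_taken("lt", 19, 20) is 1, so a loop test of the form "cmp x19, 20" then "b.lt loop_top" would jump back to the top.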

Example

// File: scanfexample.s
// Author: Joshua Horacsek
// Date: September 27th, 2017
//
// Description: 
// Demonstrates using scanf, and the ldr pseudo-instruction
input_str:       .string "%d"
output_str:      .string "Your integer: %d\n"
                 .balign 4                 // Makes sure the following instruction's address is 
                                           // divisible by 4, i.e. aligned to the word length of
                                           // the machine
                 .global main              // Ensure that the "main" label is visible to the linker
main:
                 stp  x29, x30, [sp, -16]! // Save FP and LR to the stack
                 mov  x29, sp              // Set FP to the stack addr

                 ldr  x0, =input_str       // Set arg1 to the address of input_str
                 ldr  x1, =temp            // set arg2 to the address of temp
                 bl   scanf                // scanf("%d", &temp);
                 ldr  x1, =temp            // x1 = &temp;
                 ldrsw x19, [x1]           // x19 = *x1; (32-bit load, sign-extended, since temp is a .word)

                 ldr  x0, =output_str      // Set arg1 to the address of output_str
                 mov  x1, x19              // Set arg2 to x19
                 bl printf                 // Call printf("Your integer: %d\n", x19)

                 ldp x29, x30, [sp], 16    // Restore the stack
                 ret                       // return to OS

.data
temp:            .word   1

ARMv8 Control Flow

Obviously we need to control the flow of execution in order to make more sophisticated programs

  • Loops (pre-test and post-test)
  • If construct
  • If-else construct

ARMv8 Loops

Pre-test loops (in C):

int i = 0; 
while(i < 20) {
    printf("i = %d\n", i);
    i++;
}

Pre-test loops (in assembly):

          mov x19, 0              // Initialize variable
          b loop_test             // branch to the test
loop_top:

          ldr  x0, =printf_string // "i = %d\n"
          mov  x1, x19
          bl   printf             // printf("i = %d\n", x19)

          add  x19, x19, 1        // i++
loop_test:
          cmp  x19, 20            // i < 20?
          b.lt loop_top           // if so, loop

ARMv8 Loops

Pre-test loops (in C):

For loops are also "pre-test" loops. These two loops do exactly the same thing:

int i = 0; 
while(i < 20) {
    printf("i = %d\n", i);
    i++;
}

for(int i = 0; i < 20; i++) {
    printf("i = %d\n", i);
}
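
One way to check the claim that the while and for forms are equivalent is to give both the same body and compare results. This is a minimal sketch; sum_with_while and sum_with_for are hypothetical names, and the body adds i to a total instead of printing so the result is easy to compare.

```c
/* Pre-test loop written with while: the test runs before every iteration. */
int sum_with_while(int limit) {
    int total = 0;
    int i = 0;
    while (i < limit) {
        total += i;
        i++;
    }
    return total;
}

/* The same loop written with for: init, test and increment on one line. */
int sum_with_for(int limit) {
    int total = 0;
    for (int i = 0; i < limit; i++) {
        total += i;
    }
    return total;
}
```

Both functions return the same value for any limit, including limit <= 0, where neither body runs at all.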

ARMv8 Loops

Post-test loops (in C):

int i = 0; 
do {
    printf("i = %d\n", i);
    i++;
} while(i <= 20);

Post-test loops (in ASM):

          mov x19, 0              // Initialize variable
loop_top:
          ldr  x0, =printf_string // "i = %d\n"
          mov  x1, x19
          bl   printf             // printf("i = %d\n", x19)

          add  x19, x19, 1        // i++
loop_test:
          cmp  x19, 20            // i <= 20?
          b.le loop_top           // if so, loop

Dropping the initial branch to loop_test saves one instruction compared with the pre-test version.
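
The practical difference between the two loop shapes is that a post-test loop always runs its body at least once. The sketch below counts iterations for each shape; count_pretest and count_posttest are hypothetical helper names, and both use the same "i < limit" test so only the position of the test differs.

```c
/* Pre-test: the condition is checked before the first iteration. */
int count_pretest(int start, int limit) {
    int iterations = 0;
    int i = start;
    while (i < limit) {
        iterations++;
        i++;
    }
    return iterations;
}

/* Post-test: the body runs once before the condition is first checked. */
int count_posttest(int start, int limit) {
    int iterations = 0;
    int i = start;
    do {
        iterations++;
        i++;
    } while (i < limit);
    return iterations;
}
```

Starting at 0 with a limit of 20, both run 20 times. Starting at 20, the pre-test loop runs zero times while the post-test loop still runs once.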

ARMv8

The if construct:

if(a > b) {
   result = a - b;
}

Equivalent ASM:

          cmp  x19, x20            // x19 = a, x20 = b
          b.le next                // skip the body if a <= b

          subs x21, x19, x20       // x21 = result = a - b
next:     // more instructions ..

ARMv8

The if-else construct:

if(a > b) {
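   result = a - b;
} else {
   result = b - a;
}

Equivalent ASM:

          cmp  x19, x20            // x19 = a, x20 = b
          b.le else                // a <= b: take the else part

          subs x21, x19, x20       // x21 = result = a - b
          b    next                // skip over the else part
else:
          subs x21, x20, x19       // x21 = result = b - a
next:     // more instructions ..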
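
To see how the branches line up with the C source, the if-else can be rewritten in C using goto, one statement per assembly instruction. This is a sketch for illustration only; abs_diff and the label else_part are hypothetical names, and the test is inverted (a <= b) in the same way b.le inverts a > b.

```c
/* if-else written in the shape of the assembly: an inverted test that
 * jumps to the else part, and an unconditional jump past it. */
long abs_diff(long a, long b) {
    long result;

    if (a <= b) goto else_part;  /* cmp x19, x20 ; b.le else */
    result = a - b;              /* subs x21, x19, x20       */
    goto next;                   /* b next                   */
else_part:
    result = b - a;              /* subs x21, x20, x19       */
next:
    return result;
}
```

Whichever operand is larger, exactly one of the two subtractions executes, so the result is never negative for these inputs.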

M4 Preprocessor

m4 is a macro preprocessor, similar to the C preprocessor (in C, lines that begin with # are preprocessor directives)

Invoke it on the command line via:

m4 myfile.asm > myfile.s
gcc myfile.s -o myfile

It lets us define aliases for registers with define. For example:

define(my_var_r, x19)

M4 Preprocessor

// File: scanfexample.asm
// Author: Joshua Horacsek
// Date: September 27th, 2017
//
// Description: 
// Demonstrates using scanf, and the ldr pseudo-instruction

// Define aliases to some registers
define(input_r, x19)

input_str:       .string "%d"
output_str:      .string "Your integer: %d\n"
                 .balign 4                 // Makes sure the following instruction's address is 
                                           // divisible by 4, i.e. aligned to the word length of
                                           // the machine
                 .global main              // Ensure that the "main" label is visible to the linker
main:
                 stp  x29, x30, [sp, -16]! // Save FP and LR to the stack
                 mov  x29, sp              // Set FP to the stack addr

                 ldr  x0, =input_str       // Set arg1 to the address of input_str
                 ldr  x1, =temp            // set arg2 to the address of temp
                 bl   scanf                // scanf("%d", &temp);
                 ldr  x1, =temp            // x1 = &temp;
                 ldrsw input_r, [x1]       // input_r = *x1; (32-bit load, sign-extended, since temp is a .word)

                 ldr  x0, =output_str      // Set arg1 to the address of output_str
                 mov  x1, input_r          // Set arg2 to input_r
                 bl printf                 // Call printf("Your integer: %d\n", input_r)

                 ldp x29, x30, [sp], 16    // Restore the stack
                 ret                       // return to OS

.data
temp:            .word   1

ARMv8

A more complicated example:

#include<stdio.h>

int main(int argc, char *argv[]) {
    int number = 0, exponent = 0, result = 0;

    puts("Enter a number:");
    scanf("%d", &number);

    puts("Enter an exponent:");
    scanf("%d", &exponent);

    if(exponent <= 0) {
        puts("Invalid exponent");
    } else {
        result = number;
        for(int i = 1; i < exponent; i++) {
            result *= number;
        }
        printf("%d^%d = %d\n", number, exponent, result);
    }
}
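
The loop at the heart of this program can be pulled out into a function to check it on its own. This is a minimal sketch, not part of the original program: int_pow is a hypothetical name, it uses long to match the 64-bit registers of the assembly version, and it assumes exponent >= 1 (the program rejects other values before the loop runs).

```c
/* Repeated multiplication, as in the for loop above.
 * Assumes exponent >= 1; overflow is not checked. */
long int_pow(long number, long exponent) {
    long result = number;                /* result = number            */
    for (long i = 1; i < exponent; i++) {
        result *= number;                /* runs exponent - 1 times    */
    }
    return result;
}
```

Because result starts at number and the loop starts at i = 1, an exponent of 1 skips the loop and returns number unchanged.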

ARMv8

Assembly version

// File: expexample.asm
// Author: Joshua Horacsek
// Date: September 27th, 2017
//
// Description: 
// Takes a number and an exponent, and returns the result
// of taking the number and raising it to the exponent

define(number_r,   x19)
define(exponent_r, x20)
define(result_r,   x21)
define(i_r,        x22)
.data           
temp:           .dword 1                  // 8 bytes, because scanf("%ld") stores a 64-bit value

.text
ent_num_str:    .string "Enter a number:"
ent_exp_str:    .string "Enter an exponent:"
inv_exp_str:    .string "Invalid exponent"
output_str:     .string "%ld^%ld=%ld\n"
scan_str:       .string "%ld"

                .balign 4
                .global main
main:

                stp  x29, x30, [sp, -16]! // Save FP and LR to the stack
                mov  x29, sp              // Set FP to the stack addr

                mov  number_r, 0          // Initialize registers to known values
                mov  exponent_r, 0
                mov  result_r, 0

ARMv8

Continued...


                ldr  x0, =ent_num_str     // Set arg 1 as ent_num_str
                bl   puts                 // puts(x0);

                ldr  x0, =scan_str        // Set arg 1 to addr of scan str
                ldr  x1, =temp            // Set arg 2 to addr of temp
                bl   scanf                // scanf("%ld", &temp);
                
                ldr  x1, =temp            
                ldr  number_r, [x1]       // number_r = *temp
                
                ldr  x0, =ent_exp_str     // Set arg 1 to addr of ent_exp_str
                bl   puts                 // puts(ent_exp_str);                

                ldr  x0, =scan_str        // Set arg 1 to addr of scan str
                ldr  x1, =temp            // Set arg 2 to addr of temp
                bl   scanf                // scanf("%ld", &temp);
                
                ldr  x1, =temp
                ldr  exponent_r, [x1]     // exponent_r = *temp
                
                cmp  exponent_r, xzr      // compare the exponent to the zero reg
                b.gt else                 // if exponent_r > 0, skip the error message

                ldr  x0, =inv_exp_str      
                bl   puts                 // puts(inv_exp_str)
                b    exit                 // Skip to the end
else: 
                mov  i_r, 1               // i_r = 1
                mov  result_r, number_r   // result_r = number_r


ARMv8

Continued...

                b loop_test                        // Do the loop test before all else
loop_top:
                mul result_r, result_r, number_r   // result_r *= number_r
                add i_r, i_r, 1                    // i_r++

loop_test:      cmp i_r, exponent_r                // compare i_r, exponent_r
                b.lt loop_top                      // if i_r < exponent_r, go back to the top of the loop
                
                ldr x0, =output_str                // Arg 1 = addr of output_str
                mov x1, number_r                   // Arg 2 = number_r
                mov x2, exponent_r                 // Arg 3 = exponent_r
                mov x3, result_r                   // Arg 4 = result_r
                bl  printf                         

exit:           ldp x29, x30, [sp], 16             // Restore the stack
                ret                                // return to OS

To compile

m4 expexample.asm > expexample.s
gcc expexample.s -o expexample

Next Day

Work period, bring your questions.

CPSC 355: Tutorial 4

By Joshua Horacsek
