When writing assembly code, nearly all instructions deal with the register file. The register file is the most direct form of memory available on a computer. It is small and fast, and can be accessed multiple times in one clock cycle.
There are times when we need to access main memory, however. We have very few registers available, so we often need to use main memory just to store all of our variables. Also, we must sometimes use memory to save register values so that they are not overwritten.
Load instructions read a value from main memory into a register. There are different load instructions for different sizes:
Instruction | Meaning |
ldr | Load a word (32 bits). |
ldrb | Load a byte (8 bits). |
ldrh | Load a half word (16 bits). |
ldrsb | Load a signed byte (8 bits). |
ldrsh | Load a signed half word (16 bits). |
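All of these instructions load values into a register. Registers are always 32 bits wide. The difference between signed and unsigned loads is how they extend the smaller values: ldrb and ldrh fill the upper bits of the register with zeros, while ldrsb and ldrsh fill them with copies of the value's sign bit.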
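To make the two kinds of extension concrete, here is a small C sketch. The function names load_unsigned_byte and load_signed_byte are hypothetical stand-ins for ldrb and ldrsb; they only model what ends up in the 32-bit register.

```c
#include <stdint.h>

/* Hypothetical model of ldrb: the upper 24 bits become zero. */
uint32_t load_unsigned_byte(const uint8_t *addr) {
    return (uint32_t)*addr;
}

/* Hypothetical model of ldrsb: the upper 24 bits copy bit 7 of the byte. */
int32_t load_signed_byte(const uint8_t *addr) {
    int32_t value = *addr;      /* 0 .. 255 */
    if (value & 0x80) {
        value -= 256;           /* sign bit set, so the value is negative */
    }
    return value;
}
```

With the byte 0xFF in memory, the first function gives 255 and the second gives -1, even though the same bits were read.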
These load instructions take the destination register as the first operand, then they take the address as the second operand. The first part of the address is a register containing the base address. For instance, to load from an address in r0 into r1, we could use the instruction:
ldr r1, [r0]
This is equivalent to the following in C:
r1 = *r0;
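We can also include an offset after the base. This offset can be either a register or an immediate value, and it is always measured in bytes.
For instance we could use the following instructions:
ldr r1, [r0, #4]
ldr r1, [r0, r2]
If r0 is treated as a pointer to 32-bit words, and r2 holds a byte offset that is a multiple of 4, these are equivalent to the following C lines:
r1 = r0[1];
r1 = r0[r2 / 4];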
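One detail worth keeping in mind is that a load's offset counts bytes, whereas C array indexing scales the index by the element size. The sketch below models this with a hypothetical helper, load_word, which is not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "ldr rd, [base, offset]".
   The offset is added to the base address as a plain byte count. */
uint32_t load_word(const void *base, uint32_t byte_offset) {
    uint32_t value;
    memcpy(&value, (const uint8_t *)base + byte_offset, sizeof value);
    return value;
}
```

For an array of 32-bit words, load_word(words, 4) reads words[1] and load_word(words, 8) reads words[2].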
One place where we need to use main memory is with arrays. Arrays are always passed between functions by their address. If we pass an array such as a string as the first argument to a function, then its address will be passed in r0.
The following function implements the strlen function which takes a string, and counts the number of characters before the NULL terminator:
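@ strlen.s
/* assembly version of strlen */
.global mystrlen
mystrlen:
    mov r1, #0          @ r1 = count, starting at 0
top:
    ldrb r2, [r0, r1]   @ r2 = byte at (string address + count)
    cmp r2, #0          @ is it the NULL terminator?
    beq done            @ if so, stop counting
    add r1, r1, #1      @ otherwise count this character
    b top
done:
    mov r0, r1          @ return the count in r0
    mov pc, lr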
The full example can be seen here.
This works by keeping track of the count (r1) and using this to index the string (stored at r0). We load the current value and compare it to 0. When it is equal, we return. Otherwise, we increment the index.
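For comparison, the same loop can be written in C. This is only a sketch: the name mystrlen_c is made up here, and the variables are named after the registers so each line can be matched to an instruction.

```c
/* C rendering of the assembly loop; names mirror the registers used. */
int mystrlen_c(const char *r0) {
    int r1 = 0;                                     /* mov r1, #0 */
    for (;;) {
        unsigned char r2 = (unsigned char)r0[r1];   /* ldrb r2, [r0, r1] */
        if (r2 == 0) {                              /* cmp r2, #0 / beq done */
            break;
        }
        r1 = r1 + 1;                                /* add r1, r1, #1 */
    }
    return r1;                                      /* mov r0, r1 */
}
```

Calling mystrlen_c("hello") returns 5, and an empty string gives 0.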
To write to memory, we can use a store instruction. These work similarly to the load instructions. To store a word from r0 into the memory address stored in r1, we could use:
str r0, [r1]
This does the same thing as the C code:
*r1 = r0;
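We can also use an offset with the store instruction. The offset is measured in bytes:
str r0, [r1, #40]
str r0, [r1, r2]
If r1 is treated as a pointer to 32-bit words, and r2 holds a byte offset that is a multiple of 4, the equivalent C code is:
r1[10] = r0;
r1[r2 / 4] = r0;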
We also have different store instructions for different data sizes:
Instruction | Meaning |
str | Store a word (32 bits). |
strb | Store a byte (8 bits). |
strh | Store a half word (16 bits). |
Because a store simply copies the low 8, 16, or 32 bits of the register into memory, nothing needs to be extended, so we don't have to worry about signed vs. unsigned.
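A short C sketch can show why no signed variants are needed. The helpers store_byte and store_half are hypothetical models of strb and strh, and the byte order in store_half assumes a little-endian machine such as the GBA.

```c
#include <stdint.h>

/* Hypothetical model of strb: only the low 8 bits of the register are written. */
void store_byte(uint8_t *addr, uint32_t reg) {
    *addr = (uint8_t)(reg & 0xFF);
}

/* Hypothetical model of strh: only the low 16 bits are written,
   low byte first (little-endian assumption). */
void store_half(uint8_t *addr, uint32_t reg) {
    addr[0] = (uint8_t)(reg & 0xFF);
    addr[1] = (uint8_t)((reg >> 8) & 0xFF);
}
```

Storing a register that holds -1 (0xFFFFFFFF) and one that holds 255 (0x000000FF) with store_byte leaves the same byte, 0xFF, in memory.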
When we declare local variables in a C function, they must be stored some place. They can't be stored globally, because then we could not use recursion. The variables must be created in a new space each time a function is called.
To support this, computer systems have a special section of memory called the stack. Typically, when a function is called, it creates some space on the stack for its arguments and local variables.
The reason that it is called the stack is because it exhibits the same "Last In First Out" behavior as a stack of objects. Consider the following code:
#include <stdio.h>

void f(int x) {
    printf("%d", x);
}

void g(int x) {
    f(x + 1);
}

void h(int x) {
    g(x * 2);
}

int main( ) {
    h(7);
    return 0;
}
When this code runs, execution starts in main, then goes to h, then g, then f. When the functions begin to return, the chain of execution goes back from f to g, then to h, and finally back to main, where the program ends.
When a program runs, the stack is maintained to keep track of which function we are in. The block for each function is called a "stack frame" or "activation record" and contains information about the function like the values of the parameters, the local variables, and where the function is supposed to return to.
Typically when functions are called, they create space on the stack for their data. When they return, they remove that space. This supports recursion because, if a function calls itself, each call will create its own stack frame.
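As a small illustration of why separate frames matter, consider the recursive function below. The name sum_to is made up for this sketch. Each active call has its own copy of n and partial, so an inner call cannot disturb the values an outer call is still waiting to use.

```c
/* Adds the integers 1 through n recursively (n must not be negative).
   Every call gets a fresh n and partial in its own stack frame. */
int sum_to(int n) {
    int partial;
    if (n == 0) {
        return 0;
    }
    partial = sum_to(n - 1);    /* the inner call uses a new frame */
    return partial + n;         /* this call's n is still intact */
}
```

Calling sum_to(4) creates five frames at its deepest point, one for each value of n from 4 down to 0, and returns 10.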
The compiler inserts code to do this automatically when you compile C functions. In assembly we will have to deal with it ourselves!
Nearly all computer systems have a stack. On the GBA, the stack is located inside of IWRAM. It begins at address 0x03007F00. The sp register (r13) is a pointer to the top of the stack. On the GBA, the stack grows down, from higher addresses to lower addresses. On some computer systems, it goes the other way.
In addition to local variables, we often need to store registers on the stack in order to protect them. Recall that, according to the register convention of ARM, we are only allowed to overwrite registers r0 through r3. Registers r4 through r11 have to be preserved across function calls.
If we overwrite these registers, there is no telling what will happen. The following function overwrites all of them. If we call this function from main, the program will not work at all:
@ function.s
/* assembly function which uses r4 - r11
   this will break everything */
.global function1
function1:
    @ trash all the locals
    mov r4, #4
    mov r5, #5
    mov r6, #6
    mov r7, #7
    mov r8, #8
    mov r9, #9
    mov r10, #10
    mov r11, #11
    @ return 42
    mov r0, #42
    mov pc, lr
If we do want to use these registers, we'll need to first save their initial values, then use them, and then restore the values again.
To do that, we first must make space on the stack. Because the stack grows toward lower addresses, we have to subtract the number of bytes we need from the stack pointer.
We then store the registers onto the stack using str instructions at the start of the function. When the function is done, we load the values back into the registers using ldr, and add the same number of bytes back to the stack pointer.
.global function2
function2:
    @ make space on the stack
    sub sp, sp, #32
    @ push all the regs
    str r4, [sp, #0]
    str r5, [sp, #4]
    str r6, [sp, #8]
    str r7, [sp, #12]
    str r8, [sp, #16]
    str r9, [sp, #20]
    str r10, [sp, #24]
    str r11, [sp, #28]
    @ use all the locals
    mov r4, #4
    mov r5, #5
    mov r6, #6
    mov r7, #7
    mov r8, #8
    mov r9, #9
    mov r10, #10
    mov r11, #11
    @ restore all regs
    ldr r4, [sp, #0]
    ldr r5, [sp, #4]
    ldr r6, [sp, #8]
    ldr r7, [sp, #12]
    ldr r8, [sp, #16]
    ldr r9, [sp, #20]
    ldr r10, [sp, #24]
    ldr r11, [sp, #28]
    @ clean up stack
    add sp, sp, #32
    @ return 42
    mov r0, #42
    mov pc, lr
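The save and restore pattern can also be modeled in C to check the bookkeeping. Everything in this sketch is an assumption made for illustration: the Machine struct, machine_init, and function2_model are hypothetical, and the small array stands in for a region of IWRAM rather than the real address range.

```c
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS];  /* stand-in for stack memory */
    uint32_t sp;                /* byte offset into mem */
    uint32_t r[16];             /* register file r0 .. r15 */
} Machine;

/* Start with an empty stack: sp sits at the high end of the region. */
void machine_init(Machine *m) {
    memset(m, 0, sizeof *m);
    m->sp = STACK_WORDS * 4;
}

/* Follows the same steps as the assembly function2. */
uint32_t function2_model(Machine *m) {
    int i;
    m->sp -= 32;                                        /* sub sp, sp, #32 */
    for (i = 4; i <= 11; i++) {
        m->mem[(m->sp + (i - 4) * 4) / 4] = m->r[i];    /* str rN, [sp, #off] */
    }
    for (i = 4; i <= 11; i++) {
        m->r[i] = (uint32_t)i;                          /* mov rN, #N */
    }
    for (i = 4; i <= 11; i++) {
        m->r[i] = m->mem[(m->sp + (i - 4) * 4) / 4];    /* ldr rN, [sp, #off] */
    }
    m->sp += 32;                                        /* add sp, sp, #32 */
    m->r[0] = 42;                                       /* mov r0, #42 */
    return m->r[0];
}
```

After function2_model returns, r4 through r11 and sp hold exactly the values they had before the call, which is what the caller relies on.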
To call a function in assembly, we can use the bl instruction, which stands for "branch and link". Like any branch instruction, it jumps some place else in the code. "Link" means that the address of the instruction immediately after the bl is put into lr, so the called function knows where to return.
The issue is that this will overwrite the existing value of lr! If we don't save its initial value, we will never be able to return:
@ calls.s
/* a function which multiplies two numbers */
.global multiply
multiply:
    mul r0, r1, r0
    mov pc, lr
/* a function which squares its argument */
.global square
square:
    @ call multiply
    mov r1, r0
    bl multiply         @ overwrites lr with the address of the next line
    mov pc, lr          @ lr no longer holds square's return address
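This "square" function attempts to call "multiply", but by doing so it overwrites lr. multiply can return back to square, but how can square ever return? After the bl, lr holds the address of square's own final instruction, so that instruction just jumps to itself forever.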
To fix this, we need to save the initial value, then restore it before returning:
@ calls.s
/* a function which multiplies two numbers */
.global multiply
multiply:
    mul r0, r1, r0
    mov pc, lr
/* a function which squares its argument */
.global square
square:
    @ save lr
    sub sp, sp, #4
    str lr, [sp]
    @ call multiply
    mov r1, r0
    bl multiply
    @ return
    ldr lr, [sp]
    add sp, sp, #4
    mov pc, lr
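The effect of saving lr can be traced with a deliberately tiny C model. This is only a sketch: lr is represented by a plain variable, the two enum values are made-up labels for the places a return could land, and the function names are hypothetical.

```c
/* Made-up labels for where "mov pc, lr" would land. */
enum { RETURN_TO_CALLER = 1, RETURN_INSIDE_SQUARE = 2 };

/* Value of lr when square reaches its final "mov pc, lr", with no save. */
int lr_without_save(void) {
    int lr = RETURN_TO_CALLER;      /* set when square was called */
    lr = RETURN_INSIDE_SQUARE;      /* bl multiply overwrites lr */
    return lr;
}

/* Same trace, but with lr saved on the stack around the call. */
int lr_with_save(void) {
    int stack_slot;
    int lr = RETURN_TO_CALLER;      /* set when square was called */
    stack_slot = lr;                /* str lr, [sp] */
    lr = RETURN_INSIDE_SQUARE;      /* bl multiply overwrites lr */
    lr = stack_slot;                /* ldr lr, [sp] */
    return lr;
}
```

Without the save, the final value points back inside square. With the save, it points to square's caller again.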
The stack is essentially a function's temporary work space. Anything it needs to save can be put on the stack including local variables, registers needing to be preserved, and the return address.
Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.