Table of contents

  1. Introduction
    1. Recap: Functions in C
  2. First steps towards common calling conventions
  3. Memory: the stack
    1. Understanding the stack
    2. Manipulating the stack in RISC-V
    3. Excursion: The stack grows downwards!?
  4. Using the stack for function calls
  5. Summary: Complete calling conventions
    1. Exercise 1
    2. Exercise 2
  6. Additional exercises
    1. Exercise 3
    2. Exercise 4

Introduction

In the last sessions, you have already gained some experience with assembly code in RISC-V and, by now, also have a basic understanding of C code. In this session, we will dive deeper into what difficulties arise when writing larger programs, how we can extend our registers by using memory, and how this helps us to write code elements that resemble functions as you know them from higher-level languages.

Recap: Functions in C

Let’s first take a look at functions in C. Below you see an example program with a helper function that calculates a simple sum.

sum.c Console
// Include I/O library functions (printf, scanf)
# include <stdio.h>

// This function allows us to calculate the sum of two numbers.
// With this, we can reuse the sum calculation at different
// locations of our code as an abstract operation.
int sum(int a, int b) {
    return a + b;
}

// Main function where the program starts
int main(void) {
    int n;
    printf("Enter a number:\n");
    scanf("%d", &n);
    printf("Result: n+2=%d\n", sum(n, 2));
    return 0;
}
$ gcc sum.c -o sum
$ ./sum
Enter a number:
5
Result: n+2=7

You already know functions from other languages than C, they are very useful to bundle common tasks or make the program simpler to think about. In essence, functions allow for abstraction via parameters and return values.

But how can we do this in assembly? There is almost no concept of a function in assembly. If languages like C are compiled to assembly code, how do they transform their functions to assembly code then? The answer is simple and slightly difficult at the same time:

  1. Functions can be implemented using common low-level primitives such as labels, jumps, and registers. Doing this is simple, and you may already have done so intuitively during the last sessions.
  2. However, making sure that you can always and deterministically reuse or call a function requires some effort. You may already have experienced that different developers use different registers to do the same thing. What if you use register t0 to pass a parameter to a function, but the other developer expects the parameter in t1? What happens if we run out of registers? These and other situations require for a clear set of (calling) conventions that we expect everyone to adhere to - even (or especially) compilers! This is the slightly difficult part of functions in assembly.

First steps towards common calling conventions

:bulb: TL;DR: The official calling conventions are here on the official RISC-V website. You can also find the short version on the RISC-V card that you will have access to during the tests.

If the above explanation did not fully click for you yet, don’t worry. Let’s go step by step and understand why conventions are useful and help us write code that can be used across programs and teams. Let’s take a look at the partial assembly code below. We already created a main and a sum label:

.globl main
.data
    a: .word 1
    b: .word 2
    number: .word 0

.text
main:
    # Load the two numbers a and b into registers

sum:
    # Add the two numbers and put them in a register for main to find.

resume:
    la t6, number
    sw a0, 0(t6)

:pencil: The instruction sw a0, 0(t6) means the following: take the memory address contained in the register t6, add 0 to it, and store the value from a0 at that memory location. You can of course substitute 0 for other integer values, adding zero is equivalent to writing sw a0, (t6).

This is useful if you want to access variables that are located next to each other in memory, and especially for iterating over arrays. Using this technique, you can iterate over a fixed-length array without having to change any registers. If the address of the start of the (integer) array is stored in t0, you can overwrite different elements of the array by changing this offset:

li t3, 5  # we want to overwrite the first three elements of the array with 5
sw t3, 0(t0)  # array[0] = 5
sw t3, 4(t0)  # array[1] = 5
sw t3, 8(t0)  # array[2] = 5

Here we change the offset by 4 bytes each time because integers take up 4 bytes in memory.

:fire: Warm-up 1: Extend the program above to sum the two numbers a and b and store it in number. You don’t need to perform function calls just yet, you can ignore the labels for now.

Solution

.globl main
.data
    a: .word 1
    b: .word 2
    number:  .word 0

.text
main:
    # Load the two numbers a and b into registers
    lw t0, a
    lw t1, b

sum:
    # Add the two numbers and put them in a register for main to find.
    add a0, t0, t1

resume:
    la t6, number
    sw a0, 0(t6)

Problems that arise when using functions in assembly code

Which registers did you use in the main to load a and b? What would happen to your program if the code in sum used different registers? It would no longer work. This is why the calling conventions of RISC-V assign different roles for different registers:

  • Some registers are to be used for input parameters and return values of functions
  • Callee-saved registers are to be restored by the called function at the end of its execution (if it had changed their values). In other words, the code that called the function can safely treat these registers as if their value was unchanged by the function call.
  • Caller-saved registers can be overwritten by the called function. In other words, the code that called the function should assume that these registers will be overwritten by the called function (and should be saved elsewhere if they are needed).
Register number Register name Use Note
x0 zero Always zero -
x1 - x4 ra, sp, gp, tp Various uses Partially explained below
x5 - x7 and x28-x31 t0 - t6 Temporary registers Caller must save these registers before calling a function it they need them later. Functions may at any time overwrite these registers!
x8 - x9 and x18 - x27 s0 - s11 Saved registers Callee must save these registers. Thus, you can safely use these registers in your code and any function that you call must back them up and restore them if they decide to use them too.
x10 - x17 a0 - a7 Function arguments Function arguments to called functions. Used as input parameters.
x10 and x11 a0 and a1 Return values Since the input to a function is not useful anymore on return, a0 and a1 have a dual use as registers for the return values.
addi s0, zero, 5
addi t0, zero, 5
jal fact         # function call!
add s1, t0, t0   # we don't know what is in t0, because it's caller-save!
add s1, s0, s0   # s0 is guaranteed to still contain 5, because it's callee-save!

If you take a look at the RISC-V card linked at the top of the website, you can now understand the register table.

Calling conventions solve interface issues

:fire: Warm-up 2: Change your program to adhere to the register conventions above as if the sum label was a function.

Solution

.globl main
.data
    a: .word 1
    b: .word 2
    number:  .word 0

.text
main:
    # Load the two numbers a and b into registers
	la a0, a
	la a1, b

sum:
    # Add the two numbers and put them in a register for main to find.
	add a0, a0, a1

resume:
    la t6, number
    sw a0, 0(t6)

:fire: Warm-up 3: Read the official calling conventions of RISC-V. Since we will not be making use of floating point registers (ft0 to ft11) you can skip Section 18.3. In this course we will just try to fit all parameters into the argument registers as it is explained there before utilizing the stack.

The official calling conventions talk about passing some function parameters via the stack. Let us now take a look at what happens if we run out of registers and how can we solve this by using the stack.

Memory: the stack

It is easy to think of cases where the 32 registers we have are not enough. You may already have come to a situation where this is the case, but for any larger program, we definitely need to store data in memory. Similarly, if the calling conventions only define 8 registers to pass function arguments, what happens if we want to pass more data to a function than fits into these 8 registers? You already worked with variables stored in the .data section: in a similar way, we could declare a region in memory that we use to back up or restore data from. However, defining specific variables is still not enough for cases where we do not know how many variables we will need. In this case, what we need is a data structure where we dynamically add and remove data, depending on what we need.

Understanding the stack

A stack is a simple data structure that grows in one direction and that works according to the Last-in-First-Out (LIFO) principle. The idea is as simple as a tower of books where you are only ever able to pick up the top-most book. You can place more books on top of the tower, but have to pick them up again to reach the books at the bottom.

A stack can help you to overcome all challenges that we described above:

  • If we run out of registers, we can temporarily push the variables onto the stack. As long as we remember where we can find the data and how many variables we pushed after this variable, we can always find it again.
  • If we want to send complex data to a function or want to exceed the 8 registers that we can use to send function arguments, we can use such a stack and simply pass the pointer to the data on the stack.
  • One additional issue we did not discuss yet is the problem of program control flow. Where should the function jump at the end? What happens on recursion? Or if a function wants to call another function? The functions should always remember who called them and where to return to. This is handled by the return address register (ra / x1), which also needs to be backed up to the stack during nested function calls.

Manipulating the stack in RISC-V

At this point, it may not surprise you anymore to hear that RISC-V actually has a dedicated register called the stack pointer, and that the calling conventions heavily rely on the mechanisms mentioned above:

  • The stack pointer (sp / x2) is already set up for you to point to the CPU stack. You can use it in your programs freely.
  • In RISC-V (and other architectures) the stack grows downwards. See the excursion below for more details.
  • To push data (e.g. t0) to the stack: first use addi sp, sp, −4 (decrement the stack pointer to allocate space on the stack); then use sw t0, 0(sp) to write t0 to the stack. The stack pointer should always point to the last pushed value (not to the empty space below it).
  • To pop data from the stack (e.g. in t0): first load the data at the top of the stack using lw t0, 0(sp); then update the stack pointer using addi sp, sp, 4.

:fire: Warm-up 4: Write a program that uses the sp register and the provided stack to push two integers (4 and 5 for example) onto the stack and then pop these integers again.

Solution

.globl main

# We use the program stack which is already set up for us.
# Thus, we don't need the data variable we called stack anymore.
# We can just use the stack pointer.
.text
main:
    # load the stack pointer
    # sp already points to the stack! Nothing to do.

    # load the integers
    li t1, 4
    li t2, 5

    # store the integers
    # WARNING! We are using the stack pointer here!
    # This means that we need to make space on the stack
    #  BEFORE we store the integers.
    addi sp, sp, -8 # Stack grows downwards
    sw t1, 0(sp)
    sw t2, 4(sp)

    # load the integers again
    lw t1, 0(sp)
    addi sp, sp, 4 # Stack grows downwards and we move it after we loaded the integers
    lw t2, 0(sp)
    addi sp, sp, 4

Excursion: The stack grows downwards!?

:warning: The stack controlled by the stack pointer in RISC-V grows downwards! It is still a stack, but instead of adding a variable on top of the older one, you place the variable below the older one in memory. This means that when your last element on the stack is at address 0x100, the next item of size 4 bytes will be at address 0x100 - 0x004 = 0x0FC (and not at 0x104).

If you want to understand more about this, read through the excursion below.

Excursion: Why does the stack grow downwards?

There is no technical reason for this, this is just convention. You could equally well define that the stack grows upwards in the direction that is probably more intuitive to you. However, this decision may make more sense when looking at the general, complete memory layout of RISC-V.

Since RISC-V is based on a Von Neumann architecture, the memory is used to store both instructions and data. Take a look at the memory overview below. You will see that in the memory, we need to store (from bottom to top, low address to high):

  1. Reserved memory at the very first addresses (bottom in the graphic, from address 0x00).
  2. Text section, i.e., instructions that are kept in memory (your program).
  3. Data section where all static data is kept.
  4. Heap data for dynamic memory. You will learn more about this in the next session.
  5. Stack.

Both the stack and the heap can grow during the execution of programs. Since we may not always know how large both can get during execution, it is simpler to just have them share a big block of memory and let them grow towards each other. Then, we only need to make sure they never cross each other, but until that happens, both can grow and shrink however they want.

Memory Layout in RISC-V

:bulb: Why does the stack grow downwards? Because it became a convention at some point that the stack grows downwards and the heap grows upwards. There are some reasons that make it a smart choice, but nowadays it is just a convention in RISC-V.

Using the stack for function calls

Calling a function means multiple things at once:

  1. We jump to a new address (or label) and expect the code there to execute.
  2. We may want to provide some input to the called function.
  3. We may expect some return values from the called function.
  4. The function is expected to jump back (i.e., return), to the code that originally called it.
  5. Any registers that are marked as caller-save must be saved before the function call and restored afterwards (if we want to preserve their value).
  6. Any registers that are marked as callee-save must remain untouched by the called function.

We have already discussed which registers are callee- and caller-saved. However, we did not discuss how these registers should be saved on the stack or in what order we should save them. The first thing to do when calling a function is to save the caller-saved registers on the stack. Since this procedure logically belongs to the caller function, it does not matter in which order you save the registers as long as you make sure that you restore them in the same order after returning from the function.

The next step to take when calling a function is to push on the stack the arguments that do not fit into registers. Before jumping to the function, we fill the ra register with the return address: the address of the instruction that we want to execute when returning from the function. Most often, this will be the instruction after the jump to the function. This is why there is a special instruction in RISC-V, called the jal (jump and link) instruction, that first places the address of the next instruction into the ra register and then jumps to the given address.

Summary: Complete calling conventions

The complete list of things to do when calling a function is as follows:

  1. Push caller-saved registers that you need to reuse to the stack (e.g. if you only reuse t0 and t1 you need to save them but you don’t need to save t2-7).
  2. Place all function parameters that fit into registers in a0 to a7.
  3. Push remaining function parameters onto the stack.
  4. Call the function either via jal or via another instruction as long as you make sure to store the return address to ra.

In the called function, do the following:

  1. Back up the return address register ra on the stack (if we’re planning to overwrite it either directly or by calling another function with jal from within the function).
  2. Back up other callee-saved registers that will be used by this function (e.g. if we do not touch s0-s7, we can skip saving them).
  3. Perform function tasks.
  4. Place function return in a0 and a1.
  5. Restore callee-saved registers and ra.
  6. Return to parent function via ret (or simply jump to ra).

With this complete calling conventions list, you should now understand the call stack diagram, which you can also find on the RISC-V card:

Call stack diagram

Below, you can find a series of images that walk you through an example program and the usage of the stack (click or slide through the images).

Show stack example code
.data
    x: .word 1
    y: .word 2
    z: .space 4
.globl main
.text
foo:
    add a0, a0, a1
    ret

bar:
    addi sp, sp, -4
    sw   ra, 0(sp)
    jal  foo
    lw   ra, 0(sp)
    addi sp, sp, 4
    ret

main:
    lw  a0, x
    lw  a1, y
    jal bar
    sw  a0, z, t0

Exercise 1

Convert the following code to RISC-V assembly using RISC-V calling conventions. Assume that main does not return like common functions. Why is this assumption necessary right now?

int x = 10;
int y = 20;
int z;

int double_it(int a) {
    return a + a;
}

int sum(int a, int b) {
    return a + double_it(b);
}

int main(void) {
    z = sum(x, y);
}

Solution

.data
    x: .word 0x10
    y: .word 0x20
    z: .space 4

.text

.globl main
doubleIt:
    # No need to execute the entire prologue/epilogue
    # For such simple functions
    add a0, a0, a0
    ret
sum:
    addi sp, sp, -8 # Make space for 2 words on the stack
    sw ra, 4(sp)    # Save the return address
    sw a0, 0(sp)    # Save the first argument a0 because of doubleIt call

    mv a0, a1       # b (currently in a1) is our input. Move it to a0 for doubleIt.
    jal doubleIt

    lw t0, 0(sp)    # Restore the saved argument (a) to t0
    add a0, t0, a0  # Add a (t0) to the result of doubleIt (a0)

    lw ra, 4(sp)    # Restore the return address
    addi sp, sp, 8  # Restore stack pointer
    ret
main:
    lw a0, x
    lw a1, y
    jal sum
    sw a0, z, t0

Exercise 2

The calling conventions exist to make writing programs together with other developpers easier. Imagine if everyone had their own way of passing arguments or even using registers in their functions. You would have to change the way you write your program based on who’s function you’re calling.

The following program does NOT follow the calling convetions, can you patch it up? (The lines you have to change/add are marked with comments!)

.data
    x: .word 4
    y: .word 5
    z: .space 4

.text

.globl main
square:
    mul s0, s0, s0  #1
    ret

sum:
    add a0, a0, a1
    j back          # 2
    
square_distance:
    addi sp, sp, -12 
    mv t0, ra       # 3
    sw a0, 0(sp) 
    sw a1, 4(sp)    
 
    mv s0, a0       # 4 This whole block is a mess...
    jal square	    #
    mv a0, s0	    #
    mv s0, a1	    #
    jal square	    #
    mv a1, s0	    #
    j sum	        # 5
back:

    mv ra, t0       # 6 Is this the correct way to do it?
                    # 7 What misses here
    ret
    
    
main:
    lw a0, x
    lw a1, y
    jal square_distance
    sw a0, z, t0

Solution

.data
    x: .word 4
    y: .word 5
    z: .space 4

.text

.globl main
square:
    mul a0, a0, a0  #1 Arguments are in a0, a1, a2, a3, a4, a5, a6, a7
    ret

sum:
    add a0, a0, a1
    ret             # 2 Use ret instead of j back
    
square_distance:
    addi sp, sp, -16 
    sw a0, 0(sp) 
    sw a1, 4(sp)    
    sw ra, 8(sp)    # 3 Save ra 
    sw s0, 12(sp)   # Save s0 

    jal square	    #
    mv s0, a0	    #
    mv a0, a1	    #
    jal square	    #
    mv a1, s0	    #
    jal sum	        # 5 Use jal instead of j

    lw ra, 8(sp)    # 6 Restore ra
    lw s0, 12(sp)   # Restore s0
    addi sp, sp, 16 # 7 Restore sp
    ret
    
    
main:
    lw a0, x
    lw a1, y
    jal square_distance
    sw a0, z, t0

Additional exercises

Exercise 3

We have compiled the function func from the code example below to 32-bit RISC-V (RV32I) using gcc. The compiled function can be found below. Translate the main function manually to RISC-V. Follow the calling conventions to pass all arguments correctly.

For this exercise, you will have to rely on the official RISC-V calling conventions. They define the size of the C types and how to pass values that do not fit in a single register!

:bulb: The nice thing about calling conventions is that you don’t have to understand the assembly code of func to be able to write the assembly code of main!

long long result;
long long i = 6;

long long func(int a, long b, long long c, long long d, long long e, long long *f) {
    return a + b + c + d + e + *f;
}

int main(void) {
    result = func(1, 2, 3, 4, 5, &i);
}

.data
    result:	.space 8
    i:		.dword 6

.text
.globl main
func:
    addi	sp,sp,-48
    sw	s0,44(sp)
    addi	s0,sp,48
    sw	a0,-20(s0)
    sw	a1,-24(s0)
    sw	a2,-32(s0)
    sw	a3,-28(s0)
    sw	a4,-40(s0)
    sw	a5,-36(s0)
    sw	a6,-48(s0)
    sw	a7,-44(s0)
    lw	a4,-20(s0)
    lw	a5,-24(s0)
    add	a5,a4,a5
    mv	t1,a5
    srai	a5,a5,0x1f
    mv	t2,a5
    lw	a2,-32(s0)
    lw	a3,-28(s0)
    add	a4,t1,a2
    mv	a1,a4
    sltu	a1,a1,t1
    add	a5,t2,a3
    add	a3,a1,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a0,-40(s0)
    lw	a1,-36(s0)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a0,-48(s0)
    lw	a1,-44(s0)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a5,0(s0)
    lw	a0,0(a5)
    lw	a1,4(a5)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a0,a4
    mv	a1,a5
    lw	s0,44(sp)
    addi	sp,sp,48
    ret

main:
    # TODO

Solution

.data
    result: .space 8
    i:      .dword 6

.text
.globl main
func:
    addi	sp,sp,-48
    sw	s0,44(sp)
    addi	s0,sp,48
    sw	a0,-20(s0)
    sw	a1,-24(s0)
    sw	a2,-32(s0)
    sw	a3,-28(s0)
    sw	a4,-40(s0)
    sw	a5,-36(s0)
    sw	a6,-48(s0)
    sw	a7,-44(s0)
    lw	a4,-20(s0)
    lw	a5,-24(s0)
    add	a5,a4,a5
    mv	t1,a5
    srai	a5,a5,0x1f
    mv	t2,a5
    lw	a2,-32(s0)
    lw	a3,-28(s0)
    add	a4,t1,a2
    mv	a1,a4
    sltu	a1,a1,t1
    add	a5,t2,a3
    add	a3,a1,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a0,-40(s0)
    lw	a1,-36(s0)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a0,-48(s0)
    lw	a1,-44(s0)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a2,a4
    mv	a3,a5
    lw	a5,0(s0)
    lw	a0,0(a5)
    lw	a1,4(a5)
    add	a4,a2,a0
    mv	a6,a4
    sltu	a6,a6,a2
    add	a5,a3,a1
    add	a3,a6,a5
    mv	a5,a3
    mv	a0,a4
    mv	a1,a5
    lw	s0,44(sp)
    addi	sp,sp,48
    ret

main:
    li a0, 1
    li a1, 2

    # c = a3 (high order bits) | a2 (low order bits)
    li a2, 3
    li a3, 0

    # d = a5 | a4
    li a4, 4
    li a5, 0

    # e = a6 | a5
    li a6, 5
    li a7, 0

    # t0 = &i
    la t0, i

    # push t0 (&i)
    addi sp, sp, -4
    sw   t0, 0(sp)

    # result = func(....)
    jal func
    sw  a0, result, t0

    # pop
    addi sp, sp, 4

Exercise 4

Fix the function sum_fixme in the below program. Only add code at the designated TODO points, don’t modify the existing code. Use the stack to make sure caller-saved registers are saved by the caller and callee-saved registers are saved by the callee. Note that sum_fixme acts both as a caller and a callee at different times. Your solution is correct if the execution terminates with

  • no errors;
  • the value 3 in a0;
  • the value 0xdeadbeef in s0.
.globl main
.text
callersaveregisterwipe: # Don't modify this function!
    mv t0, ra
    li ra, 0
    li t1, 0
    li t2, 0
    li t3, 0
    li t4, 0
    li t5, 0
    li t6, 0
    li a0, 0
    li a1, 0
    li a2, 0
    li a3, 0
    li a4, 0
    li a5, 0
    li a6, 0
    li a7, 0
    jr t0

sum_fixme:

    # TODO

    jal callersaveregisterwipe # Don't modify this line

    # TODO

    add s0, a0, a1  # Don't modify this line
    mv  a0, s0      # Don't modify this line

    # TODO

    ret # Don't modify this line

main: # Don't modify this function!
    li a0, 1
    li a1, 2
    li s0, 0xdeadbeef
    jal sum_fixme
    # Correct execution should terminate 1) Without errors, 2) with the value 3 in a0 AND 3) with the value 0xdeadbeef in s0

Solution

.globl main
.text
callersaveregisterwipe: #Don't modify this function!
    mv t0, ra
    li ra, 0
    li t1, 0
    li t2, 0
    li t3, 0
    li t4, 0
    li t5, 0
    li t6, 0
    li a0, 0
    li a1, 0
    li a2, 0
    li a3, 0
    li a4, 0
    li a5, 0
    li a6, 0
    li a7, 0
    jr t0

sum_fixme:

    addi sp, sp, -12 #Reserve 3 words on the stack

    sw   ra, 0(sp)   #Store caller-save registers on stack
    sw   a0, 4(sp)
    sw   a1, 8(sp)

    jal callersaveregisterwipe  #Don't modify this line

    lw   ra, 0(sp)   #Restore caller-save registers from stack
    lw   a0, 4(sp)
    lw   a1, 8(sp)

    sw   s0, 0(sp)   #Store callee-save register on stack so it can be used

    add  s0, a0, a1  #Don't modify this line
    mv   a0, s0      #Don't modify this line

    lw   s0, 0(sp)   #Restore callee-save register from stack
    addi sp, sp, 12  #Restore (callee-save) stack pointer before returning

    ret              #Don't modify this line

main:                #Don't modify this function!
    li a0, 1
    li a1, 2
    li s0, 0xdeadbeef
    jal sum_fixme
    #Correct execution should terminate 1) Without errors, 2) with the value 3 in a0 AND 3) with the value 0xdeadbeef in s0