System calls (syscalls) are the interface the Linux kernel exposes to give programs access to its basic functions. They are described in section 2 of the man pages: the introduction is in man 2 syscall (indirect system call), and the list of functions is in man 2 syscalls. Update: see also System Calls in the lectures of the official Linux kernel documentation, including « Linux system calls implementation », « VDSO and virtual syscalls » and « Accessing user space from system calls ».
This article follows the previous one about RISC-V overall progress and the tools available to play with. I will try to keep it short: Linux syscall usage, and the RISC-V assembly case.
Table of Contents
* Description section of the man page
* Getting the list of functions and how to access them
* Passing parameters
* Function number and registers of return values
* Return values and error code
* Compiling and executing on virtual environment
* Update: Bronzebeard assembler and its baremetal environment for real hardware
Description section of the man page
* syscall() is a small library function that invokes the system call whose assembly language interface has the specified number with the specified arguments. Employing syscall() is useful, for example, when invoking a system call that has no wrapper function in the C library.
* syscall() saves CPU registers before making the system call, restores the registers upon return from the system call, and stores any error returned by the system call in errno(3).
* Symbolic constants for system call numbers can be found in the header file
Section 2 of the man pages covers functions for file access (open/close/read/write/flush), sockets, ioctl, uid, gid, pid, messages, ptrace, restarting the system, etc.
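As a minimal sketch of what the description above means in practice, the C fragment below goes through syscall() instead of the usual wrappers. The function names pid_via_syscall and write_via_syscall are hypothetical, chosen only for this illustration; the SYS_* constants come from sys/syscall.h as provided by glibc.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_* symbolic constants */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: ask the kernel for the process ID
 * without going through the getpid() wrapper. */
long pid_via_syscall(void)
{
    return syscall(SYS_getpid);
}

/* Hypothetical helper: write len bytes to fd through syscall().
 * Returns the number of bytes written, or -1 with errno set. */
long write_via_syscall(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling write_via_syscall(1, "OK\n", 3) should behave like write(1, "OK\n", 3); the point is only that the call is selected by number rather than by a dedicated wrapper.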
Getting the list of functions and how to access them
As far as I know, only some of the syscall functions are easily accessible from assembly. They are declared in /usr/include/unistd.h, and the function numbers assigned by the ABI are defined in /usr/include/asm-generic/unistd.h.
The most practical approach I found is to look in /usr/include/asm-generic/unistd.h to see which functions are available, then read their respective man pages for the function prototypes. For example:
* asm-generic: #define __NR_read 63
* man 2 read: ssize_t read(int fd, void *buf, size_t count);
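The same name-to-number mapping can be checked from C. The sketch below is an illustration only: the table and the function syscall_number are hypothetical names, and the values come from whatever sys/syscall.h provides for the target. When compiled for RISC-V the three entries should match the asm-generic values (63, 64, 93); on other architectures they can differ (x86-64 uses 0, 1 and 60).

```c
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>   /* SYS_read, SYS_write, SYS_exit */

/* Hypothetical lookup table: syscall name -> number for the build target. */
struct syscall_entry {
    const char *name;
    long number;
};

static const struct syscall_entry syscall_table[] = {
    { "read",  SYS_read  },
    { "write", SYS_write },
    { "exit",  SYS_exit  },
};

/* Returns the syscall number for name, or -1 if it is not in the table. */
long syscall_number(const char *name)
{
    size_t count = sizeof syscall_table / sizeof syscall_table[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(syscall_table[i].name, name) == 0)
            return syscall_table[i].number;
    }
    return -1;
}
```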
The RISC-V ABI, as defined in the Architecture calling conventions section of man 2 syscall, assigns registers according to the following two tables.
Passing parameters
The second table in this section of the man page shows the registers used to pass the system call arguments.
Arch/ABI    arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
──────────────────────────────────────────────────────────────
riscv       a0    a1    a2    a3    a4    a5    -
The arguments go into these registers in the order of the function definition. For example, for the read (63) function:
ssize_t read ( int fd, void *buf, size_t count );
a0(result) = a7(63)( a0(fd), a1(*buf), a2(count) )
As a reminder, the 3 standard I/O file descriptors are STDIN=0, STDOUT=1 and STDERR=2; other descriptors are allocated when a file is opened with open and released by close.
So we set the arguments like this (x0 is the register that always reads as 0):
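.globl _start # export the entry point expected by the linker
_start:
addi a0, x0, 0 # set STDIN as the source
la a1, buffer_addr # load the address of the receive buffer (la is a pseudo-instruction)
addi a2, x0, 3 # read 3 bytes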
Function number and registers of return values
The first table gives the instruction to use, the register that holds the function number, and the registers that receive the return values. For riscv the Error column is empty (-): there is no dedicated error register.
Arch/ABI    Instruction    System   Ret   Ret   Error   Notes
                           call #   val   val2
───────────────────────────────────────────────────────────────────
riscv       ecall          a7       a0    a1    -
So for the read function, we need to put 63 (as found in /usr/include/asm-generic/unistd.h) in register a7. After the ecall, a0 receives the system call result, and a1 is listed as a second return value register. a1 does not hold errno: on failure the kernel returns the negated errno value (from -4095 to -1) in a0.
ssize_t read(int fd, void *buf, size_t count);
a0(result) = a7(63)( a0, a1, a2 )
So we can set them like this:
addi a7, x0, 63 # set called function as read()
ecall # call the function
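The register assignment can also be written from C with GCC-style inline assembly. This is a sketch under stated assumptions, not a definitive implementation: raw_syscall3 is a hypothetical name, the RISC-V branch relies on the GCC/Clang register-variable extension, and on other architectures it falls back to syscall() and converts the result so that both branches return the kernel's raw value (a negated errno on failure).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: three-argument system call returning the raw
 * kernel result; a value in -4095..-1 is a negated errno. */
long raw_syscall3(long number, long arg1, long arg2, long arg3)
{
#if defined(__riscv)
    register long a7 __asm__("a7") = number; /* function number  */
    register long a0 __asm__("a0") = arg1;   /* arg1, then result */
    register long a1 __asm__("a1") = arg2;   /* arg2 */
    register long a2 __asm__("a2") = arg3;   /* arg3 */

    __asm__ volatile ("ecall"
                      : "+r"(a0)
                      : "r"(a7), "r"(a1), "r"(a2)
                      : "memory");
    return a0;
#else
    /* Fallback for other architectures: syscall() reports errors as
     * -1 plus errno, so rebuild the raw convention from that. */
    long result = syscall(number, arg1, arg2, arg3);
    return (result == -1) ? -(long)errno : result;
#endif
}
```

For example, raw_syscall3(SYS_read, 0, (long)buf, 3) corresponds to the read call prepared in the assembly lines above.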
Return values and error code
According to the man page of read(2):
* On success, the number of bytes read is returned
* On error, -1 is returned, and errno is set to indicate the error.
That describes the C library wrapper. With a raw ecall there is no errno variable: the kernel returns the negated error code in a0. So we first test a0, and if a0 < 0 we jump to the part that displays the error message; otherwise we simply display an OK message. For the branching part, the RISC-V base instruction set only has < (blt) and >= (bge) comparisons, plus equality tests and unsigned variants. The assembler provides > (bgt) and <= (ble) as pseudo-instructions that just swap the two registers, which saves a lot of transistors.
blt a0, x0, error_seq # if a0 < 0 (negated errno) branch to error_seq
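To connect the raw convention with the one documented in the man page, here is a small sketch of the conversion a C library performs after the ecall. The names MAX_ERRNO_VALUE, raw_result_is_error and raw_result_to_libc are hypothetical; the 4095 bound is the range Linux reserves for error codes.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption stated in the text: Linux reserves -4095..-1 for errors. */
#define MAX_ERRNO_VALUE 4095

/* True when a raw syscall result (the value left in a0) is an error. */
bool raw_result_is_error(long result)
{
    return result < 0 && result >= -MAX_ERRNO_VALUE;
}

/* Converts a raw result to the wrapper convention:
 * -1 with errno set on error, the value itself otherwise. */
long raw_result_to_libc(long result)
{
    if (raw_result_is_error(result)) {
        errno = (int)-result;
        return -1;
    }
    return result;
}
```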
To display the message we use the write syscall (64), defined as:
* asm-generic: #define __NR_write 64
* man 2 write: ssize_t write(int fd, const void *buf, size_t count);
* So, registers: a0=1 (STDOUT), a1=*buf, a2=count, a7=64 (function number)
And we will finish with the exit() syscall, defined as:
* asm-generic: #define __NR_exit 93
* man 2 exit: noreturn void _exit(int status);
* So, registers: a0=return code, a7=93 (function number)
addi a0, x0, 1 # set a0 to STDOUT (a0 still holds the byte count from read)
la a1, ok # load address (pseudo-instruction) of ok string
addi a2, x0, 3 # set length of text to 3 (O + K + \n)
addi a7, x0, 64 # set ecall to write function
ecall # call the function
addi a0, x0, 0 # set return code to 0 (OK) for exit (93) function
j end # unconditional jump to end before quit
error_seq:
sub a2, x0, a0 # a2 = -a0, the positive error code
addi a2, a2, 0x30 # add 0x30 ('0' ASCII code); gives a digit only for codes 0 to 9
la a1, error # load address (pseudo-instruction) of error string
sb a2, 7(a1) # store the byte at offset 7 of the error string (the '?' before \n)
addi a0, x0, 1 # set a0 to STDOUT
addi a2, x0, 9 # set the length of our string (9 bytes)
addi a7, x0, 64 # set ecall to write function
ecall # call the function
addi a0, x0, -1 # set return code to -1 (error) for exit (93) function; the shell sees 255
end:
addi a7, x0, 93 # set ecall to exit (93) function
ecall # call Linux to terminate the program
.data # directive (no colon): switch to the writable data section
buffer_addr: .space 3 # 3-byte buffer filled by read
ok: .ascii "OK\n"
error: .ascii "Error: ?\n"
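Storing a single ASCII digit only works for error codes below 10, while many errno values are larger (EACCES is 13, for example). As an illustration of the general case, this sketch builds the whole message with a decimal conversion; format_error_message is a hypothetical name and the logic is plain C rather than a translation of the assembly above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: builds "Error: <code>\n" in out (not
 * NUL-terminated, like the .ascii strings) and returns its length,
 * or 0 if out is too small. */
size_t format_error_message(char *out, size_t out_size, unsigned int code)
{
    static const char prefix[] = "Error: ";
    const size_t prefix_len = sizeof prefix - 1;
    char digits[10];              /* enough for a 32-bit unsigned value */
    size_t ndigits = 0;

    do {                          /* digits are produced in reverse order */
        digits[ndigits++] = (char)('0' + code % 10);
        code /= 10;
    } while (code != 0);

    if (out_size < prefix_len + ndigits + 1)
        return 0;

    memcpy(out, prefix, prefix_len);
    for (size_t i = 0; i < ndigits; i++)
        out[prefix_len + i] = digits[ndigits - 1 - i];
    out[prefix_len + ndigits] = '\n';
    return prefix_len + ndigits + 1;
}
```

The returned length is what would go in a2 for the write call, with the buffer address in a1.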
Compiling and executing on virtual environment
If you don't have RISC-V hardware (boards can be found for as little as 3€ now), you need a cross compiler and qemu to emulate the instructions, or a whole emulated system installed.
Packages needed for compiling on ArchLinux x86 or ARM, for example:
sudo pacman -S riscv64-linux-gnu-gcc riscv64-linux-gnu-glibc riscv64-elf-binutils riscv64-elf-gcc riscv64-elf-gdb
Note: If you get the following error, remove the colon after .data (with the colon, the line defines a label named .data instead of switching to the data section):
test.s: Assembler messages:
test.s:22: Error: symbol `.data' is already defined
Newlib is a lightweight C library for bare metal. Its RV32 (32-bit RISC-V) build, riscv32-elf-newlib, can be used instead of a whole GNU system on embedded devices with little memory (such as the Longan Nano, less than 8€ with a screen, see picture below, or the 3€ Sipeed RV).
I made a simple shell script so that I don't have to remember the commands to assemble the code from an x86 platform (it also works on ARM or RISC-V). It takes the name of the .s file, without its extension, as argument:
name="$1"
riscv64-linux-gnu-as -march=rv64imac -o "${name}.o" "${name}.s"
riscv64-linux-gnu-ld -o "${name}" "${name}.o"
You can also strip the binary, but it is better to avoid that if you need to debug it:
riscv64-elf-strip --strip-all ${name}
The result can be executed on non-RISC-V platforms with qemu-riscv64, as long as it doesn't depend on libraries or you have them installed. This lets you test it without a full RISC-V system installed; qemu is fantastic for that. On ArchLinux it is available in the package qemu-arch-extra:
qemu-riscv64 ${name}
It can also be disassembled (the output will probably use different instructions than your assembly code, because of RISC-V assembly pseudo-instructions):
riscv64-linux-gnu-objdump -d ${name}
Bronzebeard assembler and its baremetal environment for real hardware
Update: Bronzebeard is an assembler with a light bare-metal environment builder for RISC-V. It targets the GD32VF103, as found on the Longan Nano (about 8€ with a screen, as shown in this article's pictures) and the Wio (a similar board with an added ESP8266 SoC). I made an AUR package of Bronzebeard, and someone made a Mandelbrot set demo in 918 bytes of pure RISC-V assembly. You can find other examples in the Bronzebeard source. gd32vf103inator is a set of tools for the GD32V that lets you manage a project from any simple text editor.