[Linux-kernel][RISC-V] System Call Processing Routine - Code Walk-through

Introduction

Lately I have spent a lot of time analyzing the Linux kernel on RISC-V. In this post, I summarize how a system call is handled in the RISC-V Linux kernel. For this analysis, I used disassembled instructions and the crash utility.

Code Walk-through

First, let's trace the execution flow of a system call in the RISC-V Linux kernel. When a user program executes the ecall instruction, the CPU traps into handle_exception:

SP:FFFFFFFF80D94A74|handle_exception:      csrrw      x4,sscratch,x4
SP:FFFFFFFF80D94A78|                       bnez       x4,0xFFFFFFFF80D94A96   ; x4,_save_context        
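The two instructions above can be modeled in plain C. This is only a sketch: struct hart_model and trap_entry_from_user are hypothetical names, not kernel code. It assumes the convention described in the comments of entry.S, where sscratch holds the kernel thread pointer while the hart runs user code and 0 while it runs kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two registers involved; not a kernel type. */
struct hart_model {
    uint64_t tp;       /* x4, the thread pointer */
    uint64_t sscratch; /* the sscratch CSR */
};

/*
 * Models "csrrw x4,sscratch,x4" followed by "bnez x4,_save_context".
 * Returns true when the branch to _save_context is taken directly,
 * which under the stated assumption means the trap came from user mode.
 */
bool trap_entry_from_user(struct hart_model *h)
{
    uint64_t old_tp = h->tp;

    h->tp = h->sscratch;   /* x4 receives the old sscratch value */
    h->sscratch = old_tp;  /* sscratch receives the old x4 value */

    return h->tp != 0;     /* bnez x4: nonzero means user mode */
}
```

For a trap from user mode, the swap leaves the kernel thread pointer in x4 and parks the user value of x4 in sscratch until it can be saved.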

At the end of the _save_context code, the indirect jump on the last line (c.jr x5) branches to the do_trap_ecall_u function.

SP:FFFFFFFF80D94A96|_save_context:         sd         x2,0x18(x4)   ; x2,24(x4)
SP:FFFFFFFF80D94A9A|   ld         x2,0x10(x4)   ; x2,16(x4)
...                    
SP:FFFFFFFF80D94B1C|   bgez       x20,0xFFFFFFFF80D94B28
SP:FFFFFFFF80D94B20|   auipc      x6,0xFFFF7    ; x6,1048567
SP:FFFFFFFF80D94B24|   jalr       x0,0x228(x6)   ; x0,552(x6) ; do_irq
SP:FFFFFFFF80D94B28|   slli       x5,x20,0x3
SP:FFFFFFFF80D94B2C|   auipc      x6,0x46C      ; x6,1132
SP:FFFFFFFF80D94B30|   addi       x6,x6,-0x69C   ; x6 ; 0xFFFFFFFF81200B34,x6,-1692
SP:FFFFFFFF80D94B34|   auipc      x7,0x46C      ; x7,1132
SP:FFFFFFFF80D94B38|   addi       x7,x7,-0x624   ; x7 ; 0xFFFFFFFF81200B3C,x7,-1572
SP:FFFFFFFF80D94B3C|   c.add      x5,x6
SP:FFFFFFFF80D94B3E|   bgeu       x5,x7,0xFFFFFFFF80D94B48
SP:FFFFFFFF80D94B42|   ld         x5,0x0(x5)    ; x5,0(x5)
SP:FFFFFFFF80D94B46|   c.jr       x5        

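The scause register holds the exception code. In the v6.6 source it is read into x20 (s4) in the part of the listing elided above. If the value is negative (top bit set), the trap is an interrupt and control goes to do_irq. Otherwise the exception code is multiplied by 8 (slli by 3) and used as an offset into excp_vect_table. For an ecall from user mode the exception code is 8, so the jump lands in do_trap_ecall_u.

For reference, excp_vect_table lists the handler for each exception code in the RISC-V architecture: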

https://elixir.bootlin.com/linux/v6.6.20/source/arch/riscv/kernel/entry.S
SYM_CODE_START(excp_vect_table)
	RISCV_PTR do_trap_insn_misaligned
	ALT_INSN_FAULT(RISCV_PTR do_trap_insn_fault)
	RISCV_PTR do_trap_insn_illegal
	RISCV_PTR do_trap_break
	RISCV_PTR do_trap_load_misaligned
	RISCV_PTR do_trap_load_fault
	RISCV_PTR do_trap_store_misaligned
	RISCV_PTR do_trap_store_fault
	RISCV_PTR do_trap_ecall_u /* system call */  //<<--
	RISCV_PTR do_trap_ecall_s
	RISCV_PTR do_trap_unknown
	RISCV_PTR do_trap_ecall_m
	/* instruciton page fault */
	ALT_PAGE_FAULT(RISCV_PTR do_page_fault)
	RISCV_PTR do_page_fault   /* load page fault */
	RISCV_PTR do_trap_unknown
	RISCV_PTR do_page_fault   /* store page fault */
excp_vect_table_end:        
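The lookup that the assembly performs on this table can be sketched in C as follows. The function and constant names are hypothetical. The out-of-range case is assumed to go to do_trap_unknown, as in the v6.6 source; the disassembly above does not show the branch target. The names table uses the default entries of the ALT_INSN_FAULT and ALT_PAGE_FAULT macros.

```c
#include <stdint.h>

#define EXCP_VECT_ENTRIES 16  /* number of RISCV_PTR slots in excp_vect_table */

/* Hypothetical sentinel values for the two paths that bypass the table. */
enum { TARGET_DO_IRQ = -1, TARGET_DO_TRAP_UNKNOWN = -2 };

/* Handler names in table order, for printing the result of the lookup. */
const char *const excp_vect_names[EXCP_VECT_ENTRIES] = {
    "do_trap_insn_misaligned", "do_trap_insn_fault",
    "do_trap_insn_illegal",    "do_trap_break",
    "do_trap_load_misaligned", "do_trap_load_fault",
    "do_trap_store_misaligned","do_trap_store_fault",
    "do_trap_ecall_u",         "do_trap_ecall_s",
    "do_trap_unknown",         "do_trap_ecall_m",
    "do_page_fault",           "do_page_fault",
    "do_trap_unknown",         "do_page_fault",
};

/* Returns the table index selected by scause, or one of the sentinels. */
int excp_vect_index(uint64_t scause)
{
    if (scause >> 63)                 /* bgez x20 not taken: interrupt */
        return TARGET_DO_IRQ;
    if (scause >= EXCP_VECT_ENTRIES)  /* bgeu x5,x7: past excp_vect_table_end */
        return TARGET_DO_TRAP_UNKNOWN;
    return (int)scause;               /* the asm scales this by 8 (slli ...,0x3) */
}
```

With scause = 8, excp_vect_index returns 8 and excp_vect_names[8] is "do_trap_ecall_u", the entry marked in the listing above.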

Inside the do_trap_ecall_u() function, the call to syscall_handler() invokes the handler for the requested system call.

https://elixir.bootlin.com/linux/v6.6.20/source/arch/riscv/kernel/traps.c#L307
asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
{
	if (user_mode(regs)) {
		long syscall = regs->a7;

		regs->epc += 4;
		regs->orig_a0 = regs->a0;

		riscv_v_vstate_discard(regs);

		syscall = syscall_enter_from_user_mode(regs, syscall);

		if (syscall >= 0 && syscall < NR_syscalls)
			syscall_handler(regs, syscall);  //<<--
		else if (syscall != -1)
			regs->a0 = -ENOSYS;
...        
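The range check in do_trap_ecall_u() can be tried out in user space with a small model. Everything named toy_* or regs_model below is hypothetical, and the table holds two made-up handlers instead of the real __riscv_sys_* functions. Storing the handler's return value in a0 is an assumption of this sketch about what syscall_handler() does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's struct pt_regs. */
struct regs_model {
    long a0;  /* first argument on entry, return value on exit */
    long a7;  /* system call number */
};

typedef long (*syscall_fn)(const struct regs_model *);

/* Toy handlers; the real table holds __riscv_sys_* functions. */
long toy_sys_zero(const struct regs_model *regs) { (void)regs; return 0; }
long toy_sys_echo(const struct regs_model *regs) { return regs->a0; }

const syscall_fn toy_call_table[] = { toy_sys_zero, toy_sys_echo };
#define TOY_NR_SYSCALLS \
    ((long)(sizeof(toy_call_table) / sizeof(toy_call_table[0])))

/* Mirrors the if/else in the do_trap_ecall_u() listing. */
void toy_dispatch(struct regs_model *regs)
{
    long syscall = regs->a7;

    if (syscall >= 0 && syscall < TOY_NR_SYSCALLS)
        regs->a0 = toy_call_table[syscall](regs);  /* valid number: call handler */
    else if (syscall != -1)
        regs->a0 = -ENOSYS;                        /* unknown number */
    /* syscall == -1: a0 is left untouched */
}
```

An out-of-range number such as 99 leaves -ENOSYS in a0, while -1 leaves a0 as it was, matching the two else branches of the kernel code.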

Let's analyze this with assembly instructions:

SP:FFFFFFFF80D8BBC8|1101      do_trap_ecall_u:   c.addi     x2,-0x20      ; x2,-32
SP:FFFFFFFF80D8BBCA|E822                         c.sdsp     x8,0x10(x2)   ; x8,16(x2)
...
SP:FFFFFFFF80D8BC7E|00475797    auipc      x15,0x475     ; x15,1141
SP:FFFFFFFF80D8BC82|C7A78793    addi       x15,x15,-0x386   ; 
SP:FFFFFFFF80D8BC86|97AA        c.add      x15,x10
SP:FFFFFFFF80D8BC88|639C        c.ld       x15,0x0(x15)   ; x15,0(x15)
SP:FFFFFFFF80D8BC8A|8526        c.mv       x10,x9
SP:FFFFFFFF80D8BC8C|9782        c.jalr     x15
...        

The following two instructions load the address 0xFFFFFFFF812008F8 into x15: auipc adds 0x475000 to its own address (0xFFFFFFFF80D8BC7E), and addi at 0xFFFFFFFF80D8BC82 then subtracts 0x386.

SP:FFFFFFFF80D8BC7E|00475797 auipc      x15,0x475     ; x15,1141
SP:FFFFFFFF80D8BC82|C7A78793 addi  x15,x15,-0x386   ; // x15= 0xFFFFFFFF812008F8        
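The address can be re-derived from the raw encodings in the listing (00475797 and C7A78793). The helper names below are hypothetical; the immediate layouts are the standard U-type and I-type formats of the RISC-V base ISA.

```c
#include <stdint.h>

/* U-type immediate of auipc: bits 31:12, already in position, sign-extended. */
int64_t auipc_offset(uint32_t insn)
{
    return (int64_t)(int32_t)(insn & 0xFFFFF000u);
}

/* I-type immediate of addi: bits 31:20, sign-extended from 12 bits. */
int64_t addi_imm(uint32_t insn)
{
    int64_t imm = (insn >> 20) & 0xFFF;

    return (imm & 0x800) ? imm - 0x1000 : imm;
}

/* Value produced by an auipc located at pc followed by an addi on the same register. */
uint64_t auipc_addi_target(uint64_t pc, uint32_t auipc_insn, uint32_t addi_insn)
{
    return pc + (uint64_t)auipc_offset(auipc_insn)
              + (uint64_t)addi_imm(addi_insn);
}
```

Calling auipc_addi_target(0xFFFFFFFF80D8BC7E, 0x00475797, 0xC7A78793) gives 0xFFFFFFFF81200C7E after the auipc step and 0xFFFFFFFF812008F8 after the addi step.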

The address 0xFFFFFFFF812008F8 is the start of sys_call_table, as the crash utility confirms:


crash_rv64> rd -S sys_call_table 0x100
ffffffff812008f8:  __riscv_sys_io_setup __riscv_sys_io_destroy
ffffffff81200908:  __riscv_sys_io_submit __riscv_sys_io_cancel
ffffffff81200918:  __riscv_sys_io_getevents __riscv_sys_setxattr
ffffffff81200928:  __riscv_sys_lsetxattr __riscv_sys_fsetxattr
ffffffff81200938:  __riscv_sys_getxattr __riscv_sys_lgetxattr
ffffffff81200948:  __riscv_sys_fgetxattr __riscv_sys_listxattr
ffffffff81200958:  __riscv_sys_llistxattr __riscv_sys_flistxattr
ffffffff81200968:  __riscv_sys_removexattr __riscv_sys_lremovexattr
ffffffff81200978:  __riscv_sys_fremovexattr __riscv_sys_getcwd
ffffffff81200988:  __riscv_sys_lookup_dcookie __riscv_sys_eventfd2
ffffffff81200998:  __riscv_sys_epoll_create1 __riscv_sys_epoll_ctl
ffffffff812009a8:  __riscv_sys_epoll_pwait __riscv_sys_dup
ffffffff812009b8:  __riscv_sys_dup3 __riscv_sys_fcntl
ffffffff812009c8:  __riscv_sys_inotify_init1 __riscv_sys_inotify_add_watch        

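In the following snippet, c.add adds the offset in x10 to x15, which at that point holds the start address of sys_call_table. Because each table entry is an 8-byte pointer, x10 must hold the system call number already multiplied by 8 (the scaling happens in the elided instructions). c.ld then loads the handler address from that entry, c.mv copies x9 (presumably the pt_regs pointer) into the first argument register, and c.jalr calls the handler.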

SP:FFFFFFFF80D8BC86|97AA c.add      x15,x10 // before this add, x15 holds the start address of the sys_call_table
SP:FFFFFFFF80D8BC88|639C c.ld       x15,0x0(x15)   ; x15,0(x15)
SP:FFFFFFFF80D8BC8A|8526 c.mv       x10,x9
SP:FFFFFFFF80D8BC8C|9782 c.jalr     x15 //<<---  __riscv_sys_write        
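The address of a given table entry can be checked by hand. The sketch below uses the base address from the crash output and assumes 8-byte entries, which matches the two symbols per 16 bytes in that output. The system call numbers (0 for io_setup, 23 for dup, 64 for write) come from the generic Linux syscall numbering that RISC-V uses. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define SYS_CALL_TABLE_BASE 0xFFFFFFFF812008F8ULL /* from the crash output */
#define SYS_CALL_ENTRY_SIZE 8ULL                  /* one 64-bit pointer per entry */

/* Address of the sys_call_table slot that c.ld reads for system call nr. */
uint64_t sys_call_entry_addr(uint64_t nr)
{
    return SYS_CALL_TABLE_BASE + nr * SYS_CALL_ENTRY_SIZE;
}
```

For nr = 23 this gives 0xFFFFFFFF812009B0, the slot where the crash output shows __riscv_sys_dup. For write (nr = 64) it gives 0xFFFFFFFF81200AF8, which lies within the 0x100 entries dumped by the rd command but past the rows shown here.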

Summary:

When a write system call is triggered, the following functions are called:

handle_exception
 -> _save_context
   -> do_trap_ecall_u
     -> __riscv_sys_write        

10.17.2024

#RISC-V

#RISCV
