Debugging and Tracing in Linux: From Kernel to User Space

Debugging and tracing are critical skills for developers working on low-level systems, device drivers, or performance-sensitive applications. In Linux, tools and APIs exist for both kernel/driver development and user-space programming, each tailored to their respective environments. This article introduces essential debugging techniques, with concrete examples for tools like printk(), dev_dbg(), printf(), gdb, ftrace, and more.

In day-to-day work, printk() in the kernel and printf() in user space cover most needs; the other tools are included to give an overview.


1. Kernel and Driver Development

1.1 Logging with printk()

The printk() function is the kernel’s equivalent of printf(). It supports log levels (e.g., KERN_INFO, KERN_ERR) to categorize messages, which can be viewed using dmesg.

Example: Logging in a Kernel Module

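#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

/* Called when the module is loaded (insmod/modprobe). */
static int __init my_module_init(void) {
    printk(KERN_INFO "my_module: Initialized\n");
    return 0;
}

/* Called when the module is removed (rmmod). */
static void __exit my_module_exit(void) {
    printk(KERN_INFO "my_module: Exited\n");
}

module_init(my_module_init);
module_exit(my_module_exit);

MODULE_LICENSE("GPL"); /* Without a license declaration the build fails or the kernel is tainted */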
        

View logs with:

dmesg | grep "my_module"  
        

1.2 Device-Specific Logging: dev_dbg() and dev_err()

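The dev_*() family of functions (e.g., dev_dbg(), dev_err()) prefixes each message with device context (the driver name and the device name, such as a PCI address), which makes it ideal for driver code. Unlike a plain printk(), dev_dbg() messages are compiled out unless DEBUG is defined or, on kernels built with CONFIG_DYNAMIC_DEBUG, can be switched on and off at runtime, so they add little overhead when debugging is off.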

Example: Using dev_dbg() in a Driver

static int my_probe(struct device *dev)
{
    int error;

    dev_dbg(dev, "Probing device\n"); /* debug message, disabled by default */
    error = my_hw_init(dev);          /* hypothetical setup helper */
    if (error)
        dev_err(dev, "Probe failed: %d\n", error); /* always printed */
    return error;
}
        

Enabling Dynamic Debugging

To activate dev_dbg() messages for a specific driver (e.g., my_driver.c):

echo 'file my_driver.c +p' > /sys/kernel/debug/dynamic_debug/control  
        

How It Works

  • Prerequisite: The kernel must be compiled with CONFIG_DYNAMIC_DEBUG=y (enabled in most distributions).
  • /sys/kernel/debug/dynamic_debug/control: A virtual file that controls dynamic debug behavior. Writing commands here modifies debug output at runtime.
  • Command Breakdown:
      • file my_driver.c: Target debug messages in the my_driver.c source file.
      • +p: Enable printing of debug messages. Use -p to disable.

Behind the Scenes

The kernel uses macros like dev_dbg() or pr_debug() to mark debug statements. These are compiled into the kernel but remain inactive until explicitly enabled. For example:

// Kernel source snippet using dev_dbg()  
dev_dbg(dev, "Initializing DMA buffer at %p\n", buffer);  
        

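When dynamic debugging is enabled for my_driver.c, this message is printed with device context. The output below is illustrative: since Linux 4.15 a plain %p prints a hashed value rather than the raw address shown here.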

[ 12.345] my_driver 0000:01:00.0: Initializing DMA buffer at 0xffff8a0001a2f000  
        

Advanced Usage

  • Enable debugging for all functions in a module:

echo 'module my_driver +p' > /sys/kernel/debug/dynamic_debug/control

  • Match specific line numbers or functions:

echo 'file my_driver.c line 42-58 +p' > ...   # Enable lines 42-58
echo 'func my_driver_function +p' > ...       # Enable a specific function

Why This Matters

Dynamic debugging avoids recompiling the kernel or module for minor debugging tasks. It’s invaluable for diagnosing issues in production systems where rebooting is costly. Combined with dmesg -wH (to monitor logs in real-time), developers can iteratively refine debug output without disrupting system operation.


2. User-Space Debugging

2.1 printf() Debugging

The classic printf() (or fprintf(stderr, ...)) is useful for quick checks.

Example: Debugging a Memory Leak

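#include <stdio.h>
#include <stdlib.h>

void process_data(void) {
    void *ptr = malloc(1024);
    printf("Allocated memory at %p\n", ptr); /* Track allocations */
    /* ... use ptr ... */
    printf("Freeing memory at %p\n", ptr);   /* An allocation with no matching line here has leaked */
    free(ptr);
}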
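
To make this kind of allocation tracking more systematic, one option is to wrap malloc() and free() in small logging helpers that also keep a count of live allocations. The following is a minimal sketch under that assumption; DBG, dbg_malloc(), dbg_free() and live_allocations are hypothetical names chosen for illustration, not part of any standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter: number of allocations not yet freed. */
static long live_allocations = 0;

/* Log to stderr (unbuffered by default) and prefix the call site. */
#define DBG(fmt, ...) \
    fprintf(stderr, "%s:%d: " fmt "\n", __FILE__, __LINE__, __VA_ARGS__)

/* malloc() wrapper that logs the result and bumps the live count. */
void *dbg_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_allocations++;
    DBG("malloc(%zu) = %p, live = %ld", size, ptr, live_allocations);
    return ptr;
}

/* free() wrapper that logs the pointer before releasing it. */
void dbg_free(void *ptr) {
    if (ptr != NULL)
        live_allocations--;
    DBG("free(%p), live = %ld", ptr, live_allocations);
    free(ptr);
}
```

If live_allocations is still above zero when the program is about to exit, at least one dbg_malloc() call was never paired with dbg_free(), and the stderr log shows which addresses are outstanding.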
        

2.2 GNU Debugger (gdb)

gdb inspects running processes, sets breakpoints, and analyzes crashes.

Example: Debugging a Segmentation Fault

Compile with -g, then run:

gcc -g -o my_program my_program.c  
gdb ./my_program  
        

In gdb:

(gdb) break main      # Set breakpoint at main()
(gdb) run             # Start execution
(gdb) next            # Step to next line
(gdb) print ptr       # Inspect variable
(gdb) backtrace       # After a crash, show the call stack
        

2.3 System Call Tracing with strace

strace traces system calls made by a process.

Example: Tracing File Operations

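strace -e trace=openat,getdents64,close ls /tmp

Output (abridged to the calls that touch /tmp; ls reads directory entries with getdents64() rather than read(), and the address and byte counts are illustrative):

openat(AT_FDCWD, "/tmp", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
getdents64(3, 0x55d0c1e4f2a0 /* 4 entries */, 32768) = 112
close(3)                                = 0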
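
To see how source code maps onto a trace like this, it can help to run strace against a program you wrote yourself. The sketch below is one such target, assuming a Linux system with glibc; read_first_bytes() is a hypothetical helper, and each of its three calls should appear as one line in the strace output.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for openat() and O_CLOEXEC */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from the start of path.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_first_bytes(const char *path, char *buf, size_t len) {
    /* Shows up in strace as openat(AT_FDCWD, "<path>", ...) */
    int fd = openat(AT_FDCWD, path, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("openat");
        return -1;
    }

    /* Shows up as read(<fd>, ..., <len>) = <bytes read> */
    ssize_t n = read(fd, buf, len);
    if (n < 0)
        perror("read");

    /* Shows up as close(<fd>) = 0 */
    close(fd);
    return n;
}
```

Calling this helper from a small main() and running the binary under strace -e trace=openat,read,close lets you match each line of the trace to a line of source. The trace will also contain the dynamic loader's own openat() and read() calls before main() starts.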
        

3. Advanced Tracing Tools

3.1 ftrace

ftrace is the kernel's built-in tracer for analyzing latency and function calls.

Example: Tracing Function Execution

cd /sys/kernel/tracing  
echo function > current_tracer  
echo devm_kmalloc > set_ftrace_filter  
echo 1 > tracing_on  
# Run workload...  
echo 0 > tracing_on  
cat trace  
        

Output:

# tracer: function  
#           TASK-PID   CPU#  TIMESTAMP  FUNCTION  
          my_program-1234  [001] 456.789: devm_kmalloc <-device_probe  
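
When the workload is your own program, it can be useful to annotate the trace from user space. ftrace exposes a trace_marker file for this: text written to it is inserted into the trace buffer alongside the kernel events. The sketch below assumes tracefs is mounted at /sys/kernel/tracing, as in the commands above, and that the process has permission to write there (typically root); trace_mark() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumed tracefs mount point, matching the shell example. */
#define TRACE_MARKER_PATH "/sys/kernel/tracing/trace_marker"

/*
 * Hypothetical helper: write msg into the ftrace buffer.
 * Returns 0 on success, -1 if the marker file cannot be opened
 * or written (for example, when not running as root).
 */
int trace_mark(const char *msg) {
    int fd = open(TRACE_MARKER_PATH, O_WRONLY);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, msg, strlen(msg));
    close(fd);
    return written < 0 ? -1 : 0;
}
```

Calling trace_mark("start of request\n") just before the code of interest, and again after it, brackets the relevant kernel function calls in the trace file, which makes long traces easier to navigate.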
        

3.2 perf

perf profiles CPU performance, including hardware counters.

Example: Profiling CPU Usage

Count CPU events (e.g., cache misses):

perf stat -e cache-misses,instructions ./my_program  
        

Generate a flame graph (this assumes Brendan Gregg's FlameGraph scripts have been cloned into ./FlameGraph):

perf record -g ./my_program    # Record call stack  
perf script > out.stack  
./FlameGraph/stackcollapse-perf.pl out.stack | ./FlameGraph/flamegraph.pl > graph.svg  
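
Counters such as cache-misses are easiest to interpret with a workload whose memory behaviour you already understand. The sketch below is one possible test workload, not taken from the original article: it sums the same matrix in row-major and in column-major order. The names GRID_N, grid, fill_grid(), sum_row_major() and sum_col_major() are hypothetical.

```c
#include <stddef.h>

#define GRID_N 1024 /* 1024 x 1024 ints, about 4 MiB */

static int grid[GRID_N][GRID_N];

/* Fill the matrix with a simple deterministic pattern. */
void fill_grid(void) {
    for (size_t i = 0; i < GRID_N; i++)
        for (size_t j = 0; j < GRID_N; j++)
            grid[i][j] = (int)((i + j) % 7);
}

/* Walks memory sequentially: consecutive accesses share cache lines. */
long sum_row_major(void) {
    long total = 0;
    for (size_t i = 0; i < GRID_N; i++)
        for (size_t j = 0; j < GRID_N; j++)
            total += grid[i][j];
    return total;
}

/* Jumps a full row between accesses: each access touches a new cache line. */
long sum_col_major(void) {
    long total = 0;
    for (size_t j = 0; j < GRID_N; j++)
        for (size_t i = 0; i < GRID_N; i++)
            total += grid[i][j];
    return total;
}
```

Both functions return the same sum. Building two small programs, one calling each variant in a loop, and comparing them with perf stat -e cache-misses,instructions would typically show a similar instruction count but more cache misses for the column-major walk. The exact numbers depend on the CPU, its cache sizes, and the compiler's optimization level.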
        

4. Choosing the Right Tool

5. Conclusion

Debugging in Linux spans multiple layers:

  • Kernel/Drivers: Use printk(), dev_*() functions, and ftrace. Leverage dynamic debugging (CONFIG_DYNAMIC_DEBUG) to enable verbose logs without recompiling.
  • User-Space: Start with printf() and gdb, then escalate to strace or perf.
  • Performance: Combine ftrace and perf for hardware-level insights.

By mastering these tools, developers can efficiently diagnose issues from driver misbehavior to user-space performance bottlenecks.
