Debugging and Tracing in Linux: From Kernel to User Space

David Zhu

Linux driver developer

Published: March 18, 2025

Debugging and tracing are critical skills for developers working on low-level systems, device drivers, or performance-sensitive applications. In Linux, tools and APIs exist for both kernel/driver development and user-space programming, each tailored to their respective environments. This article introduces essential debugging techniques, with concrete examples for tools like printk(), dev_dbg(), printf(), gdb, ftrace, and more.

In practice, printk() in the kernel and printf() in user space cover most everyday needs; the other tools are included to give a broader overview.

1. Kernel and Driver Development

1.1 Logging with printk()

The printk() function is the kernel’s equivalent of printf(). It supports log levels (e.g., KERN_INFO, KERN_ERR) to categorize messages, which can be viewed using dmesg.

Example: Logging in a Kernel Module

#include <linux/init.h>
#include <linux/module.h>

static int __init my_module_init(void) {
    printk(KERN_INFO "my_module: Initialized\n");
    return 0;
}

static void __exit my_module_exit(void) {
    printk(KERN_INFO "my_module: Exited\n");
}

module_init(my_module_init);
module_exit(my_module_exit);

MODULE_LICENSE("GPL"); // License tag: without it the kernel is tainted and modpost complains

View logs with:

dmesg | grep "my_module"
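The log level that printk() attaches to a message is also visible from user space. As a minimal sketch, assuming records in the /dev/kmsg layout (priority, sequence number, timestamp, and flags, then a semicolon and the message), the helper below extracts the level from one record. kmsg_level() is a hypothetical name invented for this example, not a kernel or libc API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return the kernel log level (0 = KERN_EMERG ... 7 = KERN_DEBUG) encoded in
 * one /dev/kmsg record such as "6,1234,5678901,-;my_module: Initialized",
 * or -1 if the record does not start with a numeric priority.
 * kmsg_level() is a hypothetical helper written for this example.
 */
int kmsg_level(const char *record)
{
    unsigned int prio;

    assert(record != NULL);
    if (sscanf(record, "%u", &prio) != 1)
        return -1;
    return (int)(prio & 7); /* low 3 bits: level; upper bits: facility */
}
```

With the module above, a record beginning with 6 corresponds to KERN_INFO, while a dev_err() or KERN_ERR message would carry level 3.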

1.2 Device-Specific Logging: dev_dbg() and dev_err()

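The dev_*() family of functions (e.g., dev_dbg(), dev_err()) prefixes each message with device context (the driver name and the device name, e.g., a PCI address), making them ideal for driver code. Unlike a plain printk(), dev_dbg() is silent by default: with CONFIG_DYNAMIC_DEBUG it can be switched on at runtime, and otherwise it is compiled in only when DEBUG is defined, so it costs very little when debugging is off.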

Example: Using dev_dbg() in a Driver

static int my_probe(struct device *dev) {
    int error;

    dev_dbg(dev, "Probing device\n"); // Debug message (disabled by default)
    error = my_hw_init(dev); // Hypothetical hardware setup helper
    if (error)
        dev_err(dev, "Probe failed: %d\n", error); // Always logged
    return error;
}

Enabling Dynamic Debugging

To activate dev_dbg() messages for a specific driver (e.g., my_driver.c):

echo 'file my_driver.c +p' > /sys/kernel/debug/dynamic_debug/control

How It Works

Prerequisite: The kernel must be compiled with CONFIG_DYNAMIC_DEBUG=y (enabled in most distributions).
/sys/kernel/debug/dynamic_debug/control: A virtual file that controls dynamic debug behavior. Writing commands here modifies debug output at runtime.
Command Breakdown:
file my_driver.c: Target debug messages in the my_driver.c source file.
+p: Enable printing of debug messages. Use -p to disable.
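The same command can be issued from a test harness instead of a shell. The following is a sketch under stated assumptions: dyndbg_build_query() and dyndbg_apply() are hypothetical helper names, and the caller is expected to pass the control file path shown above and to run with sufficient privileges.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a dynamic-debug query such as "file my_driver.c +p" into buf.
 * Returns the query length, or -1 if buf is too small.
 */
int dyndbg_build_query(char *buf, size_t len, const char *src_file, int enable)
{
    int n;

    assert(buf != NULL && src_file != NULL);
    n = snprintf(buf, len, "file %s %cp", src_file, enable ? '+' : '-');
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write a query to the control file (needs root and a mounted debugfs).
 * Returns 0 on success, -1 on failure.
 */
int dyndbg_apply(const char *control_path, const char *query)
{
    FILE *f = fopen(control_path, "w");
    size_t qlen = strlen(query);
    int ok;

    if (!f)
        return -1;
    ok = fwrite(query, 1, qlen, f) == qlen;
    ok = (fclose(f) == 0) && ok; /* buffered write errors surface here */
    return ok ? 0 : -1;
}
```

Separating the string building from the write keeps the query testable without root access; only dyndbg_apply() touches the system.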

Behind the Scenes

The kernel uses macros like dev_dbg() or pr_debug() to mark debug statements. These are compiled into the kernel but remain inactive until explicitly enabled. For example:

// Kernel source snippet using dev_dbg()  
dev_dbg(dev, "Initializing DMA buffer at %p\n", buffer);

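When dynamic debugging is enabled for my_driver.c, this message is printed with device context. The line below is illustrative; by default the kernel hashes %p values, so the log shows an obfuscated identifier rather than the real address:

[ 12.345] my_driver 0000:01:00.0: Initializing DMA buffer at 00000000a2c41f3e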

Advanced Usage

Enable debugging for all functions in a module:

echo 'module my_driver +p' > /sys/kernel/debug/dynamic_debug/control

Match specific line numbers or functions:

echo 'file my_driver.c line 42-58 +p' > ... # Enable lines 42-58
echo 'func my_driver_function +p' > ... # Enable a specific function

Why This Matters

Dynamic debugging avoids recompiling the kernel or module for minor debugging tasks. It’s invaluable for diagnosing issues in production systems where rebooting is costly. Combined with dmesg -wH (to monitor logs in real-time), developers can iteratively refine debug output without disrupting system operation.

2. User-Space Debugging

2.1 printf() Debugging

The classic printf() (or fprintf(stderr, ...)) is useful for quick checks.

Example: Debugging a Memory Leak

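#include <stdio.h>
#include <stdlib.h>

void process_data(void) {
    void *ptr = malloc(1024);
    printf("Allocated memory at %p\n", ptr); // Track allocations
    printf("Freeing memory at %p\n", ptr);   // A missing "Freeing" line points to a leak
    free(ptr);
}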
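Print statements become more useful for leak hunting when they also keep a running count. As a rough sketch (dbg_malloc(), dbg_free(), and dbg_live_allocs() are hypothetical wrappers invented here, not standard functions), the code below logs every allocation and release to stderr and tracks how many blocks are still outstanding.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static long live_allocs; /* blocks allocated but not yet freed */

/* malloc() wrapper that logs the allocation and bumps the live counter. */
void *dbg_malloc(size_t size)
{
    void *ptr = malloc(size);

    if (ptr) {
        live_allocs++;
        fprintf(stderr, "alloc %zu bytes at %p (live: %ld)\n",
                size, ptr, live_allocs);
    }
    return ptr;
}

/* free() wrapper that logs the release and decrements the live counter. */
void dbg_free(void *ptr)
{
    if (!ptr)
        return;
    live_allocs--;
    assert(live_allocs >= 0); /* more frees than allocations is a bug */
    fprintf(stderr, "free %p (live: %ld)\n", ptr, live_allocs);
    free(ptr);
}

/* Number of blocks still outstanding; nonzero at exit suggests a leak. */
long dbg_live_allocs(void)
{
    return live_allocs;
}
```

Checking dbg_live_allocs() at the end of a test run gives a quick yes/no answer before reaching for heavier tooling. The counter is not thread-safe, which is acceptable for a single-threaded debugging aid but would need a lock or an atomic otherwise.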

2.2 GNU Debugger (gdb)

gdb inspects running processes, sets breakpoints, and analyzes crashes.

Example: Debugging a Segmentation Fault

Compile with -g, then run:

gcc -g -o my_program my_program.c  
gdb ./my_program

In gdb:

(gdb) break main      # Set breakpoint at main()  
(gdb) run            # Start execution  
(gdb) next           # Step to next line  
(gdb) print ptr      # Inspect variable

2.3 System Call Tracing with strace

strace traces system calls made by a process.

Example: Tracing File Operations

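strace -e trace=openat,getdents64,close ls /tmp

Output (abridged and illustrative; ls reads directory entries with getdents64 rather than read, and the library-loading calls are omitted):

openat(AT_FDCWD, "/tmp", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
getdents64(3, 0x55d1c8a4b2f0 /* 4 entries */, 32768) = 112
getdents64(3, 0x55d1c8a4b2f0 /* 0 entries */, 32768) = 0
close(3) = 0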
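To connect the trace to source code, it helps to see which library calls produce those system calls. The helper below is a small sketch (read_head() is a hypothetical name): a program calling it on a regular file would typically show an openat, a read, and a close in strace, assuming a glibc whose open() wrapper issues openat.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read up to len bytes from the start of path into buf.
 * Returns the number of bytes read, or -1 on error.
 * Each libc call below maps to one system call that strace can display.
 */
ssize_t read_head(const char *path, void *buf, size_t len)
{
    ssize_t n;
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_RDONLY);  /* usually traced as openat(AT_FDCWD, ...) */
    if (fd < 0)
        return -1;
    n = read(fd, buf, len);     /* traced as read(fd, ..., len) */
    close(fd);                  /* traced as close(fd) */
    return n;
}
```

A failed open() shows up in the trace with a return value of -1 and an errno name such as ENOENT, which is often the fastest way to spot a wrong path.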

3. Advanced Tracing Tools

3.1 ftrace

ftrace is a kernel-built-in tracer for analyzing latency and function calls.

Example: Tracing Function Execution

cd /sys/kernel/tracing  
echo function > current_tracer  
echo devm_kmalloc > set_ftrace_filter  
echo 1 > tracing_on  
# Run workload...  
echo 0 > tracing_on  
cat trace

Output:

# tracer: function  
#           TASK-PID   CPU#  TIMESTAMP  FUNCTION  
          my_program-1234  [001] 456.789: devm_kmalloc <-device_probe
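When the workload is a program under your control, it can annotate the trace itself by writing to the trace_marker file in the tracing directory. The sketch below assumes the caller passes that file's path (normally /sys/kernel/tracing/trace_marker) and has permission to write to it; trace_mark() is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write msg to an ftrace marker file, normally
 * /sys/kernel/tracing/trace_marker (needs root or suitable permissions).
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int trace_mark(const char *marker_path, const char *msg)
{
    ssize_t n;
    int fd;

    assert(marker_path != NULL && msg != NULL);
    fd = open(marker_path, O_WRONLY);
    if (fd < 0)
        return -1;
    n = write(fd, msg, strlen(msg));
    close(fd);
    return n < 0 ? -1 : 0;
}
```

Calling it just before and after the code of interest places markers in the trace, which makes it easier to find the relevant function entries among unrelated activity.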

3.2 perf

perf profiles CPU performance, including hardware counters.

Example: Profiling CPU Usage

Count CPU events (e.g., cache misses):

perf stat -e cache-misses,instructions ./my_program

Generate a flame graph:

perf record -g ./my_program    # Record call stack  
perf script > out.stack  
./FlameGraph/stackcollapse-perf.pl out.stack | ./FlameGraph/flamegraph.pl > graph.svg
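A counter such as cache-misses is easiest to interpret with a workload whose memory behavior is known in advance. The following sketch (all names are invented for illustration) sums the same grid in two traversal orders. Calling each variant from its own test program and comparing the perf stat output would typically show more cache misses for the column-major walk, although the size of the gap depends on the CPU, the cache sizes, and the compiler's optimization level.

```c
#include <assert.h>
#include <stddef.h>

#define GRID_N 512

static int grid[GRID_N][GRID_N];

/* Fill the grid with a checkerboard of 0s and 1s. */
void fill_grid(void)
{
    for (size_t i = 0; i < GRID_N; i++)
        for (size_t j = 0; j < GRID_N; j++)
            grid[i][j] = (int)((i + j) & 1);
}

/* Walk memory in layout order: consecutive addresses, cache friendly. */
long sum_row_major(void)
{
    long sum = 0;

    for (size_t i = 0; i < GRID_N; i++)
        for (size_t j = 0; j < GRID_N; j++)
            sum += grid[i][j];
    return sum;
}

/* Walk column by column: each access jumps GRID_N * sizeof(int) bytes. */
long sum_col_major(void)
{
    long sum = 0;

    for (size_t j = 0; j < GRID_N; j++)
        for (size_t i = 0; i < GRID_N; i++)
            sum += grid[i][j];
    return sum;
}

/* Both orders must agree; only their memory access pattern differs. */
long checked_sum(void)
{
    long a = sum_row_major();
    long b = sum_col_major();

    assert(a == b);
    return a;
}
```

Because both functions compute the same result, any difference reported by perf comes from the access pattern rather than from the amount of work.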

4. Choosing the Right Tool

5. Conclusion

Debugging in Linux spans multiple layers:

Kernel/Drivers: Use printk(), dev_*() functions, and ftrace. Leverage dynamic debugging (CONFIG_DYNAMIC_DEBUG) to enable verbose logs without recompiling.
User-Space: Start with printf() and gdb, then escalate to strace or perf.
Performance: Combine ftrace (kernel function and latency tracing) with perf (CPU profiling and hardware counters such as cache misses).

By mastering these tools, developers can efficiently diagnose issues from driver misbehavior to user-space performance bottlenecks.

Ajay Kumar Kothapally

4 days ago

Very informative

Santosh Kumar

4 days ago

Great post! Another powerful tool worth mentioning is SystemTap. It allows you to write scripts to monitor and trace the activities of a running Linux system, providing a deeper understanding of both kernel and user-space behavior. Additionally, BPF (Berkeley Packet Filter) has evolved into a robust framework for performance analysis and security monitoring. Leveraging these tools can provide a more comprehensive debugging strategy. Keep exploring and happy debugging!


