Week 5 — Kernel Debugging: GDB, printk, ftrace, and More

Goal

Set up kernel debugging through QEMU's GDB stub. Learn to use printk effectively, trace kernel functions with ftrace, and use dynamic debug. These tools will be essential when your networking changes behave unexpectedly.

Why This Matters

You can't just add a printf and rerun your program: here the kernel is the program. Debugging kernel code requires specific tools and techniques. The good news: the kernel has excellent built-in debugging infrastructure, especially for networking.


GDB with QEMU

Kernel Config Requirements

Make sure these are enabled in your .config:

cd ~/linux
scripts/config --disable CONFIG_DEBUG_INFO_NONE  # 5.18+: debug info is a choice; deselect "none"
scripts/config --enable CONFIG_DEBUG_INFO
scripts/config --enable CONFIG_DEBUG_INFO_DWARF5
scripts/config --enable CONFIG_GDB_SCRIPTS
scripts/config --disable CONFIG_RANDOMIZE_BASE  # Same as nokaslr boot param
make olddefconfig
make -j$(nproc)
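
make olddefconfig silently drops options whose dependencies are not met, so it can be worth confirming that the settings actually landed in .config before waiting for a full build. A minimal sketch, assuming the usual CONFIG_FOO=y line format; the helper name check_debug_config is made up for this example:

```shell
# check_debug_config FILE: report whether the GDB-related options are set.
# The function name is illustrative, not part of the kernel tree.
check_debug_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_DEBUG_INFO CONFIG_GDB_SCRIPTS; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "MISSING  $opt"
            missing=1
        fi
    done
    # KASLR left enabled is fine only if you boot with nokaslr.
    if grep -q '^CONFIG_RANDOMIZE_BASE=y$' "$config_file"; then
        echo "note     CONFIG_RANDOMIZE_BASE=y, boot with nokaslr"
    fi
    return "$missing"
}
```

Run it as check_debug_config ~/linux/.config; a non-zero exit status means at least one required option is missing.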

Launch QEMU with GDB Stub

Add -s -S to your QEMU command:

qemu-system-x86_64 \
  -kernel ~/linux/arch/x86/boot/bzImage \
  -drive file="$HOME/qemu-debian.qcow2",format=qcow2 \
  -append "root=/dev/sda1 rw console=ttyS0 nokaslr" \
  -nographic \
  -m 2G -smp 4 -enable-kvm \
  -s -S
  • -s — start a GDB server on TCP port 1234
  • -S — freeze the CPU at startup (wait for GDB to continue)

Connect GDB

In another terminal:

cd ~/linux
gdb vmlinux

Inside GDB:

(gdb) target remote :1234
(gdb) continue

The kernel will start booting. You can interrupt it anytime with Ctrl+C in GDB.

Useful GDB Commands for Kernel Debugging

# Set a breakpoint on a kernel function
(gdb) break tcp_v4_connect
(gdb) continue

# When it hits, inspect
(gdb) bt              # Backtrace — see the call stack
(gdb) info locals     # Local variables
(gdb) print *sk       # Dereference a socket pointer
(gdb) p/x skb->len   # Print in hex

# Step through code
(gdb) next            # Step over (stay in current function)
(gdb) step            # Step into (enter called functions)
(gdb) finish          # Run until current function returns

# Set conditional breakpoints
(gdb) break tcp_sendmsg if sk->sk_state == 1

# Print a struct layout
(gdb) ptype struct sk_buff
(gdb) ptype struct net_device
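
Typing the same commands at every boot gets tedious. One option is to collect them in a GDB command file and pass it with gdb -x. The sketch below only writes such a file; the helper name and the file path are arbitrary choices for illustration, and the commands are the ones shown above:

```shell
# write_gdb_cmds FILE: write a GDB command file for a QEMU debug session.
# The helper name is illustrative; the commands are taken from this section.
write_gdb_cmds() {
    cat > "$1" <<'EOF'
target remote :1234
break tcp_v4_connect
continue
bt
info locals
EOF
}

# Example (run from the kernel build directory, with QEMU already waiting):
#   write_gdb_cmds /tmp/tcp-connect.gdb
#   gdb -x /tmp/tcp-connect.gdb vmlinux
```

GDB runs the file top to bottom, so bt and info locals execute once the breakpoint is hit and continue returns.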

GDB Helper Scripts

The kernel ships GDB Python scripts that understand kernel data structures:

(gdb) source ~/linux/scripts/gdb/vmlinux-gdb.py
(gdb) lx-dmesg                    # Print kernel log
(gdb) lx-lsmod                    # List modules
(gdb) lx-ps                       # List processes
(gdb) lx-symbols                  # Load module symbols

printk — The Kernel's printf

printk() is the primary debugging tool for most kernel developers. It writes to the kernel ring buffer, visible via dmesg.

Log Levels

pr_emerg("...");    // 0 — system is unusable
pr_alert("...");    // 1 — action must be taken immediately
pr_crit("...");     // 2 — critical conditions
pr_err("...");      // 3 — error conditions
pr_warn("...");     // 4 — warning conditions
pr_notice("...");   // 5 — normal but significant
pr_info("...");     // 6 — informational
pr_debug("...");    // 7 — debug (only with DEBUG defined or dynamic debug)

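For development, use pr_info() or pr_debug(). The pr_* macros are preferred over raw printk(): they set the log level for you and pass the format string through pr_fmt(), so defining pr_fmt() at the top of a file (for example with KBUILD_MODNAME) prefixes every message with the module name.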
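
The numeric levels in the table are visible from userspace: each record read from /dev/kmsg starts with a priority number whose low three bits are the log level. A small sketch that decodes it, assuming the record layout prio,seq,usec,flags;message; the helper name kmsg_level and the sample record are made up:

```shell
# kmsg_level RECORD: print the log level name of one /dev/kmsg record.
# Assumed record layout: "prio,seq,usec,flags;message"; level = prio & 7.
kmsg_level() {
    prio=${1%%,*}                 # text before the first comma
    case $((prio & 7)) in
        0) echo emerg ;;  1) echo alert ;;  2) echo crit ;;  3) echo err ;;
        4) echo warn ;;   5) echo notice ;; 6) echo info ;;  7) echo debug ;;
    esac
}

kmsg_level '6,339,5140900,-;NET: Registered protocol family 10'   # prints: info
```

Masking with 7 matters for messages injected from userspace, whose priority also carries a facility in the upper bits.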

Format Specifiers for Kernel Types

The kernel extends printf with kernel-specific format specifiers:

pr_info("IP address: %pI4\n", &addr);      // Prints 192.168.1.1
pr_info("IPv6: %pI6\n", &addr6);           // Prints full IPv6
pr_info("MAC: %pM\n", dev->dev_addr);      // Prints aa:bb:cc:dd:ee:ff
pr_info("Device: %s\n", netdev->name);     // eth0, etc.
pr_info("Function: %ps\n", func_ptr);      // Symbolic function name
pr_info("Backtrace: %pS\n", addr);         // Symbol + offset

These are documented in Documentation/core-api/printk-formats.rst.

Rate Limiting

Never use pr_info() in a hot path (e.g., per-packet). Use rate-limited variants:

pr_info_ratelimited("packet received, len=%u\n", skb->len);  // skb->len is unsigned int
// Or:
net_info_ratelimited("something happened\n");  // Networking-specific
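
To get a feel for what rate limiting does to your log, here is a rough userspace model of a burst-per-interval limiter. It is a simplification, not the kernel's implementation; the function name is hypothetical, and the interval and burst are parameters (the kernel defaults are 10 messages per 5 seconds):

```shell
# ratelimit_model INTERVAL BURST: read one timestamp (seconds) per line and
# show which messages a burst-per-interval limiter would emit.
# A userspace approximation, not the kernel's ___ratelimit() itself.
ratelimit_model() {
    awk -v interval="$1" -v burst="$2" '
        NR == 1 || $1 - start > interval {
            # A new window begins: report what the previous one dropped.
            if (missed) print missed " callbacks suppressed"
            start = $1; printed = 0; missed = 0
        }
        {
            if (printed < burst) { printed++; print "emit t=" $1 }
            else missed++
        }
        END { if (missed) print missed " callbacks suppressed" }
    '
}

# Five messages at t=0,0,0,1,6 with at most 2 messages per 5 seconds
printf '%s\n' 0 0 0 1 6 | ratelimit_model 5 2
```

With these inputs the first two messages are emitted, the next two are dropped, and the message at t=6 opens a new window after a "2 callbacks suppressed" line, which is the same kind of summary you will see in dmesg.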

ftrace — The Kernel Function Tracer

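ftrace is a built-in tracing framework. No recompilation is needed if your kernel was built with CONFIG_FUNCTION_TRACER=y (plus CONFIG_FUNCTION_GRAPH_TRACER=y for the graph tracer), which most distribution configs enable. If function does not appear in available_tracers, enable those options and rebuild.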

Enable and Use via tracefs

# In the guest
cd /sys/kernel/tracing

# See available tracers
cat available_tracers

# Trace all function calls
echo function > current_tracer
echo 1 > tracing_on

# Do something (e.g., ping)
ping -c 1 10.0.2.2

echo 0 > tracing_on
cat trace | head -50

Expect output like:

# tracer: function
#              TASK-PID    CPU#  TIMESTAMP  FUNCTION
#               | |         |        |         |
            ping-234   [001]  1234.567: ip_output <-ip_local_out
            ping-234   [001]  1234.567: ip_finish_output <-ip_output
            ping-234   [001]  1234.567: dev_queue_xmit <-ip_finish_output2

Trace Specific Functions

# Only trace TCP functions
echo 'tcp_*' > set_ftrace_filter
echo function > current_tracer
echo 1 > tracing_on

# Make a TCP connection in the guest
curl http://10.0.2.2:8080 2>/dev/null || true

echo 0 > tracing_on
cat trace | head -30
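
The enable, run, disable sequence is easy to get wrong by hand, for example by leaving tracing on afterwards. A small wrapper can do it around a single command. This is a sketch with a hypothetical name; the tracefs directory is passed as an argument so you can try it against a scratch directory first:

```shell
# trace_run TRACEFS_DIR FILTER COMMAND [ARGS...]
# Trace only functions matching FILTER while COMMAND runs, then stop tracing.
# The wrapper name is illustrative; it needs root when pointed at real tracefs.
trace_run() {
    tracefs=$1
    filter=$2
    shift 2
    echo "$filter" > "$tracefs/set_ftrace_filter" || return 1
    echo function  > "$tracefs/current_tracer"    || return 1
    : > "$tracefs/trace"               # truncating the file clears the buffer
    echo 1 > "$tracefs/tracing_on"
    "$@"
    status=$?
    echo 0 > "$tracefs/tracing_on"     # always stop, even if COMMAND failed
    return "$status"
}

# In the guest, as root:
#   trace_run /sys/kernel/tracing 'tcp_*' curl -s http://10.0.2.2:8080
#   head -30 /sys/kernel/tracing/trace
```

The wrapper returns the exit status of the traced command, so it can be used in scripts that care whether the command succeeded.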

Function Graph Tracer

Shows call depth and the time spent in each function:

echo function_graph > current_tracer
echo 'tcp_v4_connect' > set_graph_function
echo 1 > tracing_on

# trigger a TCP connect...

echo 0 > tracing_on
cat trace

Output shows the call tree:

  0)               |  tcp_v4_connect() {
  0)               |    inet_hash_connect() {
  0)   0.123 us    |      inet_sk_port_offset();
  0)   1.456 us    |    }
  0)               |    tcp_connect() {
  0)   0.234 us    |      tcp_init_nondata_skb();
  0)   0.567 us    |      tcp_transmit_skb();
  0)   3.789 us    |    }
  0) + 12.345 us   |  }

This is incredibly valuable for understanding how networking functions call each other.
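
When the call tree gets large, it can help to pull out just the leaf calls and sort them by duration. A rough sketch, assuming the column layout shown above (duration, then "us", then "|", then the function); the helper name is made up:

```shell
# graph_leaf_times: read function_graph output on stdin and print
# "duration_us function" for leaf calls (lines ending in "();"), slowest first.
graph_leaf_times() {
    awk -F'|' '
        $2 ~ /\(\);/ {
            # Left of the bar: CPU, optional marker, duration, "us".
            n = split($1, left, " ")
            dur = ""
            for (i = 2; i <= n; i++)
                if (left[i] == "us") dur = left[i - 1]
            name = $2
            gsub(/[ ;()]/, "", name)
            if (dur != "") print dur, name
        }
    ' | sort -k1,1nr
}

# In the guest: graph_leaf_times < /sys/kernel/tracing/trace | head
```

Closing-brace lines are skipped on purpose, since by default they carry a duration but no function name.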

Dynamic Debug

Enable debug messages at runtime without recompiling:

# See all available debug messages
cat /sys/kernel/debug/dynamic_debug/control | head -20

# Enable debug messages for a specific file
echo 'file tcp_input.c +p' > /sys/kernel/debug/dynamic_debug/control

# Enable for a specific function
echo 'func tcp_rcv_established +p' > /sys/kernel/debug/dynamic_debug/control

# Disable
echo 'file tcp_input.c -p' > /sys/kernel/debug/dynamic_debug/control

Requires CONFIG_DYNAMIC_DEBUG=y (usually enabled).
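
After toggling a few sites it is easy to lose track of what is still enabled. The control file lists every call site with its flags, where =_ means nothing is enabled. A small filter, assuming the usual "filename:lineno [module]function flags format" layout; the helper name is made up:

```shell
# dyndbg_enabled: read a dynamic_debug control listing on stdin and print
# the call sites whose flags are anything other than "=_" (all off).
dyndbg_enabled() {
    awk '!/^#/ && $3 != "=_" { print $1, $2, $3 }'
}

# In the guest, as root:
#   dyndbg_enabled < /sys/kernel/debug/dynamic_debug/control
```

An empty result means no debug call sites are currently active.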

/proc and /sys for Network Debugging

The kernel exposes enormous amounts of networking state:

# TCP connection state
cat /proc/net/tcp

# Network statistics
cat /proc/net/netstat

# Interface statistics
cat /proc/net/dev

# Routing table
cat /proc/net/route
ip route show

# Socket info
ss -tnp
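
The address columns in /proc/net/tcp are hexadecimal, which makes the file hard to read at a glance. A minimal decoder for one IPv4 address:port pair, assuming a little-endian machine such as x86 (where the address bytes appear reversed); the helper name is made up:

```shell
# decode_tcp_addr HEXADDR:HEXPORT: decode one address column of /proc/net/tcp.
# Assumes a little-endian machine (e.g. x86), where the IPv4 bytes are reversed.
decode_tcp_addr() {
    addr=${1%%:*}
    port=${1##*:}
    b1=$(printf '%s' "$addr" | cut -c7-8)
    b2=$(printf '%s' "$addr" | cut -c5-6)
    b3=$(printf '%s' "$addr" | cut -c3-4)
    b4=$(printf '%s' "$addr" | cut -c1-2)
    printf '%d.%d.%d.%d:%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4" "0x$port"
}

decode_tcp_addr 0100007F:0016   # prints: 127.0.0.1:22
```

The state column next to the addresses is also hexadecimal: 01 is ESTABLISHED (the same value used in the sk->sk_state == 1 breakpoint condition earlier) and 0A is LISTEN.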

KASAN and Other Debug Options

Enable these with scripts/config (or through menuconfig) for development builds:

scripts/config --enable CONFIG_KASAN            # Kernel Address Sanitizer
scripts/config --enable CONFIG_KASAN_INLINE     # Faster KASAN
scripts/config --enable CONFIG_LOCKDEP          # Lock dependency engine (pulled in by PROVE_LOCKING)
scripts/config --enable CONFIG_PROVE_LOCKING    # Locking correctness
scripts/config --enable CONFIG_DEBUG_ATOMIC_SLEEP # Detect sleeping in atomic
scripts/config --enable CONFIG_DEBUG_LIST        # Linked list debug
scripts/config --enable CONFIG_FORTIFY_SOURCE    # Buffer overflow detection
make olddefconfig

Warning: KASAN significantly slows the kernel (2-3x). Use it during testing, not for performance benchmarks. But it catches real bugs — including bugs in networking code.

Exercises

  1. Boot your kernel with -s -S. Connect GDB. Set a breakpoint on tcp_v4_connect. From the guest, run curl http://10.0.2.2 and watch GDB hit the breakpoint. Print the backtrace.
  2. Add a pr_info("carlos: tcp_sendmsg called, len=%zu\n", msg->msg_iter.count); to tcp_sendmsg() in net/ipv4/tcp.c (count is a size_t, hence %zu). Rebuild, boot, send some data, and check dmesg.
  3. Use ftrace's function_graph tracer to trace dev_queue_xmit. Ping something and read the call graph. Identify every function between your ping and the virtual NIC.
  4. Enable dynamic debug for net/ipv4/tcp_input.c. Make a TCP connection and read the debug output in dmesg.
  5. Enable KASAN, rebuild, and boot. Run some networking workloads. Check dmesg for any KASAN reports (there may be none — that's fine, it means no bugs found).

What's Next

Next week we learn kernel-specific C idioms — linked lists, error handling, memory allocation, and locking. This is the "language" you need to speak to write kernel code.