A Linux Kernel Tracing Tutorial

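This article shows by example how to use the Linux kernel's tracing infrastructure for debugging, focusing on the traditional printk-style debugging used to track down driver initialization problems. It covers how to set up tracing and how to analyze, step by step, what goes wrong while a driver loads.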


Posted by Cliff Brake on 2012-01-31

The Linux kernel has a fairly extensive tracing infrastructure that is quite useful for debugging. There are a number of things you can do with tracing, but this article focuses on the traditional printk-style debugging we often end up doing to trace initialization issues with a driver. The following links provide additional information on the Linux kernel tracing infrastructure:

In this example, I am working on a new audio driver. The typical experience with a new driver is that you install it and nothing happens, because something is not registered correctly with the Linux driver model. So the first thing I do is start with the platform_device_add() call in my driver's init function. To observe the kernel activity around the kernel platform code, I can do the following:

cd /sys/kernel/debug/tracing/
echo 0 > tracing_on                  # keep the trace from filling up until the filter is set
echo function_graph > current_tracer
echo 'platform*' > set_ftrace_filter # quoted so the shell does not expand the glob
echo 1 > tracing_on
cat trace_pipe                       # leave running in a different shell
<insmod my driver>
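
If you repeat these steps often, they can be wrapped in a small shell function. The following is only a sketch: the function name setup_graph_trace and the TRACING_DIR variable are my own hypothetical names, and it assumes debugfs is mounted at /sys/kernel/debug and that you are running as root.

```shell
#!/bin/sh
# Sketch: wrap the manual setup steps in one function.
# setup_graph_trace and TRACING_DIR are hypothetical names, not part of ftrace.

TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

setup_graph_trace() {
    # $1: space-separated function globs, for example 'platform*'
    filter="$1"
    echo 0 > "$TRACING_DIR/tracing_on" || return 1          # pause while configuring
    echo function_graph > "$TRACING_DIR/current_tracer" || return 1
    echo "$filter" > "$TRACING_DIR/set_ftrace_filter" || return 1
    echo 1 > "$TRACING_DIR/tracing_on" || return 1           # resume tracing
}

# Example (as root):
#   setup_graph_trace 'platform*'
#   cat "$TRACING_DIR/trace_pipe"      # in a different shell
#   <insmod my driver>
```

Keeping the directory in a variable also makes it possible to point the function at a scratch directory when trying it out.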

After executing the above, we see the following. For this example, trace_pipe is preferred over the trace file because reading from it empties the buffer, so only new information is shown.

0) + 30.518 us   |  platform_device_alloc();
0)               |  platform_device_add() {
0)   0.000 us    |    platform_uevent();
0) + 30.518 us   |  platform_uevent();
0)   0.000 us    |  platform_uevent();
0) + 30.518 us   |    platform_match();
0) + 30.518 us   |    platform_match();
0)   0.000 us    |    platform_match();
0)   0.000 us    |    platform_match();

...

0) + 30.518 us   |    platform_match();
0)   0.000 us    |    platform_match();
0)   0.000 us    |    platform_match();
0)   0.000 us    |    platform_match();
0)   0.000 us    |    platform_match();
0) ! 3936.767 us |  }
0) + 30.518 us   |  platform_uevent();
0) + 30.518 us   |  platform_device_alloc();

From the above, I can conclude that platform_match() is not succeeding, because I would expect some more activity (such as a probe call) after a successful match. At this point I chose to add a trace_printk():
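
When the trace is long, a per-function call count makes this kind of "something is missing" conclusion easier to reach. Here is a rough sketch that summarizes saved function_graph output. The function name count_calls is hypothetical, and it assumes the line layout shown above, with the function name after the "|" column.

```shell
#!/bin/sh
# Sketch: count how often each function appears in function_graph output.
# count_calls is a hypothetical helper name.

count_calls() {
    # stdin: function_graph output; stdout: "<count> <function>" lines
    awk -F'|' '
        NF >= 2 {
            line = $2
            sub(/^[ \t]+/, "", line)                  # drop the nesting indentation
            if (match(line, /^[A-Za-z_][A-Za-z0-9_]*\(\)/)) {
                name = substr(line, 1, RLENGTH - 2)   # strip the trailing "()"
                count[name]++
            }
        }
        END { for (n in count) print count[n], n }
    ' | LC_ALL=C sort -k1,1nr -k2,2
}

# Example:
#   cat trace_pipe > /tmp/trace.txt    (stop with Ctrl-C after loading the driver)
#   count_calls < /tmp/trace.txt
```

In the trace above this would report many platform_match() calls and no probe call at all, which is the hint that no driver matched the device.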

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 7a24895..f9ce0c7 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -662,6 +662,8 @@ static int platform_match(struct device *dev, struct device_driver *drv)
        struct platform_device *pdev = to_platform_device(dev);
        struct platform_driver *pdrv = to_platform_driver(drv);

+       trace_printk("pdev->name = %s, drv->name = %s", pdev->name, drv->name);
+
        /* Attempt an OF style match first */
        if (of_driver_match_device(dev, drv))
                return 1;

Now, if I re-run the trace, I see the following:

 0)               |      /* pdev->name = soc_audio, drv->name = davinci_emac */
 0)   0.000 us    |    }
 0)               |    platform_match() {
 0)               |      /* pdev->name = soc_audio, drv->name = snd-soc-dummy */
 0)   0.000 us    |    }
 0)               |    platform_match() {
 0)               |      /* pdev->name = soc_audio, drv->name = soc-audio */
 0)   0.000 us    |    }
 0)               |    platform_match() {
 0)               |      /* pdev->name = soc_audio, drv->name = omap-pcm-audio */
 0)   0.000 us    |    }
 0) ! 4241.943 us |  } /* platform_device_add */

From the above, it looks like we have a simple mismatch between the device name “soc_audio” and the driver name “soc-audio.” Fixing this problem and re-installing the module, we now have:
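
A mismatch like this is easy to miss by eye when there are dozens of platform_match() calls. As a sketch, the name pairs can be pulled out of the trace and checked for names that differ only in '_' versus '-'. The helper name find_near_misses is hypothetical, and the pattern assumes the exact trace_printk format string used in the patch above.

```shell
#!/bin/sh
# Sketch: flag device/driver name pairs that differ only by '_' versus '-'.
# find_near_misses is a hypothetical helper name.

find_near_misses() {
    # stdin: function_graph output containing the trace_printk comment lines
    sed -n 's|.*pdev->name = \([^,]*\), drv->name = \([^ ]*\) \*/.*|\1 \2|p' |
    sort -u |
    while read -r dev drv; do
        # Normalize '_' to '-' on both sides before comparing.
        ndev=$(printf '%s' "$dev" | tr '_' '-')
        ndrv=$(printf '%s' "$drv" | tr '_' '-')
        if [ "$dev" != "$drv" ] && [ "$ndev" = "$ndrv" ]; then
            echo "near miss: device '$dev' vs driver '$drv'"
        fi
    done
}

# Example:
#   find_near_misses < /tmp/trace.txt
```

This only catches the underscore/hyphen case seen here; other kinds of typo would need a different comparison.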

 0)               |    platform_match() {
 0)               |      /* pdev->name = soc-audio, drv->name = snd-soc-dummy */
 0)   0.000 us    |    }
 0)               |    platform_match() {
 0)               |      /* pdev->name = soc-audio, drv->name = soc-audio */
 0)   0.000 us    |    }
 0) + 91.553 us   |    platform_drv_probe();
 0) ! 4241.943 us |  } /* platform_device_add */

Now we can see that the names match and the probe function is being called. At this point, we may want to turn on tracing of some additional functions to try to determine what happens next:

echo "platform* snd* mydriver*" > set_ftrace_filter
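
Before widening the filter, it can be worth checking that each glob actually matches something, since a typo in a prefix silently gives you less trace than you expected. The sketch below counts matches against a function list such as the tracing directory's available_filter_functions file. The helper name check_filter_globs is hypothetical, and it uses shell pattern matching as an approximation of the kernel's own glob handling.

```shell
#!/bin/sh
# Sketch: count how many listed functions each glob would select.
# check_filter_globs is a hypothetical helper name; shell 'case' patterns
# only approximate ftrace's glob matching.

check_filter_globs() {
    # $1: file with one function per line (optionally followed by "[module]")
    # remaining arguments: quoted globs such as 'platform*'
    list="$1"
    shift
    for glob in "$@"; do
        n=0
        while read -r fn rest; do
            case "$fn" in
                $glob) n=$((n + 1)) ;;
            esac
        done < "$list"
        echo "$glob: $n"
    done
}

# Example:
#   check_filter_globs available_filter_functions 'platform*' 'snd*' 'mydriver*'
```

A plain shell loop is slow on the full function list, so this is meant for an occasional sanity check rather than for scripting.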

And the result:

 0)               |      /* pdev->name = soc-audio, drv->name = snd-soc-dummy */
 0)   0.000 us    |    }
 0)               |    platform_match() {
 0)               |      /* pdev->name = soc-audio, drv->name = soc-audio */
 0) + 30.517 us   |    }
 0)               |    platform_drv_probe() {
 0)               |      snd_soc_register_card() {
 0) + 30.518 us   |        snd_soc_instantiate_cards();
 0) ! 17852.78 us |      }
 0) ! 17883.30 us |    }
 0) ! 22125.24 us |  } /* platform_device_add */
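
The duration column is also useful on its own. In function_graph output the "+" and "!" markers flag calls that took longer than 10 and 100 microseconds respectively, so a quick way to find where the time goes is to sort the trace by duration. A minimal sketch, with slowest_calls as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: list the longest-running entries from function_graph output.
# slowest_calls is a hypothetical helper name.

slowest_calls() {
    # stdin: function_graph output; $1: number of entries to print (default 5)
    awk -F'|' '
        NF >= 2 && match($1, /[0-9]+(\.[0-9]+)? us/) {
            dur = substr($1, RSTART, RLENGTH - 3)   # numeric part, without " us"
            text = $2
            sub(/^[ \t]+/, "", text)                # drop the nesting indentation
            print dur, text
        }
    ' | LC_ALL=C sort -k1,1nr | head -n "${1:-5}"
}

# Example:
#   slowest_calls 3 < /tmp/trace.txt
```

Closing braces without a trailing comment do not name their function, so the result usually has to be read next to the original trace.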

With the above additional information, we can continue to learn more about the flow through the kernel.

While all of the above could have been done with printk statements, it would have been more time-consuming. The kernel function tracing capabilities allow us to quickly get a high-level view of the flow through the kernel without manually adding a bunch of printk statements. The kernel tracing features are completely contained in the kernel and require no additional user-space utilities, which makes them very convenient to use in embedded systems. The low overhead is also important in resource-constrained embedded systems.
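
Because everything is driven through files, putting the tracer back into its idle state when you are done also needs nothing more than echo. The following is a sketch; reset_trace and TRACING_DIR are hypothetical names, and it assumes the same debugfs mount point used earlier.

```shell
#!/bin/sh
# Sketch: return the tracer to an idle state after a debugging session.
# reset_trace and TRACING_DIR are hypothetical names.

TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

reset_trace() {
    echo 0 > "$TRACING_DIR/tracing_on" || return 1        # stop recording
    echo nop > "$TRACING_DIR/current_tracer" || return 1  # select the no-op tracer
    echo > "$TRACING_DIR/set_ftrace_filter" || return 1   # an empty filter selects all functions
    echo > "$TRACING_DIR/trace" || return 1               # clear the buffer
    echo 1 > "$TRACING_DIR/tracing_on" || return 1        # leave tracing switched on
}

# Example (as root):
#   reset_trace
```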

 

 
