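Note: this article is based on kernel version linux-6.1.31. Please credit the source when reposting.
Introduction to eBPF
eBPF (extended Berkeley Packet Filter) is a kernel technology that lets user-defined programs run inside the kernel. It was originally designed for network packet filtering, but has since expanded to other areas such as performance analysis, security monitoring, and system tracing.
eBPF aims to provide a safe and efficient way to extend kernel functionality without modifying kernel source code. The kernel provides a sandboxed virtual machine (the eBPF virtual machine) in which user-supplied code, called eBPF programs, can run. These programs are expressed in an intermediate instruction set (eBPF bytecode), typically compiled from restricted C with clang/LLVM, and the in-kernel verifier checks and restricts them when they are loaded to ensure safety and reliability.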
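Key features and uses of eBPF:
- Flexibility: eBPF lets developers write fairly high-level program logic that handles events and data inside the kernel. It offers a rich set of APIs and helper functions for accessing and operating on kernel data structures and event streams.
- Safety: eBPF programs are strictly verified and restricted before they run, so that they cannot damage the system. They execute in the sandboxed virtual machine provided by the kernel, with a restricted instruction set and limited resources, which prevents malicious code from running.
- High performance: because eBPF programs run in kernel space, they can access and operate on kernel data directly, without context switches between user space and kernel space. This low-overhead execution makes eBPF well suited to high-performance packet processing, performance analysis, and tracing.
- Extensibility: eBPF programs can be loaded and unloaded dynamically at run time, so kernel functionality can be added, removed, or updated without rebooting the system or modifying the kernel.
- Application areas: eBPF is widely used in several domains. In networking, it is used for packet filtering, traffic monitoring, network security, and network function virtualization. For system performance, it is used for system call tracing, function tracing, dynamic tracing, and performance tuning. It is also used for security monitoring and for resource management in container and cloud environments.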
Packages required for eBPF
sudo apt install -y clang llvm libelf-dev libbpf-dev bpfcc-tools libbpfcc-dev binutils-dev pkg-config m4 gcc-multilib libpcap-dev
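After installing the packages, it can help to confirm that the tools are actually on PATH. The helper below is a minimal sketch: missing_tools is a hypothetical name, and the list of commands in the example is an assumption. In particular, pahole is not installed by the apt command above; on Debian and Ubuntu it normally comes from the dwarves package.

```shell
# missing_tools NAME...: print each command name that is not found on PATH.
# Always returns 0; empty output means every listed tool was found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
    return 0
}

# Example (the tool list is an assumption): missing_tools clang llc pahole
```

Any name it prints needs to be installed before building the kernel with BTF or compiling the samples.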
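Required kernel configuration options
The option the kernel ultimately needs is CONFIG_DEBUG_INFO_BTF=y. To enable it, some other options have to be added or removed:
Some options conflict with CONFIG_DEBUG_INFO_BTF and must be disabled.
Some options are prerequisites of CONFIG_DEBUG_INFO_BTF and must be enabled.
Analysis
lib/Kconfig.debug lists the conditions under which DEBUG_INFO_BTF can be enabled: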
config DEBUG_INFO_BTF
	bool "Generate BTF typeinfo"
	depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED
	depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST
	depends on BPF_SYSCALL
	depends on !DEBUG_INFO_DWARF5 || PAHOLE_VERSION >= 121
	help
	  Generate deduplicated BTF type information from DWARF debug info.
	  Turning this on expects presence of pahole tool, which will convert
	  DWARF type info into equivalent deduplicated BTF type info.
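From this we can conclude:
Options to disable:
DEBUG_INFO_SPLIT
DEBUG_INFO_REDUCED
GCC_PLUGIN_RANDSTRUCT (unless COMPILE_TEST is set)
DEBUG_INFO_DWARF5 (unless pahole is version 1.21 or newer, i.e. PAHOLE_VERSION >= 121)
Options that must be enabled:
CONFIG_BPF_SYSCALL=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
Reason
In lib/Kconfig.debug, DEBUG_INFO_BTF sits inside the "if DEBUG_INFO" block, so it can only take effect when DEBUG_INFO is enabled.
DEBUG_INFO in turn is selected by any one of three entries in the following choice (DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT, DEBUG_INFO_DWARF4, or DEBUG_INFO_DWARF5); this article uses the toolchain default: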
choice
	prompt "Debug information"
	depends on DEBUG_KERNEL
	help
	  Selecting something other than "None" results in a kernel image
	  that will include debugging info resulting in a larger kernel image.
	  This adds debug symbols to the kernel and modules (gcc -g), and
	  is needed if you intend to use kernel crashdump or binary object
	  tools like crash, kgdb, LKCD, gdb, etc on the kernel.

	  Choose which version of DWARF debug info to emit. If unsure,
	  select "Toolchain default".

config DEBUG_INFO_NONE
	bool "Disable debug information"
	help
	  Do not build the kernel with debugging information, which will
	  result in a faster and smaller build.

config DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
	bool "Rely on the toolchain's implicit default DWARF version"
	select DEBUG_INFO
	depends on !CC_IS_CLANG || AS_IS_LLVM || CLANG_VERSION < 140000 || (AS_IS_GNU && AS_VERSION >= 23502 && AS_HAS_NON_CONST_LEB128)
	help
	  The implicit default version of DWARF debug info produced by a
	  toolchain changes over time.

	  This can break consumers of the debug info that haven't upgraded to
	  support newer revisions, and prevent testing newer versions, but
	  those should be less common scenarios.

config DEBUG_INFO_DWARF4
	bool "Generate DWARF Version 4 debuginfo"
	select DEBUG_INFO
	depends on !CC_IS_CLANG || AS_IS_LLVM || (AS_IS_GNU && AS_VERSION >= 23502)
	help
	  Generate DWARF v4 debug info. This requires gcc 4.5+, binutils 2.35.2
	  if using clang without clang's integrated assembler, and gdb 7.0+.

	  If you have consumers of DWARF debug info that are not ready for
	  newer revisions of DWARF, you may wish to choose this or have your
	  config select this.

config DEBUG_INFO_DWARF5
	bool "Generate DWARF Version 5 debuginfo"
	select DEBUG_INFO
	depends on !CC_IS_CLANG || AS_IS_LLVM || (AS_IS_GNU && AS_VERSION >= 23502 && AS_HAS_NON_CONST_LEB128)
	help
	  Generate DWARF v5 debug info. Requires binutils 2.35.2, gcc 5.0+ (gcc
	  5.0+ accepts the -gdwarf-5 flag but only had partial support for some
	  draft features until 7.0), and gdb 8.0+.

	  Changes to the structure of debug info in Version 5 allow for around
	  15-18% savings in resulting image and debug info section sizes as
	  compared to DWARF Version 4. DWARF Version 5 standardizes previous
	  extensions such as accelerators for symbol indexing and the format
	  for fission (.dwo/.dwp) files. Users may not want to select this
	  config if they rely on tooling that has not yet been updated to
	  support DWARF Version 5.
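The dependency analysis can be turned into a quick check of an existing .config. The following is only a sketch: check_btf_config is a hypothetical helper name, and it covers just the unconditional requirements from the Kconfig excerpts (BPF_SYSCALL and DEBUG_INFO_BTF enabled; DEBUG_INFO_NONE, DEBUG_INFO_SPLIT and DEBUG_INFO_REDUCED not enabled). The conditional cases (GCC_PLUGIN_RANDSTRUCT with COMPILE_TEST, DEBUG_INFO_DWARF5 with a new enough pahole) are deliberately left out.

```shell
# check_btf_config FILE: check a kernel .config against the unconditional
# DEBUG_INFO_BTF dependencies. Prints one line per problem found and
# returns 0 only if there are none.
check_btf_config() {
    cfg="$1"
    status=0
    # Options that must be enabled.
    for opt in CONFIG_BPF_SYSCALL CONFIG_DEBUG_INFO_BTF; do
        if ! grep -q "^${opt}=y\$" "$cfg"; then
            echo "missing: ${opt}=y"
            status=1
        fi
    done
    # Options that always block DEBUG_INFO_BTF when enabled.
    for opt in CONFIG_DEBUG_INFO_NONE CONFIG_DEBUG_INFO_SPLIT CONFIG_DEBUG_INFO_REDUCED; do
        if grep -q "^${opt}=y\$" "$cfg"; then
            echo "conflict: ${opt}=y"
            status=1
        fi
    done
    return "$status"
}

# Example: check_btf_config /path/to/linux/.config
```

An empty output with exit status 0 suggests the configuration is at least consistent with the requirements discussed here; it does not replace running the kernel's own configuration tools.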
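Test examples
Because eBPF is evolving quickly and its interfaces change often, most examples found online do not work as-is. This article therefore uses the sample code shipped with the kernel for verification.
Code path: samples/bpf
Build
make -C samples/bpf
Run a sample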
$ cd samples/bpf
$ sudo ./cpustat
Fixing build errors in the samples
Error 1
The error output is:
/home/lizj/work/linux-stable/samples/bpf/test_lru_dist.c:35:8: error: redefinition of ‘struct list_head’
35 | struct list_head {
| ^~~~~~~~~
In file included from /home/lizj/work/linux-stable/samples/bpf/test_lru_dist.c:6:
./tools/include/linux/types.h:84:8: note: originally defined here
84 | struct list_head {
| ^~~~~~~~~
make[2]: *** [/home/lizj/work/linux-stable/samples/bpf/Makefile.target:58: /home/lizj/work/linux-stable/samples/bpf/test_lru_dist] Error 1
make[1]: *** [Makefile:2012: /home/lizj/work/linux-stable/samples/bpf] Error 2
make[1]: Leaving directory '/home/lizj/work/linux-stable'
make: *** [Makefile:269: all] Error 2
make: Leaving directory '/home/lizj/work/linux-stable/samples/bpf'
Temporary workaround
In test_lru_dist.c, comment out the duplicate definition that starts at line 35, as follows:
// struct list_head {
// struct list_head *next, *prev;
// };
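If the same workaround has to be reapplied after every checkout, it could be scripted. The sketch below is one possible approach, not part of the kernel tree: comment_out_struct is a hypothetical helper that uses sed to prefix a top-level struct definition with "// ". It assumes the definition starts with "struct NAME {" at the beginning of a line and ends with "};" at the beginning of a line.

```shell
# comment_out_struct NAME FILE: print FILE with the top-level definition
# "struct NAME { ... };" commented out line by line. FILE itself is not
# modified; redirect the output to a new file to apply the change.
comment_out_struct() {
    sed "/^struct $1 {\$/,/^};\$/ s|^|// |" "$2"
}

# Example:
#   comment_out_struct list_head samples/bpf/test_lru_dist.c > /tmp/test_lru_dist.c
```

Printing to standard output rather than editing in place makes it easy to review the result with diff before replacing the file.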
Error 2
The error output is:
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c: In function ‘install_accept_all_seccomp’:
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:24:28: error: array type has incomplete element type ‘struct sock_filter’
24 | struct sock_filter filter[] = {
| ^~~~~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:25:17: warning: implicit declaration of function ‘BPF_STMT’; did you mean ‘BPF_STX’? [-Wimplicit-function-declaration]
25 | BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
| ^~~~~~~~
| BPF_STX
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:27:16: error: variable ‘prog’ has initializer but incomplete type
27 | struct sock_fprog prog = {
| ^~~~~~~~~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:28:18: error: ‘struct sock_fprog’ has no member named ‘len’
28 | .len = (unsigned short)ARRAY_SIZE(filter),
| ^~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:28:24: warning: excess elements in struct initializer
28 | .len = (unsigned short)ARRAY_SIZE(filter),
| ^
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:28:24: note: (near initialization for ‘prog’)
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:29:18: error: ‘struct sock_fprog’ has no member named ‘filter’
29 | .filter = filter,
| ^~~~~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:29:27: warning: excess elements in struct initializer
29 | .filter = filter,
| ^~~~~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:29:27: note: (near initialization for ‘prog’)
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:27:27: error: storage size of ‘prog’ isn’t known
27 | struct sock_fprog prog = {
| ^~~~
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:27:27: warning: unused variable ‘prog’ [-Wunused-variable]
/home/lizj/work/linux-stable/samples/bpf/tracex5_user.c:24:28: warning: unused variable ‘filter’ [-Wunused-variable]
24 | struct sock_filter filter[] = {
| ^~~~~~
make[2]: *** [/home/lizj/work/linux-stable/samples/bpf/Makefile.target:75: /home/lizj/work/linux-stable/samples/bpf/tracex5_user.o] Error 1
make[1]: *** [Makefile:2012: /home/lizj/work/linux-stable/samples/bpf] Error 2
make[1]: Leaving directory '/home/lizj/work/linux-stable'
make: *** [Makefile:269: all] Error 2
make: Leaving directory '/home/lizj/work/linux-stable/samples/bpf'
Temporary workaround
This sample can be skipped for now. Edit samples/bpf/Makefile so that the two lines that build it are commented out:
# tprogs-y += tracex5
# tracex5-objs := tracex5_user.o $(TRACE_HELPERS)
With these two lines commented out, the build no longer compiles tracex5_user.c.
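The Makefile edit can also be expressed as a small filter. This is a sketch under assumptions: disable_sample is a hypothetical helper, and it relies on the sample being declared with exactly the "tprogs-y += NAME" and "NAME-objs := ..." lines mentioned in this workaround.

```shell
# disable_sample NAME MAKEFILE: print MAKEFILE with the "tprogs-y += NAME"
# and "NAME-objs := ..." lines commented out. MAKEFILE is not modified.
disable_sample() {
    sed -e "/^tprogs-y += $1\$/ s|^|# |" \
        -e "/^$1-objs := / s|^|# |" "$2"
}

# Example:
#   disable_sample tracex5 samples/bpf/Makefile > /tmp/Makefile.new
```

Other samples that fail to build in the same way could be skipped with the same helper, at the cost of not having those binaries available for testing.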
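Error 3
The error output is:
/home/lizj/work/linux-stable/samples/bpf/Makefile:362: *** Cannot find a vmlinux for VMLINUX_BTF at any of " /home/lizj/work/linux-stable/vmlinux", build the kernel or set VMLINUX_BTF or VMLINUX_H variable.  Stop.
make[1]: *** [Makefile:2012: /home/lizj/work/linux-stable/samples/bpf] Error 2
make[1]: Leaving directory '/home/lizj/work/linux-stable'
make: *** [Makefile:269: all] Error 2
Temporary workaround
Build with the following command, which points the build at the BTF exposed by the running kernel (this requires the running kernel itself to have been built with CONFIG_DEBUG_INFO_BTF=y):
make VMLINUX_BTF=/sys/kernel/btf/vmlinux -C samples/bpf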
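The choice of BTF source can be automated so that a vmlinux built in the tree is preferred when it exists. The helper below is a minimal sketch; pick_vmlinux_btf is a hypothetical name, and the optional second argument exists only so the fallback path can be overridden.

```shell
# pick_vmlinux_btf TREE [SYSFS_BTF]: print the path to pass as VMLINUX_BTF.
# Prefers TREE/vmlinux, then falls back to the running kernel's BTF file
# (default /sys/kernel/btf/vmlinux). Returns 1 if neither is available.
pick_vmlinux_btf() {
    tree="$1"
    sysfs_btf="${2:-/sys/kernel/btf/vmlinux}"
    if [ -f "$tree/vmlinux" ]; then
        echo "$tree/vmlinux"
    elif [ -r "$sysfs_btf" ]; then
        echo "$sysfs_btf"
    else
        return 1
    fi
}

# Example, run from the top of the kernel tree:
#   make VMLINUX_BTF="$(pick_vmlinux_btf "$PWD")" -C samples/bpf
```

If the function returns 1, neither source is available, which matches the situation reported by the error message: the kernel has to be built first, or a running kernel with BTF support is needed.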