[eBPF] Building a tracepoint program with bcc

Introduction

Tracepoint programs are a typical class of eBPF program.

The book "Linux Observability with BPF" introduces tracepoint programs as follows:

This type of program allows you to attach BPF programs to the tracepoint handler provided by the kernel. Tracepoint programs are defined with the type BPF_PROG_TYPE_TRACEPOINT. As you’ll see in Chapter 4, tracepoints are static marks in the kernel’s codebase that allow you to inject arbitrary code for tracing and debugging purposes. They are less flexible than kprobes, because they need to be defined by the kernel beforehand, but they are guaranteed to be stable after their introduction in the kernel. This gives you a much higher level of predictability when you want to debug your system.

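In short, tracepoint programs are less flexible than kprobes, but a tracepoint is guaranteed to stay stable once it has been introduced into the kernel. Every tracepoint has to be defined by the kernel in advance, and all available tracepoints are listed under /sys/kernel/debug/tracing/events.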
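
To see which tracepoints a given kernel exposes, the events directory can be walked from Python. The following is a minimal sketch, not part of bcc: the helper name list_tracepoints is hypothetical, it assumes the path used in this article (some systems mount tracefs at /sys/kernel/tracing instead), and reading the directory usually requires root.

```python
from pathlib import Path

# Location used in this article; adjust it if tracefs is mounted elsewhere
# on your system (assumption).
EVENTS_ROOT = Path("/sys/kernel/debug/tracing/events")


def list_tracepoints(root=EVENTS_ROOT, category=None):
    """Return "category:event" names found under the events directory.

    Results are ordered by category, then by event name.
    """
    root = Path(root)
    names = []
    for cat_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if category is not None and cat_dir.name != category:
            continue
        for event_dir in sorted(p for p in cat_dir.iterdir() if p.is_dir()):
            # Only a directory that contains a "format" file is an event.
            if (event_dir / "format").is_file():
                names.append(f"{cat_dir.name}:{event_dir.name}")
    return names
```

For example, list_tracepoints(category="tcp") returns the events of the tcp category; the two parts of each name are the two arguments later passed to TRACEPOINT_PROBE.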

Next, we build a tracepoint program with BCC.

BCC program and a brief walkthrough

The program comes from the official bcc examples: examples/tracing/urandomread.py

from __future__ import print_function
from bcc import BPF

# load BPF program
b = BPF(text="""
TRACEPOINT_PROBE(random, urandom_read) {
    // args is from /sys/kernel/debug/tracing/events/random/urandom_read/format
    bpf_trace_printk("%d\\n", args->got_bits);
    return 0;
}
""")

# header
print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "GOTBITS"))

# format output
while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    print("%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))

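  1. TRACEPOINT_PROBE(random, urandom_read): instruments a kernel tracepoint. Its two arguments are the category and the event name, which match the directory random/urandom_read under /sys/kernel/debug/tracing/events.
  2. args->got_bits: args is generated automatically and its members are the fields of the event. The fields differ from event to event, see below.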

Sample output:
[Figure: sample output of urandomread.py]
The fields of the urandom_read event are defined as follows.

Run:

cat /sys/kernel/debug/tracing/events/random/urandom_read/format

Output:

name: urandom_read
ID: 1239
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:int got_bits;	offset:8;	size:4;	signed:1;
	field:int pool_left;	offset:12;	size:4;	signed:1;
	field:int input_left;	offset:16;	size:4;	signed:1;

print fmt: "got_bits %d nonblocking_pool_entropy_left %d input_entropy_left %d", REC->got_bits, REC->pool_left, REC->input_left

The last line shows that three fields can be read from this event: got_bits, pool_left, and input_left.
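
Reading the field list by eye works for one event, but it can also be extracted programmatically. The following is a rough sketch: parse_format and FIELD_RE are hypothetical names introduced here, and the regular expression only assumes the "field:...; offset:...; size:...; signed:...;" layout shown in the output above.

```python
import re

# Matches lines such as:
#   field:int got_bits;  offset:8;  size:4;  signed:1;
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)


def parse_format(text, include_common=False):
    """Return the fields described in the text of a tracepoint format file."""
    fields = []
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if match is None:
            continue
        decl = match.group("decl").strip()
        # The name is the last token of the declaration, without an
        # array suffix such as "[4]".
        name = re.sub(r"\[.*\]$", "", decl.split()[-1]).lstrip("*")
        if name.startswith("common_") and not include_common:
            continue
        fields.append({
            "name": name,
            "decl": decl,
            "offset": int(match.group("offset")),
            "size": int(match.group("size")),
        })
    return fields


# Excerpt of the urandom_read format shown above.
SAMPLE = (
    "name: urandom_read\n"
    "format:\n"
    "\tfield:int common_pid;\toffset:4;\tsize:4;\tsigned:1;\n"
    "\n"
    "\tfield:int got_bits;\toffset:8;\tsize:4;\tsigned:1;\n"
    "\tfield:int pool_left;\toffset:12;\tsize:4;\tsigned:1;\n"
    "\tfield:int input_left;\toffset:16;\tsize:4;\tsigned:1;\n"
)

print([f["name"] for f in parse_format(SAMPLE)])
```

The names returned this way are the ones that can be used as args->NAME inside TRACEPOINT_PROBE. The common_* fields are skipped by default because they are shared by all events.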

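Applying the same approach to other tracepoints

The official example suggests a general workflow for building a tracepoint program.

First, look at the format of the event you want to trace. Here we use the tcp/tcp_receive_reset tracepoint as an example:

cat /sys/kernel/debug/tracing/events/tcp/tcp_receive_reset/format

Output: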

name: tcp_receive_reset
ID: 1467
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:const void * skaddr;	offset:8;	size:8;	signed:0;
	field:__u16 sport;	offset:16;	size:2;	signed:0;
	field:__u16 dport;	offset:18;	size:2;	signed:0;
	field:__u8 saddr[4];	offset:20;	size:4;	signed:0;
	field:__u8 daddr[4];	offset:24;	size:4;	signed:0;
	field:__u8 saddr_v6[16];	offset:28;	size:16;	signed:0;
	field:__u8 daddr_v6[16];	offset:44;	size:16;	signed:0;
	field:__u64 sock_cookie;	offset:64;	size:8;	signed:0;

print fmt: "sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c sock_cookie=%llx", REC->sport, REC->dport, REC->saddr, REC->daddr, REC->saddr_v6, REC->daddr_v6, REC->sock_cookie

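Here we print only the sport field. To export several fields at once, a perf output buffer (BPF_PERF_OUTPUT) can be used; for simplicity, this article keeps using bpf_trace_printk.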
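
For reference, the perf buffer variant might look like the sketch below. It is an illustration rather than the article's program: struct event_t and the helpers ipv4, format_event, and run are names introduced here, only the IPv4 fields are copied, and the memcpy/perf_submit pattern follows what bcc's own tools do for tracepoint arguments. Running it needs root and a kernel that provides this tracepoint.

```python
import socket
import struct

BPF_TEXT = r"""
struct event_t {
    u32 saddr;
    u32 daddr;
    u16 sport;
    u16 dport;
};
BPF_PERF_OUTPUT(events);

TRACEPOINT_PROBE(tcp, tcp_receive_reset) {
    struct event_t ev = {};
    // saddr and daddr are __u8[4] arrays in the format file, so copy the bytes.
    __builtin_memcpy(&ev.saddr, args->saddr, sizeof(ev.saddr));
    __builtin_memcpy(&ev.daddr, args->daddr, sizeof(ev.daddr));
    ev.sport = args->sport;
    ev.dport = args->dport;
    events.perf_submit(args, &ev, sizeof(ev));
    return 0;
}
"""


def ipv4(addr):
    """Turn a u32 holding four raw address bytes into dotted-quad text."""
    return socket.inet_ntop(socket.AF_INET, struct.pack("I", addr))


def format_event(saddr, sport, daddr, dport):
    """Format one event as "saddr:sport -> daddr:dport"."""
    return "%s:%d -> %s:%d" % (ipv4(saddr), sport, ipv4(daddr), dport)


def run():
    # Imported here so the helpers above can be used without bcc installed.
    from bcc import BPF

    b = BPF(text=BPF_TEXT)

    def handle(cpu, data, size):
        ev = b["events"].event(data)
        print(format_event(ev.saddr, ev.sport, ev.daddr, ev.dport))

    b["events"].open_perf_buffer(handle)
    while True:
        try:
            b.perf_buffer_poll()
        except KeyboardInterrupt:
            break
```

Calling run() as root starts the trace. Compared with bpf_trace_printk, each event arrives as one structured record, so several fields can be printed together without parsing a text line.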

Second, write the corresponding bcc program:

#!/usr/bin/python
#
# Example of instrumenting the tcp:tcp_receive_reset kernel tracepoint.
# Adapted from bcc's examples/tracing/urandomread.py.
#              For Linux, uses BCC, BPF. Embedded C.
#
# REQUIRES: Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support) and a kernel that
# provides the tcp:tcp_receive_reset tracepoint.
#
# Test by running this, then causing a local TCP socket to receive a reset (RST).
#
# Copyright 2016 Netflix, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")

from __future__ import print_function
from bcc import BPF
from bcc.utils import printb

# load BPF program
b = BPF(text="""
TRACEPOINT_PROBE(tcp, tcp_receive_reset) {
    // args is from /sys/kernel/debug/tracing/events/tcp/tcp_receive_reset/format
    bpf_trace_printk("%u\\n", args->sport);
    return 0;
}
""")

# header
print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "sport"))

# format output
while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    except KeyboardInterrupt:
        exit()
    printb(b"%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))

Finally, run the program:
[Figure: output of the tcp_receive_reset tracing program]
The program runs successfully.
