skb - Linux network buffers

1. Introduction

When I first wanted to learn about the Linux network stack, I wished that a document like this existed, but I never found one. After I had gained some basic knowledge of the Linux network stack internals, I wrote one.

I would be happy if this document is of use to other people trying to learn about the Linux kernel.

Please let me know of any bugs in this document. It should reflect kernel revision 2.4.0-test4.

2. skbuff's

skbuffs are the buffers in which the Linux kernel handles network packets. A packet is received by the network card, put into a skbuff and then passed to the network stack, which operates on that skbuff from then on.

2.1 struct sk_buff

The struct sk_buff is defined in <linux/skbuff.h> as follows:

next         next buffer in list
prev         previous buffer in list
list         list we are on
sk           socket we belong to
stamp        timeval we arrived at
dev          device we are leaving by
rx_dev       device we arrived at
h            transport layer header (tcp,udp,icmp,igmp,spx,raw)
nh           network layer header (ip,ipv6,arp,ipx,raw)
mac          link layer header
dst          FIXME:
cb           control buffer, used internally
len          length of actual data
csum         checksum
used         FIXME: data moved to user and not MSG_PEEK
is_clone     we are a clone
cloned       head may be cloned
pkt_type     packet class
ip_summed    driver fed us ip checksum
priority     packet queuing priority
users        user count
protocol     packet protocol from driver
security     security level of packet
truesize     real size of the buffer
head         pointer to head of buffer
data         data head pointer
tail         tail pointer
end          end pointer
destructor   destructor function
nfmark       netfilter mark
nfcache      netfilter internal caching info
nfct         associated connection, if any
tc_index     traffic control index

2.2 skb support functions

The sk_buff layer provides a number of skb support functions. This section briefly describes the most important ones.

allocation / free / copy / clone and expansion functions

struct sk_buff *alloc_skb(unsigned int size, int gfp_mask)

This function allocates a new skb. It is provided by the skb layer so that it can initialize some private data and keep memory statistics. The returned buffer has no headroom and a tailroom of /size/ bytes.
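To make the "no headroom, /size/ bytes of tailroom" statement concrete, here is a small userspace sketch of the four buffer pointers. It is not kernel code: struct toy_skb, toy_alloc(), toy_free() and toy_check_layout() are hypothetical names made up for this illustration, and the real alloc_skb() does more (statistics, private data).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy userspace stand-in for the buffer pointers of struct sk_buff. */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the packet data */
    unsigned char *tail;  /* end of the packet data */
    unsigned char *end;   /* end of the allocated buffer */
    unsigned int len;     /* length of actual data (tail - data) */
};

/* Invariant every buffer in this model must satisfy. */
static void toy_check_layout(const struct toy_skb *skb)
{
    assert(skb->head <= skb->data);
    assert(skb->data <= skb->tail);
    assert(skb->tail <= skb->end);
    assert(skb->len == (unsigned int)(skb->tail - skb->data));
}

/* Hypothetical analogue of alloc_skb(): an empty buffer in which
 * data == tail == head, so all `size` bytes are tailroom. */
static struct toy_skb *toy_alloc(unsigned int size)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->head;
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

static void toy_free(struct toy_skb *skb)
{
    if (!skb)
        return;
    free(skb->head);
    free(skb);
}
```

Right after toy_alloc(128), data - head is 0 (the headroom) and end - tail is 128 (the tailroom).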

void kfree_skb(struct sk_buff *skb)

Decrements the skb's usage count by one and frees the skb if no references are left.

struct sk_buff *skb_get(struct sk_buff *skb)

Increments the skb's usage count by one and returns a pointer to it.
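The interplay of skb_get() and kfree_skb() is plain reference counting. The following userspace sketch shows the idea. All toy_* names are hypothetical, the counter is a plain int where the kernel uses an atomic counter, and toy_freed_count exists only so that the effect can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the reference counting part of struct sk_buff. */
struct toy_skb {
    int users;            /* usage count, plays the role of skb->users */
    unsigned char *head;  /* the buffer that is released with the skb */
};

static int toy_freed_count;  /* number of real frees, for observation only */

/* A freshly allocated buffer starts with one reference. */
static struct toy_skb *toy_alloc(unsigned int size)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->users = 1;
    return skb;
}

/* Like skb_get(): take one more reference and hand the same skb back. */
static struct toy_skb *toy_get(struct toy_skb *skb)
{
    skb->users++;
    return skb;
}

/* Like kfree_skb(): drop one reference, free on the last one. */
static void toy_kfree(struct toy_skb *skb)
{
    if (!skb)
        return;
    assert(skb->users > 0);
    if (--skb->users > 0)
        return;               /* somebody else still holds the skb */
    free(skb->head);
    free(skb);
    toy_freed_count++;
}
```

Every holder calls toy_kfree() exactly once for each reference it owns. Only the last call releases the memory.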

struct sk_buff *skb_clone(struct sk_buff *skb, int gfp_mask)

This function clones a skb. Both copies share the packet data but have their own struct sk_buff. The new copy is not owned by any socket, and its reference count is 1.

struct sk_buff *skb_copy(const struct sk_buff *skb, int gfp_mask)

Makes a real copy of the skb, including packet data. This is needed if you wish to modify the packet data. The reference count of the new skb is 1.

struct sk_buff *skb_copy_expand(const struct sk_buff *skb, int new_headroom, int new_tailroom, int gfp_mask)

Makes a copy of the skb, including packet data. Additionally, the new skb has a headroom of /new_headroom/ bytes and a tailroom of /new_tailroom/ bytes.
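The difference between a clone and a copy is easiest to see when the data is modified afterwards. The sketch below is a simplified userspace model, not kernel code. All toy_* names are hypothetical, and the shared data is tracked with a separately allocated counter, which is an assumption of this model rather than the kernel's actual layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_skb {
    unsigned char *data;  /* packet data, possibly shared with clones */
    unsigned int len;     /* length of the packet data */
    int *dataref;         /* how many toy_skb structs point at data */
};

/* Create a buffer holding a private copy of the payload. */
static struct toy_skb *toy_alloc(const void *payload, unsigned int len)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->data = malloc(len);
    skb->dataref = malloc(sizeof(*skb->dataref));
    if (!skb->data || !skb->dataref) {
        free(skb->data);
        free(skb->dataref);
        free(skb);
        return NULL;
    }
    memcpy(skb->data, payload, len);
    skb->len = len;
    *skb->dataref = 1;
    return skb;
}

/* Like skb_clone(): new descriptor, same packet data. */
static struct toy_skb *toy_clone(const struct toy_skb *src)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    *skb = *src;          /* copies the pointers, not the data */
    (*skb->dataref)++;
    return skb;
}

/* Like skb_copy(): new descriptor and new packet data. */
static struct toy_skb *toy_copy(const struct toy_skb *src)
{
    return toy_alloc(src->data, src->len);
}

/* Is the packet data shared with another descriptor? */
static int toy_cloned(const struct toy_skb *skb)
{
    return *skb->dataref != 1;
}

/* Release the descriptor; the data goes away with its last user. */
static void toy_free(struct toy_skb *skb)
{
    if (!skb)
        return;
    if (--(*skb->dataref) == 0) {
        free(skb->data);
        free(skb->dataref);
    }
    free(skb);
}
```

Writing through a clone changes what the original sees, while writing through a copy does not. This is why a real copy is needed before modifying packet data.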

ancillary functions

int skb_cloned(struct sk_buff *skb)

Is the skb a clone?

int skb_shared(struct sk_buff *skb)

Is this skb shared (is the reference count > 1)?

operations on lists of skb's

struct sk_buff *skb_peek(struct sk_buff_head *list_)

peek a skb from front of the list; does not remove skb from the list

struct sk_buff *skb_peek_tail(struct sk_buff_head *list_)

peek a skb from tail of the list; does not remove skb from the list

__u32 skb_queue_len(struct sk_buff_head *list_)

return the length of the given skb list

void skb_queue_head(struct sk_buff_head *list_, struct sk_buff *newsk)

enqueue a skb at the head of a given list

void skb_queue_tail(struct sk_buff_head *list_, struct sk_buff *newsk)

enqueue a skb at the end of a given list.

struct sk_buff *skb_dequeue(struct sk_buff_head *list_)

dequeue a skb from the head of the given list.

struct sk_buff *skb_dequeue_tail(struct sk_buff_head *list_)

dequeue a skb from the tail of the given list
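The list operations above behave like a double-ended queue. The following userspace sketch models them with a circular doubly linked list and a sentinel node. All toy_* names are hypothetical, and the sketch does no locking, which the real queue functions have to take care of.

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb {
    struct toy_skb *next;  /* next buffer in list */
    struct toy_skb *prev;  /* previous buffer in list */
    int id;                /* stands in for the packet itself */
};

struct toy_skb_head {
    struct toy_skb sentinel;  /* sentinel.next is the front, sentinel.prev the back */
    unsigned int qlen;        /* number of buffers on the list */
};

static void toy_queue_init(struct toy_skb_head *list)
{
    list->sentinel.next = &list->sentinel;
    list->sentinel.prev = &list->sentinel;
    list->qlen = 0;
}

/* Link skb between two neighbouring nodes. */
static void toy_insert_between(struct toy_skb *skb, struct toy_skb *prev,
                               struct toy_skb *next, struct toy_skb_head *list)
{
    skb->next = next;
    skb->prev = prev;
    prev->next = skb;
    next->prev = skb;
    list->qlen++;
}

static void toy_queue_head(struct toy_skb_head *list, struct toy_skb *skb)
{
    toy_insert_between(skb, &list->sentinel, list->sentinel.next, list);
}

static void toy_queue_tail(struct toy_skb_head *list, struct toy_skb *skb)
{
    toy_insert_between(skb, list->sentinel.prev, &list->sentinel, list);
}

/* Peeking returns the buffer but leaves it on the list. */
static struct toy_skb *toy_peek(struct toy_skb_head *list)
{
    return list->qlen ? list->sentinel.next : NULL;
}

static struct toy_skb *toy_peek_tail(struct toy_skb_head *list)
{
    return list->qlen ? list->sentinel.prev : NULL;
}

/* Take skb off the list; a NULL skb (empty list) is passed through. */
static struct toy_skb *toy_unlink(struct toy_skb *skb, struct toy_skb_head *list)
{
    if (!skb)
        return NULL;
    assert(list->qlen > 0);
    skb->prev->next = skb->next;
    skb->next->prev = skb->prev;
    skb->next = NULL;
    skb->prev = NULL;
    list->qlen--;
    return skb;
}

static struct toy_skb *toy_dequeue(struct toy_skb_head *list)
{
    return toy_unlink(toy_peek(list), list);
}

static struct toy_skb *toy_dequeue_tail(struct toy_skb_head *list)
{
    return toy_unlink(toy_peek_tail(list), list);
}
```

Queueing at the tail and dequeueing from the head gives FIFO order. Queueing at the head puts a buffer in front of everything already waiting.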

operations on skb data

unsigned char *skb_put(struct sk_buff *skb, int len)

extends the data area of the skb at the tail. If this would exceed the total size of the buffer, the kernel will panic. A pointer to the first byte of the new data is returned.

unsigned char *skb_push(struct sk_buff *skb, int len)

extends the data area of the skb at the start, using up headroom. If there is not enough headroom, the kernel will panic. A pointer to the first byte of the new data is returned.

unsigned char *skb_pull(struct sk_buff *skb, int len)

remove data from the start of a buffer, returning the bytes to headroom. A pointer to the next data in the buffer is returned.
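The three functions only move the data and tail pointers around inside a fixed buffer. Here is a userspace sketch of that pointer arithmetic. The toy_* names are hypothetical, assert() stands in for the kernel panic, and toy_init() sets up some headroom by hand (the kernel has skb_reserve() for this, which is not covered in this document).

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb {
    unsigned char *head;  /* start of the buffer */
    unsigned char *data;  /* start of the packet data */
    unsigned char *tail;  /* end of the packet data */
    unsigned char *end;   /* end of the buffer */
    unsigned int len;     /* length of actual data */
};

/* Wrap a caller-supplied buffer, leaving `headroom` bytes free in front. */
static void toy_init(struct toy_skb *skb, unsigned char *buf,
                     unsigned int size, unsigned int headroom)
{
    assert(headroom <= size);
    skb->head = buf;
    skb->data = buf + headroom;
    skb->tail = buf + headroom;
    skb->end = buf + size;
    skb->len = 0;
}

/* Like skb_put(): grow the data area at the tail. */
static unsigned char *toy_put(struct toy_skb *skb, unsigned int len)
{
    unsigned char *old_tail = skb->tail;

    assert(len <= (size_t)(skb->end - skb->tail));   /* kernel: panic */
    skb->tail += len;
    skb->len += len;
    return old_tail;      /* first byte of the new data */
}

/* Like skb_push(): grow the data area at the front, into the headroom. */
static unsigned char *toy_push(struct toy_skb *skb, unsigned int len)
{
    assert(len <= (size_t)(skb->data - skb->head));  /* kernel: panic */
    skb->data -= len;
    skb->len += len;
    return skb->data;     /* first byte of the new data */
}

/* Like skb_pull(): give `len` bytes at the front back to the headroom. */
static unsigned char *toy_pull(struct toy_skb *skb, unsigned int len)
{
    if (len > skb->len)
        return NULL;      /* not that much data in the buffer */
    skb->data += len;
    skb->len -= len;
    return skb->data;     /* the next data in the buffer */
}
```

A typical send path would put the payload first and then push each header in front of it, while a receive path pulls the headers off one layer at a time.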

int skb_headroom(struct sk_buff *skb)

return the number of bytes of free space at the head of the skb

int skb_tailroom(struct sk_buff *skb)

return the number of bytes of free space at the end of the skb
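Both values are simple pointer differences. The userspace sketch below shows them, together with a check that could be made before extending a buffer. The toy_* names, and toy_fits() in particular, are hypothetical helpers made up for this example and have no counterpart named in this document.

```c
#include <assert.h>

struct toy_skb {
    unsigned char *head;  /* start of the buffer */
    unsigned char *data;  /* start of the packet data */
    unsigned char *tail;  /* end of the packet data */
    unsigned char *end;   /* end of the buffer */
};

/* Like skb_headroom(): free bytes in front of the data. */
static int toy_headroom(const struct toy_skb *skb)
{
    return (int)(skb->data - skb->head);
}

/* Like skb_tailroom(): free bytes behind the data. */
static int toy_tailroom(const struct toy_skb *skb)
{
    return (int)(skb->end - skb->tail);
}

/* Hypothetical helper: would a header of hdr_len bytes in front and
 * payload_len more bytes at the end fit without a new buffer? */
static int toy_fits(const struct toy_skb *skb, int hdr_len, int payload_len)
{
    assert(hdr_len >= 0 && payload_len >= 0);
    return toy_headroom(skb) >= hdr_len && toy_tailroom(skb) >= payload_len;
}
```

For a 100 byte buffer with the data occupying bytes 14 to 53, the headroom is 14 and the tailroom is 46.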

struct sk_buff *skb_cow(struct sk_buff *skb, int headroom)

if the buffer passed lacks sufficient headroom or is a clone it is copied and additional headroom made available.