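P4 Overview
Programming Protocol-independent Packet Processors (P4) is a domain-specific language for network devices. It specifies how data-plane devices (switches, NICs, routers, filters, etc.) process packets.
P4 Workflow
- The P4 program (prog.p4) classifies packets by their headers and defines the actions to take on incoming packets (e.g. forward, drop).
- The P4 compiler generates runtime mapping metadata (prog.p4info) that lets the control plane and the data plane communicate via P4Runtime.
- The P4 compiler also generates an executable for the target data plane (target_prog.bin) that specifies the header formats and the corresponding actions for the target device.
P4 Project Components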
BMV2
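bmv2 (Behavioral Model version 2) is a P4-programmable software switch implemented by the P4 project. bmv2 is not a production-grade switch; it is meant for developers to develop, test, and debug P4 programs.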
P4Runtime
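P4 is a language for programming the data plane; it defines the functionality the data plane supports. At runtime, however, the data plane still needs control information pushed down by the control plane in order to forward live traffic correctly. P4Runtime is a control-plane specification for controlling the forwarding plane of network devices.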
P4C
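p4c is the reference compiler for the P4 programming language. It supports both P4-14 and P4-16.
P4 in Practice
Setting Up the Development Environment
Reference: https://blog.youkuaiyun.com/father_is_/article/details/108225712
Writing a P4 Program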
/* -*- P4_16 -*- */
#include <core.p4>
#include <v1model.p4>

const bit<16> TYPE_IPV4 = 0x800;

typedef bit<9> egressSpec_t;
typedef bit<48> macAddr_t;
typedef bit<32> ip4Addr_t;

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16> etherType;
}

header ipv4_t {
    bit<4> version;
    bit<4> ihl;
    bit<8> diffserv;
    bit<16> totalLen;
    bit<16> identification;
    bit<3> flags;
    bit<13> fragOffset;
    bit<8> ttl;
    bit<8> protocol;
    bit<16> hdrChecksum;
    ip4Addr_t srcAddr;
    ip4Addr_t dstAddr;
}

struct metadata {
    /* empty */
}

struct headers {
    ethernet_t ethernet;
    ipv4_t ipv4;
}

parser ParserImpl(packet_in packet,
                  out headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    state start {
        transition parse_ethernet;
    }
    state parse_ethernet {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            TYPE_IPV4: parse_ipv4;
            default: accept;
        }
    }
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}

control VerifyChecksumImpl(inout headers hdr, inout metadata meta) {
    apply { }
}

control IngressImpl(inout headers hdr,
                    inout metadata meta,
                    inout standard_metadata_t standard_metadata) {
    action drop() {
        mark_to_drop(standard_metadata);
    }
    action ipv4_forward(egressSpec_t port) {
        standard_metadata.egress_spec = port;
        hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
    }
    table ipv4_lpm {
        key = {
            hdr.ipv4.dstAddr: lpm;
        }
        actions = {
            ipv4_forward;
            drop;
            NoAction;
        }
        size = 512;
        default_action = NoAction();
    }
    apply {
        if (hdr.ipv4.isValid()) {
            ipv4_lpm.apply();
        }
    }
}

control EgressImpl(inout headers hdr,
                   inout metadata meta,
                   inout standard_metadata_t standard_metadata) {
    apply { }
}

control ComputeChecksumImpl(inout headers hdr, inout metadata meta) {
    apply {
        update_checksum(
            hdr.ipv4.isValid(),
            { hdr.ipv4.version,
              hdr.ipv4.ihl,
              hdr.ipv4.diffserv,
              hdr.ipv4.totalLen,
              hdr.ipv4.identification,
              hdr.ipv4.flags,
              hdr.ipv4.fragOffset,
              hdr.ipv4.ttl,
              hdr.ipv4.protocol,
              hdr.ipv4.srcAddr,
              hdr.ipv4.dstAddr },
            hdr.ipv4.hdrChecksum,
            HashAlgorithm.csum16);
    }
}

control DeparserImpl(packet_out packet, in headers hdr) {
    apply {
        packet.emit(hdr.ethernet);
        packet.emit(hdr.ipv4);
    }
}

V1Switch(
    ParserImpl(),
    VerifyChecksumImpl(),
    IngressImpl(),
    EgressImpl(),
    ComputeChecksumImpl(),
    DeparserImpl()
) main;
Save the file as demo.p4.
Compiling
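To make the data-plane behaviour a little more concrete, here is a minimal Python sketch of what ipv4_forward and ComputeChecksumImpl do to the IPv4 header: decrement the TTL, then recompute the 16-bit one's-complement checksum over every field except the checksum itself. This is only an illustration, not part of the P4 toolchain. The function names and the sample header bytes are assumptions made for this example, and the sketch assumes a 20-byte header without options, just as the P4 program does.

```python
import struct

def csum16(data: bytes) -> int:
    """16-bit one's-complement checksum over an even-length byte string."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_ipv4(header: bytes) -> bytes:
    """Mimic ipv4_forward + ComputeChecksumImpl on a 20-byte IPv4 header."""
    fields = bytearray(header)
    fields[8] = (fields[8] - 1) & 0xFF      # ttl is bit<8>, so it wraps as in P4
    fields[10:12] = b"\x00\x00"             # the checksum field is left out of the sum
    fields[10:12] = struct.pack("!H", csum16(bytes(fields)))
    return bytes(fields)

# Hypothetical sample header: 192.168.0.1 -> 192.168.0.199, TTL 64, protocol UDP
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
sample = sample[:10] + struct.pack("!H", csum16(sample)) + sample[12:]

rewritten = forward_ipv4(sample)
print("before:", sample.hex())
print("after: ", rewritten.hex())
```

A header with a correct checksum sums to zero under csum16, which is a quick way to check the rewritten bytes.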
p4c -b bmv2 -o build demo.p4
-b: specifies the target
-o: specifies the output directory
After a successful compile, the build folder contains two files: demo.json and demo.p4i.
Building the Network Topology
- Create the virtual veth pair interfaces and disable IPv6 on them so that it does not interfere with the tests below:
sudo ip link add name veth0 type veth peer name veth1
sudo ip link set dev veth0 up
sudo ip link set dev veth1 up
sudo sysctl net.ipv6.conf.veth0.disable_ipv6=1
sudo sysctl net.ipv6.conf.veth1.disable_ipv6=1
sudo ip link add name veth2 type veth peer name veth3
sudo ip link set dev veth2 up
sudo ip link set dev veth3 up
sudo sysctl net.ipv6.conf.veth2.disable_ipv6=1
sudo sysctl net.ipv6.conf.veth3.disable_ipv6=1
sudo ip link add name veth4 type veth peer name veth5
sudo ip link set dev veth4 up
sudo ip link set dev veth5 up
sudo sysctl net.ipv6.conf.veth4.disable_ipv6=1
sudo sysctl net.ipv6.conf.veth5.disable_ipv6=1
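The veth commands follow a regular pattern (pair i is veth2i / veth2i+1), so they can be generated instead of typed by hand. The following is a small sketch under that assumption; the helper name veth_setup_commands is made up for this example, and the script only prints the commands rather than running them.

```python
from typing import List

def veth_setup_commands(num_pairs: int) -> List[str]:
    """Return the shell commands that create num_pairs veth pairs."""
    commands = []
    for i in range(num_pairs):
        a, b = f"veth{2 * i}", f"veth{2 * i + 1}"
        commands.append(f"sudo ip link add name {a} type veth peer name {b}")
        # Bring both ends up, then disable IPv6 on both ends
        commands += [f"sudo ip link set dev {dev} up" for dev in (a, b)]
        commands += [f"sudo sysctl net.ipv6.conf.{dev}.disable_ipv6=1" for dev in (a, b)]
    return commands

# Three pairs reproduce the veth0..veth5 setup used in this walkthrough
for line in veth_setup_commands(3):
    print(line)
```

Piping the printed lines into a shell would execute them; printing first makes it easy to review the commands before anything is changed.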
- Start the simple_switch switch:
cyquen@cyquen-virtual-machine:~/P4/demo$ sudo simple_switch --interface 0@veth0 --interface 1@veth2 --interface 2@veth4 build/demo.json &
[1] 2089
cyquen@cyquen-virtual-machine:~/P4/demo$ Calling target program-options parser
Adding interface veth0 as port 0
Adding interface veth2 as port 1
Adding interface veth4 as port 2
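Each --interface argument has the form port@interface, which binds a switch port number to a host interface. As a rough sketch, the command line above can be assembled from a port map like this; the helper name simple_switch_argv is an assumption for this example, and nothing is executed.

```python
import shlex

def simple_switch_argv(port_to_iface, json_path):
    """Build the simple_switch command line from a {port: interface} map."""
    argv = ["sudo", "simple_switch"]
    for port, iface in sorted(port_to_iface.items()):
        argv += ["--interface", f"{port}@{iface}"]
    argv.append(json_path)
    return argv

# Port map used in this walkthrough: the switch owns the even-numbered veth ends
ports = {0: "veth0", 1: "veth2", 2: "veth4"}
print(shlex.join(simple_switch_argv(ports, "build/demo.json")))
```

The odd-numbered ends (veth1, veth3, veth5) stay on the host side and are the ones used later for injecting and capturing packets.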
Installing Routes into the Switch's Routing Table
Next, we use the simple_switch_CLI command to control this P4 software switch:
cyquen@cyquen-virtual-machine:~/P4/demo$ simple_switch_CLI
Obtaining JSON from switch...
Done
Control utility for runtime P4 table manipulation
RuntimeCmd:
Use the show_tables command to list all tables, and the table_info command to view the details of a specific table:
RuntimeCmd: show_tables
IngressImpl.ipv4_lpm [implementation=None, mk=ipv4.dstAddr(lpm, 32)]
RuntimeCmd: table_info ipv4_lpm
IngressImpl.ipv4_lpm [implementation=None, mk=ipv4.dstAddr(lpm, 32)]
********************************************************************************
IngressImpl.drop []
IngressImpl.ipv4_forward [port(9)]
NoAction []
RuntimeCmd:
Use the table_add command to add routes to the switch. Here we assume:
- port 0 (veth0) connects to the 10.0.0.0/8 network
- port 1 (veth2) connects to the 20.0.0.0/8 network
- port 2 (veth4) connects to the 30.0.0.0/8 network
RuntimeCmd: table_add ipv4_lpm ipv4_forward 10.0.0.0/8 => 0
Adding entry to lpm match table ipv4_lpm
match key: LPM-0a:00:00:00/8
action: ipv4_forward
runtime data: 00:00
Entry has been added with handle 0
RuntimeCmd: table_add ipv4_lpm ipv4_forward 20.0.0.0/8 => 1
Adding entry to lpm match table ipv4_lpm
match key: LPM-14:00:00:00/8
action: ipv4_forward
runtime data: 00:01
Entry has been added with handle 1
RuntimeCmd: table_add ipv4_lpm ipv4_forward 30.0.0.0/8 => 2
Adding entry to lpm match table ipv4_lpm
match key: LPM-1e:00:00:00/8
action: ipv4_forward
runtime data: 00:02
Entry has been added with handle 2
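The three entries above populate an LPM (longest-prefix match) table: for each destination address, the matching entry with the longest prefix wins, and if nothing matches, the default action (NoAction here) is used. The following Python sketch models that lookup. The LpmTable class is an illustrative assumption and not how bmv2 implements the table, and the extra 20.1.0.0/16 entry is hypothetical, added only to show a longer prefix winning.

```python
import ipaddress

class LpmTable:
    """Toy model of the ipv4_lpm table: longest matching prefix wins."""

    def __init__(self, default_action="NoAction"):
        self.entries = []                 # list of (network, action, args)
        self.default_action = default_action

    def table_add(self, prefix, action, *args):
        self.entries.append((ipaddress.ip_network(prefix), action, args))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [e for e in self.entries if addr in e[0]]
        if not matches:
            return (self.default_action, ())
        _, action, args = max(matches, key=lambda e: e[0].prefixlen)
        return (action, args)

table = LpmTable()
table.table_add("10.0.0.0/8", "ipv4_forward", 0)
table.table_add("20.0.0.0/8", "ipv4_forward", 1)
table.table_add("30.0.0.0/8", "ipv4_forward", 2)
table.table_add("20.1.0.0/16", "drop")    # hypothetical, not in the walkthrough

for dst in ("20.0.0.1", "30.0.0.1", "20.1.2.3", "8.8.8.8"):
    print(dst, "->", table.lookup(dst))
```

A linear scan is enough for a handful of entries; real switches use tries or TCAMs for the same semantics.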
Use table_dump to view the entries that were added:
RuntimeCmd: table_dump ipv4_lpm
==========
TABLE ENTRIES
**********
Dumping entry 0x0
Match key:
* ipv4.dstAddr : LPM 0a000000/8
Action entry: IngressImpl.ipv4_forward - 00
**********
Dumping entry 0x1
Match key:
* ipv4.dstAddr : LPM 14000000/8
Action entry: IngressImpl.ipv4_forward - 01
**********
Dumping entry 0x2
Match key:
* ipv4.dstAddr : LPM 1e000000/8
Action entry: IngressImpl.ipv4_forward - 02
==========
Dumping default entry
Action entry: NoAction -
==========
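table_dump prints each match key as a hexadecimal address followed by the prefix length. A short sketch for turning those keys back into the dotted prefixes passed to table_add (the helper name decode_lpm_key is an assumption for this example):

```python
import ipaddress

def decode_lpm_key(dump_key: str) -> str:
    """Convert a key such as '14000000/8' into dotted notation ('20.0.0.0/8')."""
    hex_addr, prefix_len = dump_key.split("/")
    addr = ipaddress.IPv4Address(int(hex_addr, 16))
    return f"{addr}/{prefix_len}"

for key in ("0a000000/8", "14000000/8", "1e000000/8"):
    print(key, "->", decode_lpm_key(key))
```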
Testing the Switch's Layer-3 Forwarding
Use scapy to inject packets from veth1, and capture them with tcpdump on veth3 and veth5:
cyquen@cyquen-virtual-machine:~$ sudo scapy
[sudo] password for cyquen:
INFO: Can't import matplotlib. Won't be able to plot.
INFO: Can't import PyX. Won't be able to use psdump() or pdfdump().
WARNING: No route found for IPv6 destination :: (no default route?)
[Scapy ASCII-art startup banner: Welcome to Scapy, Version 2.4.3, https://github.com/secdev/scapy]
using IPython 7.13.0
>>>
cyquen@cyquen-virtual-machine:~$ sudo tcpdump -n -i veth3
[sudo] password for cyquen:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth3, link-type EN10MB (Ethernet), capture size 262144 bytes
cyquen@cyquen-virtual-machine:~$ sudo tcpdump -n -i veth5
[sudo] password for cyquen:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth5, link-type EN10MB (Ethernet), capture size 262144 bytes
- Inject a packet destined for 20.0.0.1 from veth1. veth3 receives the packet, while veth5 receives nothing:
>>> p = Ether()/IP(dst="20.0.0.1")/UDP()
>>> sendp(p, iface="veth1")
.
Sent 1 packets.
cyquen@cyquen-virtual-machine:~$ sudo tcpdump -n -i veth3
[sudo] password for cyquen:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth3, link-type EN10MB (Ethernet), capture size 262144 bytes
16:38:40.238448 IP 192.168.99.132.53 > 20.0.0.1.53: domain [length 0 < 12] (invalid)
- Inject a packet destined for 30.0.0.1 from veth1. veth5 receives the packet, while veth3 receives nothing:
>>> p = Ether()/IP(dst="30.0.0.1")/UDP()
>>> sendp(p, iface="veth1")
.
Sent 1 packets.
cyquen@cyquen-virtual-machine:~$ sudo tcpdump -n -i veth5
[sudo] password for cyquen:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth5, link-type EN10MB (Ethernet), capture size 262144 bytes
16:42:11.133833 IP 192.168.99.132.53 > 30.0.0.1.53: domain [length 0 < 12] (invalid)
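The "domain [length 0 < 12] (invalid)" note in the tcpdump output is not a forwarding error. Scapy's UDP() defaults both the source and destination port to 53, so tcpdump tries to decode the payload as DNS, but the payload is empty and therefore shorter than the 12-byte DNS header. The sketch below rebuilds an equivalent UDP header with the standard library to show the arithmetic. The helper udp_header is an assumption for this example, and the checksum is simply left at 0 here, whereas Scapy fills in a real one.

```python
import struct

DNS_HEADER_LEN = 12   # fixed size of a DNS message header
UDP_HEADER_LEN = 8

def udp_header(sport=53, dport=53, payload=b""):
    """Build a UDP datagram; ports default to 53 as in Scapy's UDP()."""
    length = UDP_HEADER_LEN + len(payload)
    # Checksum field left at 0 for brevity (an assumption of this sketch)
    return struct.pack("!HHHH", sport, dport, length, 0) + payload

datagram = udp_header()
payload_len = len(datagram) - UDP_HEADER_LEN
print("ports:", struct.unpack("!HH", datagram[:4]))
print("payload length:", payload_len, "< DNS header:", payload_len < DNS_HEADER_LEN)
```

Passing explicit ports or a payload, for example UDP(sport=1234, dport=5678) in Scapy, would presumably make tcpdump print a plain UDP line instead.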
As shown, the simple_switch running our own P4 program performed LPM forwarding as expected.
Done!