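OVN Architecture

OVN

OVN (Open Virtual Network) is a system for building logical networks. It provides logical switches for L2 forwarding, L2 to L4 ACLs for access control, and distributed logical routers for L3 forwarding. It supports several tunnel encapsulations (Geneve, STT, and VXLAN) and can use hardware ToR switches to connect physical and virtual networks.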

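OVN has two roles:

OVN central node: a single host takes this role in the setup described here. It runs the OVN northbound database and the OVN southbound database.
OVN compute node: every host that provides virtual machines. It runs ovn-controller.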

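OVN architecture
CMS: the cloud management system, which acts as OVN's client.
OVN Plugin: uses the OVSDB protocol to write the user's configuration into the northbound DB.
OVN northbound DB: describes the high-level logical network components, such as logical switches, logical routers, ACLs, and logical ports.
ovn-northd: watches the northbound DB for configuration changes and translates logical switches and logical routers into logical flows, which are organized into an ingress pipeline and an egress pipeline.
OVN southbound DB: holds logical network data, physical network data, and the bindings between the two. Logical network data (for example, how packets are forwarded) is written by ovn-northd. Physical network data (for example, each hypervisor's IP address and tunnel encapsulation format) and the bindings (for example, which hypervisor (HV) a logical port is attached to) are written by ovn-controller.
ovn-controller: one instance runs on every host. When it sees the southbound DB change, it installs OVS flows and, once they are installed, reports back to the southbound DB. Every OVN compute node has an integration bridge, br-int, created before ovn-controller starts (ovn-controller creates it itself if it is missing). VMs attach directly to br-int, and ovn-controller adds tunnel ports on br-int for traffic to other hosts. To reach a physical network, create a separate bridge, attach a physical NIC that connects to the external network, and link that bridge to br-int with patch ports.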
                                     CMS
                                      |
                                      |
                          +-----------|-----------+
                          |           |           |
                          |     OVN/CMS Plugin    |
                          |           |           |
                          |           |           |
                          |   OVN Northbound DB   |
                          |           |           |
                          |           |           |
                          |       ovn-northd      |
                          |           |           |
                          +-----------|-----------+
                                      |
                                      |
                            +-------------------+
                            | OVN Southbound DB |
                            +-------------------+
                                      |
                                      |
                   +------------------+------------------+
                   |                  |                  |
     HV 1          |                  |    HV n          |
   +---------------|---------------+  .  +---------------|---------------+
   |               |               |  .  |               |               |
   |        ovn-controller         |  .  |        ovn-controller         |
   |         |          |          |  .  |         |          |          |
   |         |          |          |     |         |          |          |
   |  ovs-vswitchd   ovsdb-server  |     |  ovs-vswitchd   ovsdb-server  |
   |                               |     |                               |
   +-------------------------------+     +-------------------------------+
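
The chassis-side wiring described in the component list (registering with the southbound DB, choosing a tunnel encapsulation, and attaching a provider bridge to br-int) can be sketched as follows. This is a minimal sketch rather than the author's setup: the address tcp:172.18.0.2:6642 assumes the control-plane IP used in the demo and the default southbound port, and the names br-ex, eth1, and physnet1 are hypothetical. The run helper is also an assumption of this sketch; it only prints each command unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

setup_chassis() {
    sb_remote=$1   # southbound DB address, e.g. tcp:172.18.0.2:6642
    encap_ip=$2    # this host's tunnel endpoint IP
    phys_nic=$3    # hypothetical NIC that faces the physical network

    # Tell ovn-controller where the southbound DB is and how to encapsulate.
    run ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-remote="$sb_remote" \
        external-ids:ovn-encap-type=geneve \
        external-ids:ovn-encap-ip="$encap_ip"

    # Separate bridge that owns the physical NIC.
    run ovs-vsctl --may-exist add-br br-ex
    run ovs-vsctl --may-exist add-port br-ex "$phys_nic"

    # Map a provider network name to that bridge; ovn-controller creates the
    # patch ports between br-int and br-ex when a localnet port refers to it.
    run ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-bridge-mappings=physnet1:br-ex
}

setup_chassis tcp:172.18.0.2:6642 172.18.0.4 eth1
```

Running it with DRY_RUN=0 on a compute node would apply the settings; the default only lists the four ovs-vsctl invocations.
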
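OVN workflow analysis:

1. When the CMS updates the northbound database, it also increments the nb_cfg sequence number in the NB_Global table as part of the same transaction.
2. When ovn-northd updates the southbound database from the northbound contents, it copies nb_cfg into the southbound SB_Global table in the same transaction, so the two databases always stay consistent.
3. Once the southbound write has been committed, ovn-northd records the value in sb_cfg in the northbound database, so the northbound database shows in real time how far northbound-to-southbound synchronization has progressed.
4. When ovn-controller receives the updated southbound contents, it installs the corresponding Open vSwitch flows; after they are installed, it writes the sequence number it has reached back to the southbound database.
5. ovn-northd watches those per-chassis values in the southbound database and copies the minimum into hv_cfg in the northbound database, so the CMS can track whether the hypervisors are consistent with the northbound configuration.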
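
The propagation steps above can be observed from the command line. The following is a minimal sketch, assuming it runs where ovn-nbctl can reach the northbound database: `ovn-nbctl --wait=hv sync` blocks until every chassis has caught up, and the nb_cfg, sb_cfg, and hv_cfg columns of NB_Global expose the three counters. The run helper is an assumption of this sketch and only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

check_sync() {
    # Block until ovn-northd and every ovn-controller have processed the
    # current northbound contents.
    run ovn-nbctl --wait=hv sync

    # nb_cfg: bumped by the client; sb_cfg: copied back once the southbound
    # DB has committed; hv_cfg: lowest value acknowledged by all chassis.
    for col in nb_cfg sb_cfg hv_cfg; do
        run ovn-nbctl get NB_Global . "$col"
    done
}

check_sync
```

When the three values are equal, the configuration has reached every hypervisor.
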

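Neutron:

Neutron is the OpenStack component that provides network services. It follows SDN principles and manages resources for virtualized networks.

Neutron framework
Neutron Server: exposes the OpenStack networking API, receives requests, and calls the Plugin to handle them.
Plugin: handles requests from Neutron Server, maintains the state of OpenStack logical networks, and calls the Agent to carry out the work.
Agent: handles requests from the Plugin and actually implements the network functions on the Network Provider.
Network Provider: the virtual or physical network device that provides network services, such as Linux Bridge, Open vSwitch, or a physical switch that supports Neutron.
Queue: Neutron Server, Plugin, and Agent communicate with and invoke one another through a message queue.
Database: stores OpenStack network state, including Network, Subnet, Port, and Router.
Neutron workflow analysis:

1. Neutron Server receives a request and calls the Plugin to handle it.
2. The Plugin saves the network data to the Database and then puts it on the Queue for an Agent to consume.
3. The Agent reads the data from the Queue and calls the Network Provider to implement the actual function.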
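
From a client's point of view, the workflow above starts with API requests such as the ones below. This is only a sketch using the OpenStack command-line client; the network name demo-net and the derived subnet and port names are hypothetical, and the run helper (an assumption of this sketch) prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

create_network() {
    net=$1    # hypothetical network name
    cidr=$2   # subnet range for that network

    # Each call is an API request that Neutron Server passes to the Plugin.
    run openstack network create "$net"
    run openstack subnet create --network "$net" --subnet-range "$cidr" "${net}-subnet"
    run openstack port create --network "$net" "${net}-port"
}

create_network demo-net 10.10.10.0/24
```
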

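OVN vs. Neutron

1. In OVN, all data is read and written through OVSDB, which replaces Neutron's message-queue mechanism.
2. Neutron's L3 functions run on the network node: all east-west traffic that crosses subnets has to be routed there, which makes the network node a bottleneck. With DVR, routing becomes distributed: every compute node can route, so east-west traffic is routed on the compute nodes without passing through the network node, which takes load off it. OVN provides distributed logical routers natively.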

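OVN demo

1. Initially the northbound database is empty, while the southbound database already records the physical network; each chassis corresponds to one node.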

ovn-nbctl show

ovn-sbctl show
Chassis "f07d9084-dd4e-4499-9605-314512a85c03"
    hostname: ovn-worker
    Encap geneve
        ip: "172.18.0.4"
        options: {csum="true"}
Chassis "8eec2547-ad18-49fc-9bac-f97e737b224b"
    hostname: ovn-worker2
    Encap geneve
        ip: "172.18.0.3"
        options: {csum="true"}
Chassis "62f3cdd0-1477-4d3a-a3c4-df8ba338e22c"
    hostname: ovn-control-plane
    Encap geneve
        ip: "172.18.0.2"
        options: {csum="true"}

2. Create a logical switch abc, a logical router lr, and their ports

ovn-nbctl ls-add abc
ovn-nbctl lsp-add abc abc-vm1
ovn-nbctl lsp-set-addresses abc-vm1 00:00:00:00:00:01
ovn-nbctl lsp-add abc abc-vm2
ovn-nbctl lsp-set-addresses abc-vm2 00:00:00:00:00:02

# On the switch, create the patch port that connects ls and lr
ovn-nbctl lsp-add abc ls-lr
ovn-nbctl lsp-set-addresses ls-lr "00:00:00:00:00:03"
ovn-nbctl lsp-set-type ls-lr router
ovn-nbctl lsp-set-options ls-lr router-port=lr-ls

# On the router, create the patch port that connects ls and lr
ovn-nbctl lr-add lr
ovn-nbctl lrp-add lr lr-ls 00:00:00:00:00:03 10.10.10.3/24
ovn-nbctl set Logical_Router lr options:chassis=f07d9084-dd4e-4499-9605-314512a85c03

# The northbound database now shows the logical switch, the logical router, and the patch ports that connect them
ovn-nbctl show
switch eb3a549f-ecd9-4802-b691-a2b9dd342206 (abc)
    port ls-lr
        type: router
        addresses: ["00:00:00:00:00:03"]
        router-port: lr-ls
    port abc-vm1
        addresses: ["00:00:00:00:00:01"]
    port abc-vm2
        addresses: ["00:00:00:00:00:02"]
router cc8abc0d-c1ce-441d-a8c0-dcbdb3c192cf (lr)
    port lr-ls
        mac: "00:00:00:00:00:03"
        networks: ["10.10.10.3/24"]

3. Create network namespaces to simulate VMs and attach them to the OVN ports abc-vm1 and abc-vm2

# On host ovn-worker:
ip netns add vm1
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 10.10.10.1/24 dev vm1
ip netns exec vm1 ip link set vm1 up
ovs-vsctl set Interface vm1 external_ids:iface-id=abc-vm1

# On host ovn-worker2:
ip netns add vm2
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 10.10.10.2/24 dev vm2
ip netns exec vm2 ip link set vm2 up
ovs-vsctl set Interface vm2 external_ids:iface-id=abc-vm2

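# The southbound database now shows how the logical ports are bound to the physical chassis (abc-vm1 on ovn-worker, abc-vm2 on ovn-worker2)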
ovn-sbctl show
Chassis "f07d9084-dd4e-4499-9605-314512a85c03"
    hostname: ovn-worker
    Encap geneve
        ip: "172.18.0.4"
        options: {csum="true"}
    Port_Binding abc-vm1
Chassis "8eec2547-ad18-49fc-9bac-f97e737b224b"
    hostname: ovn-worker2
    Encap geneve
        ip: "172.18.0.3"
        options: {csum="true"}
    Port_Binding abc-vm2
Chassis "62f3cdd0-1477-4d3a-a3c4-df8ba338e22c"
    hostname: ovn-control-plane
    Encap geneve
        ip: "172.18.0.2"
        options: {csum="true"}
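
The binding shown above comes from the iface-id set on the OVS interface: ovn-controller claims the logical port whose name matches it. A minimal sketch of how one might inspect both ends of that binding follows; the first command is meant for the hypervisor and the second for a host that can reach the southbound database. The run helper is an assumption of this sketch and prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

show_binding() {
    lport=$1   # logical port name in the northbound DB
    iface=$2   # OVS interface attached to the local br-int

    # On the hypervisor: the iface-id that ovn-controller matches on.
    run ovs-vsctl get Interface "$iface" external_ids:iface-id

    # Against the southbound DB: the row for the logical port; its chassis
    # column is filled in once an ovn-controller has claimed the port.
    run ovn-sbctl find Port_Binding logical_port="$lport"
}

show_binding abc-vm1 vm1
```
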
ovn-sbctl lflow-list abc
Datapath: "abc" (e7b55cb9-4ace-4fed-b1bf-798d2853a039)  Pipeline: ingress
table=0 (ls_in_port_sec_l2  ), priority=100  , match=(eth.src[40]), action=(drop;)
table=0 (ls_in_port_sec_l2  ), priority=100  , match=(vlan.present), action=(drop;)
table=0 (ls_in_port_sec_l2  ), priority=50   , match=(inport == "abc-vm1" && eth.src == {00:00:00:00:00:01}), action=(next;)
table=0 (ls_in_port_sec_l2  ), priority=50   , match=(inport == "abc-vm2"), action=(next;)
table=1 (ls_in_port_sec_ip  ), priority=0    , match=(1), action=(next;)
table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "abc-vm1" && eth.src == 00:00:00:00:00:01 && arp.sha == 00:00:00:00:00:01), action=(next;)
table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "abc-vm1" && eth.src == 00:00:00:00:00:01 && ip6 && nd && ((nd.sll == 00:00:00:00:00:00 || nd.sll == 00:00:00:00:00:01) || ((nd.tll == 00:00:00:00:00:00 || nd.tll == 00:00:00:00:00:01)))), action=(next;)
table=2 (ls_in_port_sec_nd  ), priority=80   , match=(inport == "abc-vm1" && (arp || nd)), action=(drop;)
table=2 (ls_in_port_sec_nd  ), priority=0    , match=(1), action=(next;)
table=3 (ls_in_lookup_fdb   ), priority=0    , match=(1), action=(next;)
table=4 (ls_in_put_fdb      ), priority=0    , match=(1), action=(next;)
table=5 (ls_in_pre_acl      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
table=5 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.mcast), action=(next;)
table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
table=6 (ls_in_pre_lb       ), priority=0    , match=(1), action=(next;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4 && sctp), action=(reg1 = ip4.dst; reg2[0..15] = sctp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4 && tcp), action=(reg1 = ip4.dst; reg2[0..15] = tcp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip4 && udp), action=(reg1 = ip4.dst; reg2[0..15] = udp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip6 && sctp), action=(xxreg1 = ip6.dst; reg2[0..15] = sctp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip6 && tcp), action=(xxreg1 = ip6.dst; reg2[0..15] = tcp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && ip6 && udp), action=(xxreg1 = ip6.dst; reg2[0..15] = udp.dst; ct_lb;)
table=7 (ls_in_pre_stateful ), priority=110  , match=(reg0[2] == 1), action=(ct_lb;)
table=7 (ls_in_pre_stateful ), priority=100  , match=(reg0[0] == 1), action=(ct_next;)
table=7 (ls_in_pre_stateful ), priority=0    , match=(1), action=(next;)
table=8 (ls_in_acl_hint     ), priority=65535, match=(1), action=(next;)
table=9 (ls_in_acl          ), priority=65535, match=(1), action=(next;)
table=10(ls_in_qos_mark     ), priority=0    , match=(1), action=(next;)
table=11(ls_in_qos_meter    ), priority=0    , match=(1), action=(next;)
table=12(ls_in_lb           ), priority=0    , match=(1), action=(next;)
table=13(ls_in_acl_after_lb ), priority=0    , match=(1), action=(next;)
table=14(ls_in_stateful     ), priority=100  , match=(reg0[1] == 1 && reg0[13] == 0), action=(ct_commit { ct_label.blocked = 0; }; next;)
table=14(ls_in_stateful     ), priority=100  , match=(reg0[1] == 1 && reg0[13] == 1), action=(ct_commit { ct_label.blocked = 0; ct_label.label = reg3; }; next;)
table=14(ls_in_stateful     ), priority=0    , match=(1), action=(next;)
table=15(ls_in_pre_hairpin  ), priority=0    , match=(1), action=(next;)
table=16(ls_in_nat_hairpin  ), priority=0    , match=(1), action=(next;)
table=17(ls_in_hairpin      ), priority=0    , match=(1), action=(next;)
table=18(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
table=19(ls_in_dhcp_options ), priority=0    , match=(1), action=(next;)
table=20(ls_in_dhcp_response), priority=0    , match=(1), action=(next;)
table=21(ls_in_dns_lookup   ), priority=0    , match=(1), action=(next;)
table=22(ls_in_dns_response ), priority=0    , match=(1), action=(next;)
table=23(ls_in_external_port), priority=0    , match=(1), action=(next;)
table=24(ls_in_l2_lkup      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(handle_svc_check(inport);)
table=24(ls_in_l2_lkup      ), priority=70   , match=(eth.mcast), action=(outport = "_MC_flood"; output;)
table=24(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 00:00:00:00:00:01), action=(outport = "abc-vm1"; output;)
table=24(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 00:00:00:00:00:02), action=(outport = "abc-vm2"; output;)
table=24(ls_in_l2_lkup      ), priority=0    , match=(1), action=(outport = get_fdb(eth.dst); next;)
table=25(ls_in_l2_unknown   ), priority=50   , match=(outport == "none"), action=(drop;)
table=25(ls_in_l2_unknown   ), priority=0    , match=(1), action=(output;)
Datapath: "abc" (e7b55cb9-4ace-4fed-b1bf-798d2853a039)  Pipeline: egress
table=0 (ls_out_pre_lb      ), priority=110  , match=(eth.mcast), action=(next;)
table=0 (ls_out_pre_lb      ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
table=0 (ls_out_pre_lb      ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
table=0 (ls_out_pre_lb      ), priority=0    , match=(1), action=(next;)
table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
table=2 (ls_out_pre_stateful), priority=110  , match=(reg0[2] == 1), action=(ct_lb;)
table=2 (ls_out_pre_stateful), priority=100  , match=(reg0[0] == 1), action=(ct_next;)
table=2 (ls_out_pre_stateful), priority=0    , match=(1), action=(next;)
table=3 (ls_out_acl_hint    ), priority=65535, match=(1), action=(next;)
table=4 (ls_out_acl         ), priority=65535, match=(1), action=(next;)
table=5 (ls_out_qos_mark    ), priority=0    , match=(1), action=(next;)
table=6 (ls_out_qos_meter   ), priority=0    , match=(1), action=(next;)
table=7 (ls_out_stateful    ), priority=100  , match=(reg0[1] == 1 && reg0[13] == 0), action=(ct_commit { ct_label.blocked = 0; }; next;)
table=7 (ls_out_stateful    ), priority=100  , match=(reg0[1] == 1 && reg0[13] == 1), action=(ct_commit { ct_label.blocked = 0; ct_label.label = reg3; }; next;)
table=7 (ls_out_stateful    ), priority=0    , match=(1), action=(next;)
table=8 (ls_out_port_sec_ip ), priority=0    , match=(1), action=(next;)
table=9 (ls_out_port_sec_l2 ), priority=100  , match=(eth.mcast), action=(output;)
table=9 (ls_out_port_sec_l2 ), priority=50   , match=(outport == "abc-vm1" && eth.dst == {00:00:00:00:00:01}), action=(output;)
table=9 (ls_out_port_sec_l2 ), priority=50   , match=(outport == "abc-vm2"), action=(output;)
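
In the listing above, abc-vm1 is matched together with its MAC address in the port-security stages (ls_in_port_sec_l2, ls_in_port_sec_nd, ls_out_port_sec_l2), while abc-vm2 is matched by port name only. That is the pattern ovn-northd generates when port security is enabled on abc-vm1 alone. The walkthrough does not show a command for this, so the following is an assumption about how such a state could be reached, not a record of what was run. The run helper is likewise an assumption and prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

lock_port() {
    lport=$1   # logical switch port to restrict
    mac=$2     # only frames from this source MAC are accepted

    # Enable port security, then read the setting back.
    run ovn-nbctl lsp-set-port-security "$lport" "$mac"
    run ovn-nbctl lsp-get-port-security "$lport"
}

lock_port abc-vm1 00:00:00:00:00:01
```
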

# vm1 pings vm2 successfully
root@ovn-worker:/# ip netns exec vm1 ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
64 bytes from 10.10.10.2: icmp_seq=1 ttl=64 time=1.34 ms
64 bytes from 10.10.10.2: icmp_seq=2 ttl=64 time=0.116 ms
64 bytes from 10.10.10.2: icmp_seq=3 ttl=64 time=0.133 ms
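
The same path can be replayed through the logical flows without sending real traffic by using ovn-trace, which takes a datapath name and a description of the packet. The sketch below assumes it runs where the southbound database is reachable; the run helper is an assumption of this sketch, prints the command unless DRY_RUN=0 is set, and drops the shell quoting around the packet description when it prints.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

trace_l2() {
    ls=$1       # logical switch (datapath) name
    inport=$2   # logical port the packet enters on
    src=$3      # source MAC
    dst=$4      # destination MAC

    # --summary prints a condensed view of the ingress and egress pipelines.
    run ovn-trace --summary "$ls" \
        "inport == \"$inport\" && eth.src == $src && eth.dst == $dst"
}

trace_l2 abc abc-vm1 00:00:00:00:00:01 00:00:00:00:00:02
```
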

# Add a static route on the router and specify the next hop
ovn-nbctl lr-route-add lr 20.20.0.0/24 10.10.10.3

# In the logical flows, table 11 (lr_in_ip_routing) matches the destination prefix and stores the next-hop IP in a register (reg0)
ovn-sbctl lflow-list lr
Datapath: "lr" (9a391808-9458-4fcf-b0d7-d105991e2462)  Pipeline: ingress
table=0 (lr_in_admission    ), priority=100  , match=(vlan.present || eth.src[40]), action=(drop;)
table=0 (lr_in_admission    ), priority=50   , match=(eth.dst == 00:00:00:00:00:03 && inport == "lr-ls"), action=(xreg0[0..47] = 00:00:00:00:00:03; next;)
table=0 (lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr-ls"), action=(xreg0[0..47] = 00:00:00:00:00:03; next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(arp.op == 2), action=(reg9[2] = lookup_arp(inport, arp.spa, arp.sha); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(inport == "lr-ls" && arp.spa == 10.10.10.0/24 && arp.op == 1), action=(reg9[2] = lookup_arp(inport, arp.spa, arp.sha); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(nd_na), action=(reg9[2] = lookup_nd(inport, nd.target, nd.tll); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(nd_ns), action=(reg9[2] = lookup_nd(inport, ip6.src, nd.sll); next;)
table=1 (lr_in_lookup_neighbor), priority=0    , match=(1), action=(reg9[2] = 1; next;)
table=2 (lr_in_learn_neighbor), priority=100  , match=(reg9[2] == 1), action=(next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(arp), action=(put_arp(inport, arp.spa, arp.sha); next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_na), action=(put_nd(inport, nd.target, nd.tll); next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_ns), action=(put_nd(inport, ip6.src, nd.sll); next;)
table=3 (lr_in_ip_input     ), priority=100  , match=(inport == "lr-ls" && ip4 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp4 {eth.dst <-> eth.src; icmp4.type = 11; /* Time exceeded */ icmp4.code = 0; /* TTL exceeded in transit */ ip4.dst = ip4.src; ip4.src = 10.10.10.3 ; ip.ttl = 254; outport = "lr-ls"; flags.loopback = 1; output; };)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip4.src == {10.10.10.3, 10.10.10.255} && reg9[0] == 0), action=(drop;)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip4.src_mcast ||ip4.src == 255.255.255.255 || ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8 || ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8), action=(drop;)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip6.dst == fe80::200:ff:fe00:3 && udp.src == 547 && udp.dst == 546), action=(reg0 = 0; handle_dhcpv6_reply;)
table=3 (lr_in_ip_input     ), priority=90   , match=(inport == "lr-ls" && arp.op == 1 && arp.tpa == 10.10.10.3 && arp.spa == 10.10.10.0/24), action=(eth.dst = eth.src; eth.src = xreg0[0..47]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[0..47]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
table=3 (lr_in_ip_input     ), priority=90   , match=(inport == "lr-ls" && ip6.dst == {fe80::200:ff:fe00:3, ff02::1:ff00:3} && nd_ns && nd.target == fe80::200:ff:fe00:3), action=(nd_na_router { eth.src = xreg0[0..47]; ip6.src = nd.target; nd.tll = xreg0[0..47]; outport = inport; flags.loopback = 1; output; };)
table=3 (lr_in_ip_input     ), priority=90   , match=(ip4.dst == 10.10.10.3 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
table=3 (lr_in_ip_input     ), priority=90   , match=(ip6.dst == fe80::200:ff:fe00:3 && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
table=3 (lr_in_ip_input     ), priority=85   , match=(arp || nd), action=(drop;)
table=3 (lr_in_ip_input     ), priority=84   , match=(nd_rs || nd_ra), action=(next;)
table=3 (lr_in_ip_input     ), priority=83   , match=(ip6.mcast_rsvd), action=(drop;)
table=3 (lr_in_ip_input     ), priority=82   , match=(ip4.mcast || ip6.mcast), action=(drop;)
table=3 (lr_in_ip_input     ), priority=60   , match=(ip6.dst == {fe80::200:ff:fe00:3}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=50   , match=(eth.bcast), action=(drop;)
table=3 (lr_in_ip_input     ), priority=30   , match=(ip4 && ip.ttl == {0, 1}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=0    , match=(1), action=(next;)
table=4 (lr_in_unsnat       ), priority=0    , match=(1), action=(next;)
table=5 (lr_in_defrag       ), priority=0    , match=(1), action=(next;)
table=6 (lr_in_dnat         ), priority=0    , match=(1), action=(next;)
table=7 (lr_in_ecmp_stateful), priority=0    , match=(1), action=(next;)
table=8 (lr_in_nd_ra_options), priority=0    , match=(1), action=(next;)
table=9 (lr_in_nd_ra_response), priority=0    , match=(1), action=(next;)
table=10(lr_in_ip_routing_pre), priority=0    , match=(1), action=(reg7 = 0; next;)
table=11(lr_in_ip_routing   ), priority=10550, match=(nd_rs || nd_ra), action=(drop;)
table=11(lr_in_ip_routing   ), priority=194  , match=(inport == "lr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=11(lr_in_ip_routing   ), priority=74   , match=(ip4.dst == 10.10.10.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 10.10.10.3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=11(lr_in_ip_routing   ), priority=73   , match=(reg7 == 0 && ip4.dst == 20.20.0.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 10.10.10.3; reg1 = 10.10.10.3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=12(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)
table=13(lr_in_policy       ), priority=0    , match=(1), action=(reg8[0..15] = 0; next;)
table=14(lr_in_policy_ecmp  ), priority=150  , match=(reg8[0..15] == 0), action=(next;)
table=15(lr_in_arp_resolve  ), priority=500  , match=(ip4.mcast || ip6.mcast), action=(next;)
table=15(lr_in_arp_resolve  ), priority=0    , match=(ip4), action=(get_arp(outport, reg0); next;)
table=15(lr_in_arp_resolve  ), priority=0    , match=(ip6), action=(get_nd(outport, xxreg0); next;)
table=16(lr_in_chk_pkt_len  ), priority=0    , match=(1), action=(next;)
table=17(lr_in_larger_pkts  ), priority=0    , match=(1), action=(next;)
table=18(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
table=19(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip4), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg1; arp.tpa = reg0; arp.op = 1; output; };)
table=19(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip6), action=(nd_ns { nd.target = xxreg0; output; };)
table=19(lr_in_arp_request  ), priority=0    , match=(1), action=(output;)
Datapath: "lr" (9a391808-9458-4fcf-b0d7-d105991e2462)  Pipeline: egress
table=0 (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[4] = 0; next;)
table=1 (lr_out_undnat      ), priority=0    , match=(1), action=(next;)
table=2 (lr_out_post_undnat ), priority=0    , match=(1), action=(next;)
table=3 (lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
table=3 (lr_out_snat        ), priority=0    , match=(1), action=(next;)
table=4 (lr_out_post_snat   ), priority=0    , match=(1), action=(next;)
table=5 (lr_out_egr_loop    ), priority=0    , match=(1), action=(next;)
table=6 (lr_out_delivery    ), priority=100  , match=(outport == "lr-ls"), action=(output;)
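
The priorities on the lr_in_ip_routing flows above are how longest-prefix matching is expressed: 74 for the connected 10.10.10.0/24, 73 for the static 20.20.0.0/24, and 194 for the connected fe80::/64. Those three values are consistent with priority = prefix length × 3 + an offset, where a connected route gets offset 2 and a static route offset 1. The function below is a sketch of that arithmetic only; the formula is inferred from this listing and may differ in other OVN versions, and route_priority is a hypothetical helper name.

```shell
# Priority of a routing flow as inferred from the listing above:
# prefix_length * 3 + offset (connected = 2, static = 1).
# This is an assumption drawn from the output, not a stable interface.
route_priority() {
    plen=$1
    case $2 in
        connected) ofs=2 ;;
        static)    ofs=1 ;;
        *) echo "unknown route kind: $2" >&2; return 1 ;;
    esac
    echo $(( plen * 3 + ofs ))
}

route_priority 24 connected   # 10.10.10.0/24 on lr-ls
route_priority 24 static      # 20.20.0.0/24 via 10.10.10.3
route_priority 64 connected   # fe80::/64 on lr-ls
```

Because a longer prefix always yields a higher priority, a more specific route wins, and for equal prefix lengths a connected route is preferred over a static one.
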

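# OVN NAT supports three types: snat, dnat, and dnat_and_snat. For snat the arguments are the translated IP followed by the original IP or subnet; for dnat they are the original IP followed by the translated IP
# DNAT appears in ingress table 6 (lr_in_dnat), and un-DNAT in egress table 1 (lr_out_undnat)
# SNAT appears in egress table 3 (lr_out_snat), and un-SNAT in ingress table 4 (lr_in_unsnat)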
ovn-nbctl lr-nat-add lr snat 116.85.0.1 10.0.0.0/24
ovn-nbctl lr-nat-add lr dnat 20.0.0.2 117.85.0.1
ovn-sbctl lflow-list lr
Datapath: "lr" (9a391808-9458-4fcf-b0d7-d105991e2462)  Pipeline: ingress
table=0 (lr_in_admission    ), priority=100  , match=(vlan.present || eth.src[40]), action=(drop;)
table=0 (lr_in_admission    ), priority=50   , match=(eth.dst == 00:00:00:00:00:03 && inport == "lr-ls"), action=(xreg0[0..47] = 00:00:00:00:00:03; next;)
table=0 (lr_in_admission    ), priority=50   , match=(eth.mcast && inport == "lr-ls"), action=(xreg0[0..47] = 00:00:00:00:00:03; next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(arp.op == 2), action=(reg9[2] = lookup_arp(inport, arp.spa, arp.sha); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(inport == "lr-ls" && arp.spa == 10.10.10.0/24 && arp.op == 1), action=(reg9[2] = lookup_arp(inport, arp.spa, arp.sha); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(nd_na), action=(reg9[2] = lookup_nd(inport, nd.target, nd.tll); next;)
table=1 (lr_in_lookup_neighbor), priority=100  , match=(nd_ns), action=(reg9[2] = lookup_nd(inport, ip6.src, nd.sll); next;)
table=1 (lr_in_lookup_neighbor), priority=0    , match=(1), action=(reg9[2] = 1; next;)
table=2 (lr_in_learn_neighbor), priority=100  , match=(reg9[2] == 1), action=(next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(arp), action=(put_arp(inport, arp.spa, arp.sha); next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_na), action=(put_nd(inport, nd.target, nd.tll); next;)
table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_ns), action=(put_nd(inport, ip6.src, nd.sll); next;)
table=3 (lr_in_ip_input     ), priority=100  , match=(inport == "lr-ls" && ip4 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp4 {eth.dst <-> eth.src; icmp4.type = 11; /* Time exceeded */ icmp4.code = 0; /* TTL exceeded in transit */ ip4.dst = ip4.src; ip4.src = 10.10.10.3 ; ip.ttl = 254; outport = "lr-ls"; flags.loopback = 1; output; };)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip4.src == {10.10.10.3, 10.10.10.255} && reg9[0] == 0), action=(drop;)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip4.src_mcast ||ip4.src == 255.255.255.255 || ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8 || ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8), action=(drop;)
table=3 (lr_in_ip_input     ), priority=100  , match=(ip6.dst == fe80::200:ff:fe00:3 && udp.src == 547 && udp.dst == 546), action=(reg0 = 0; handle_dhcpv6_reply;)
table=3 (lr_in_ip_input     ), priority=90   , match=(arp.op == 1 && arp.tpa == 116.85.0.1), action=(eth.dst = eth.src; eth.src = xreg0[0..47]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[0..47]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
table=3 (lr_in_ip_input     ), priority=90   , match=(arp.op == 1 && arp.tpa == 20.0.0.2), action=(eth.dst = eth.src; eth.src = xreg0[0..47]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[0..47]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
table=3 (lr_in_ip_input     ), priority=90   , match=(inport == "lr-ls" && arp.op == 1 && arp.tpa == 10.10.10.3 && arp.spa == 10.10.10.0/24), action=(eth.dst = eth.src; eth.src = xreg0[0..47]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[0..47]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
table=3 (lr_in_ip_input     ), priority=90   , match=(inport == "lr-ls" && ip6.dst == {fe80::200:ff:fe00:3, ff02::1:ff00:3} && nd_ns && nd.target == fe80::200:ff:fe00:3), action=(nd_na_router { eth.src = xreg0[0..47]; ip6.src = nd.target; nd.tll = xreg0[0..47]; outport = inport; flags.loopback = 1; output; };)
table=3 (lr_in_ip_input     ), priority=90   , match=(ip4.dst == 10.10.10.3 && icmp4.type == 8 && icmp4.code == 0), action=(ip4.dst <-> ip4.src; ip.ttl = 255; icmp4.type = 0; flags.loopback = 1; next; )
table=3 (lr_in_ip_input     ), priority=90   , match=(ip6.dst == fe80::200:ff:fe00:3 && icmp6.type == 128 && icmp6.code == 0), action=(ip6.dst <-> ip6.src; ip.ttl = 255; icmp6.type = 129; flags.loopback = 1; next; )
table=3 (lr_in_ip_input     ), priority=85   , match=(arp || nd), action=(drop;)
table=3 (lr_in_ip_input     ), priority=84   , match=(nd_rs || nd_ra), action=(next;)
table=3 (lr_in_ip_input     ), priority=83   , match=(ip6.mcast_rsvd), action=(drop;)
table=3 (lr_in_ip_input     ), priority=82   , match=(ip4.mcast || ip6.mcast), action=(drop;)
table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst == {10.10.10.3}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=60   , match=(ip6.dst == {fe80::200:ff:fe00:3}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=50   , match=(eth.bcast), action=(drop;)
table=3 (lr_in_ip_input     ), priority=30   , match=(ip4 && ip.ttl == {0, 1}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=0    , match=(1), action=(next;)
table=4 (lr_in_unsnat       ), priority=90   , match=(ip && ip4.dst == 116.85.0.1), action=(ct_snat;)
table=4 (lr_in_unsnat       ), priority=0    , match=(1), action=(next;)
table=5 (lr_in_defrag       ), priority=0    , match=(1), action=(next;)
table=6 (lr_in_dnat         ), priority=100  , match=(ip && ip4.dst == 20.0.0.2), action=(flags.loopback = 1; ct_dnat(117.85.0.1);)
table=6 (lr_in_dnat         ), priority=0    , match=(1), action=(next;)
table=7 (lr_in_ecmp_stateful), priority=0    , match=(1), action=(next;)
table=8 (lr_in_nd_ra_options), priority=0    , match=(1), action=(next;)
table=9 (lr_in_nd_ra_response), priority=0    , match=(1), action=(next;)
table=10(lr_in_ip_routing_pre), priority=0    , match=(1), action=(reg7 = 0; next;)
table=11(lr_in_ip_routing   ), priority=10550, match=(nd_rs || nd_ra), action=(drop;)
table=11(lr_in_ip_routing   ), priority=194  , match=(inport == "lr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=11(lr_in_ip_routing   ), priority=74   , match=(ip4.dst == 10.10.10.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 10.10.10.3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=11(lr_in_ip_routing   ), priority=73   , match=(reg7 == 0 && ip4.dst == 20.20.0.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 10.10.10.3; reg1 = 10.10.10.3; eth.src = 00:00:00:00:00:03; outport = "lr-ls"; flags.loopback = 1; next;)
table=12(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)
table=13(lr_in_policy       ), priority=0    , match=(1), action=(reg8[0..15] = 0; next;)
table=14(lr_in_policy_ecmp  ), priority=150  , match=(reg8[0..15] == 0), action=(next;)
table=15(lr_in_arp_resolve  ), priority=500  , match=(ip4.mcast || ip6.mcast), action=(next;)
table=15(lr_in_arp_resolve  ), priority=0    , match=(ip4), action=(get_arp(outport, reg0); next;)
table=15(lr_in_arp_resolve  ), priority=0    , match=(ip6), action=(get_nd(outport, xxreg0); next;)
table=16(lr_in_chk_pkt_len  ), priority=0    , match=(1), action=(next;)
table=17(lr_in_larger_pkts  ), priority=0    , match=(1), action=(next;)
table=18(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
table=19(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip4), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg1; arp.tpa = reg0; arp.op = 1; output; };)
table=19(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip6), action=(nd_ns { nd.target = xxreg0; output; };)
table=19(lr_in_arp_request  ), priority=0    , match=(1), action=(output;)
Datapath: "lr" (9a391808-9458-4fcf-b0d7-d105991e2462)  Pipeline: egress
table=0 (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[4] = 0; next;)
table=1 (lr_out_undnat      ), priority=50   , match=(ip), action=(flags.loopback = 1; ct_dnat;)
table=1 (lr_out_undnat      ), priority=0    , match=(1), action=(next;)
table=2 (lr_out_post_undnat ), priority=50   , match=(ip && ct.new), action=(ct_commit { } ; next; )
table=2 (lr_out_post_undnat ), priority=0    , match=(1), action=(next;)
table=3 (lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
table=3 (lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24), action=(ct_snat(116.85.0.1);)
table=3 (lr_out_snat        ), priority=0    , match=(1), action=(next;)
table=4 (lr_out_post_snat   ), priority=0    , match=(1), action=(next;)
table=5 (lr_out_egr_loop    ), priority=0    , match=(1), action=(next;)
table=6 (lr_out_delivery    ), priority=100  , match=(outport == "lr-ls"), action=(output;)
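
To inspect or undo the two NAT rules added above, the northbound side offers list and delete commands. A minimal sketch follows, assuming the rules are exactly the ones from this walkthrough: an snat rule is identified by its logical IP or subnet when deleting, and a dnat rule by its external IP. The run helper is an assumption of this sketch and prints the commands unless DRY_RUN=0 is set.

```shell
# Dry-run helper (an assumption of this sketch): print each command
# instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

nat_cleanup() {
    lr=$1   # logical router name

    # Show the NAT rules currently configured on the router.
    run ovn-nbctl lr-nat-list "$lr"

    # Remove the snat rule (keyed by logical subnet) and the dnat rule
    # (keyed by external IP) created earlier in the demo.
    run ovn-nbctl lr-nat-del "$lr" snat 10.0.0.0/24
    run ovn-nbctl lr-nat-del "$lr" dnat 20.0.0.2
}

nat_cleanup lr
```

After the rules are removed, ovn-sbctl lflow-list lr should no longer show the ct_snat and ct_dnat flows.
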