Analysis of Flannel, a mainstream Kubernetes network scheme

I Brief description of Flannel

Flannel is one of the CNI network plug-ins for Kubernetes clusters. It is essentially an overlay network. Flannel supports a variety of packet-forwarding backends, such as vxlan and host-gw.

II Characteristics of the Flannel network

  • Gives Docker containers created on different Node hosts in the cluster virtual IP addresses that are unique across the whole cluster.
  • Builds an overlay network through which data packets are transmitted intact to the target container. An overlay network is a virtual network built on top of another network and supported by its infrastructure; it separates the network service from the underlying infrastructure by encapsulating one packet inside another. After the encapsulated packet is forwarded to the endpoint, it is decapsulated.
  • Creates a new virtual network interface (flannel0 for the udp backend, flannel.1 for vxlan) to receive data from the docker bridge, and encapsulates and forwards (vxlan) the received data by maintaining a routing table.
  • Etcd ensures that the flanneld configuration on all nodes is consistent. At the same time, flanneld on each node watches for data changes in etcd, sensing changes to the cluster's nodes in real time.

III Explanation of each component

  • cni0
    Bridge device: each time a Pod is created, a veth pair is created with it; one end is eth0 inside the Pod and the other end is a port (interface) on the cni0 bridge. Traffic that the Pod sends from its eth0 network card arrives at the cni0 bridge device through this port.

  • The IP address assigned to the cni0 device is the first address of the subnet allocated to the node.

  • flannel.1
    The overlay network device, used to process vxlan messages (encapsulation and decapsulation). Pod traffic between different nodes is sent from this overlay device to the peer end through a tunnel.

  • flanneld
    Flanneld runs as an agent on each host. It obtains a small subnet for the host from the cluster's network address space, and the IP addresses of all containers on the host are allocated from it. At the same time, flanneld watches the k8s cluster database and provides the flannel.1 device with the MAC, IP, and other network information it needs when encapsulating data.
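
The subnet allocation described above can be sketched with Python's ipaddress module. The cluster CIDR and the /24 per-node prefix here are illustrative assumptions (they match flannel's common defaults, not necessarily this cluster's values):

```python
import ipaddress

# Assumed cluster Pod CIDR split into per-node /24 subnets,
# the way flanneld leases one subnet to each host.
cluster_cidr = ipaddress.ip_network("10.244.0.0/16")
node_subnets = list(cluster_cidr.subnets(new_prefix=24))

# Each node gets one subnet; Pod IPs on that node come from it,
# and the cni0 bridge takes the subnet's first usable address.
node1 = node_subnets[0]          # 10.244.0.0/24
cni0_ip = next(node1.hosts())    # 10.244.0.1
print(node1, cni0_ip)
```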

IV Pod communication flow across nodes

1) The data is generated in the Pod and sent to cni0 according to the Pod's routing table.
2) cni0 sends the data to the tunnel device flannel.1 according to the node's routing table.
3) flannel.1 checks the destination IP of the packet, obtains the necessary information about the peer tunnel device from flanneld, and encapsulates the packet.
4) flannel.1 sends the packet to the peer device. The network card of the peer node receives the packet, recognizes that it is an overlay packet, strips the outer layer, and hands the inner packet to its flannel.1 device.
5) The flannel.1 device checks the packet, matches it against the routing table, and sends the data to the cni0 device.
6) cni0 matches the routing table and sends the data to the corresponding port on the bridge.
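
One measurable consequence of the encapsulation in step 3: vxlan adds a fixed per-packet overhead, which is why a flannel.1 device typically shows an MTU of 1450 when the physical NIC's MTU is 1500. A small arithmetic sketch using the standard header sizes:

```python
# Per-packet overhead added by vxlan encapsulation (standard header sizes):
OUTER_IP  = 20   # outer IPv4 header (no options)
OUTER_UDP = 8    # outer UDP header
VXLAN_HDR = 8    # VXLAN header (flags + 24-bit VNI)
INNER_ETH = 14   # the encapsulated inner Ethernet frame header

overhead = OUTER_IP + OUTER_UDP + VXLAN_HDR + INNER_ETH
print(overhead)           # 50
print(1500 - overhead)    # 1450: typical MTU of flannel.1
```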

The flannel network (Pod CIDR) defined by the test kubernetes cluster is used in the following example to explain communication between different pods in the network.

pod1 route:
#kubectl -n stack exec -it api-0 -- bash
#ip route show
default via dev eth0
dev eth0 proto kernel scope link src
via dev eth0

pod2 route:
#kubectl -n stack exec -it redis-64c6c549ff-5plcq -- bash
#ip route show
default via dev eth0
via dev eth0
dev eth0 proto kernel scope link src

It can be seen that the default gateway of the Pod network card is the .1 address, which is the IP of cni0. Next, analyze where the traffic goes after it reaches the host.

Host routing
#ip route -n
default via dev eth0
dev eth0 proto kernel scope link src
dev eth1 proto kernel scope link src
dev docker0 proto kernel scope link src
dev cni0 proto kernel scope link src
via dev flannel.1 onlink
via dev flannel.1 onlink

Host routing
#ip route -n
default via dev eth0
dev eth0 proto kernel scope link src
dev eth1 proto kernel scope link src
dev docker0 proto kernel scope link src
via dev flannel.1 onlink
dev cni0 proto kernel scope link src
via dev flannel.1 onlink

It can be seen from the above routes that, by the longest-prefix match principle, packets destined for the peer node's Pod subnet match the route entry above and are sent to the gateway, and the gateway device is flannel.1.
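
The longest-prefix selection described above can be sketched in Python; the route table entries and addresses below are illustrative assumptions, not the cluster's real values:

```python
import ipaddress

# Illustrative routing table: (destination network, output device).
# The subnets are assumptions standing in for the cluster's real routes.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"),     "eth0"),       # default route
    (ipaddress.ip_network("10.244.0.0/24"), "cni0"),       # local Pod subnet
    (ipaddress.ip_network("10.244.1.0/24"), "flannel.1"),  # peer node subnet
]

def lookup(dst):
    """Pick the matching route with the longest prefix, as the kernel does."""
    dst = ipaddress.ip_address(dst)
    matching = [(net, dev) for net, dev in routes if dst in net]
    return max(matching, key=lambda r: r[0].prefixlen)[1]

print(lookup("10.244.1.7"))   # flannel.1 -> traffic enters the vxlan tunnel
print(lookup("8.8.8.8"))      # eth0      -> default route
```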

flannel.1 is a vxlan device. When a packet arrives at flannel.1, it needs to be encapsulated; at this point the dst IP and src IP are Pod addresses, and the MAC address corresponding to the destination IP must be known to build the packet. Here flannel.1 does not send an ARP request to obtain the destination's MAC address; instead, the Linux kernel sends an "L3 Miss" event request to the user-space flanneld program. After receiving the event from the kernel, flanneld looks up in etcd the MAC address of the flannel.1 device serving the subnet that matches the address, that is, the MAC address of the flannel.1 device on the host where the destination pod lives. Flanneld records every subnet's and MAC's information when it assigns IP subnets to nodes, so it can answer.
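
Flanneld's answer to the kernel's "L3 Miss" amounts to a lookup from node subnet to that node's flannel.1 MAC address. A minimal sketch, where the subnets are assumptions and the MAC addresses reuse the sample values from the neighbor table in this article:

```python
import ipaddress

# What flanneld knows from subnet leases in etcd: which flannel.1 MAC
# serves each node subnet. The subnets are illustrative assumptions.
subnet_to_mac = {
    ipaddress.ip_network("10.244.1.0/24"): "82:c4:0e:f2:00:6f",
    ipaddress.ip_network("10.244.2.0/24"): "42:6e:8b:9b:e2:73",
}

def answer_l3_miss(dst_ip):
    """Return the peer flannel.1 MAC for a destination Pod IP."""
    dst = ipaddress.ip_address(dst_ip)
    for subnet, mac in subnet_to_mac.items():
        if dst in subnet:
            return mac
    return None  # no lease covers this IP: destination unknown

print(answer_l3_miss("10.244.2.15"))  # 42:6e:8b:9b:e2:73
```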

#ip neigh | grep 172
dev flannel.1 lladdr 82:c4:0e:f2:00:6f PERMANENT
dev flannel.1 lladdr 42:6e:8b:9b:e2:73 PERMANENT

At this point the inner packet of the vxlan frame has been encapsulated, using the flannel.1 MAC address obtained above as the inner destination MAC.

The forwarding process of VXLAN mainly depends on the FDB (Forwarding Database): the VXLAN device looks up the corresponding VTEP IP address by the destination MAC address, then encapsulates the layer-2 frame and sends it to that VTEP.

#/sbin/bridge fdb show dev flannel.1
42:6e:8b:9b:e2:73 dst self permanent
ba:8b:ce:f3:b8:51 dst self permanent
42:6f:c7:06:3e:a0 dst self permanent
82:c4:0e:f2:00:6f dst self permanent

The kernel needs to check the FDB on the node to obtain the address of the node hosting the destination VTEP device of the inner packet. The MAC address of the destination device, 42:6e:8b:9b:e2:73, has already been found in the ARP table, and the IP address of the node corresponding to that MAC exists in the FDB. If the FDB had no such entry, the kernel would raise an "L2 Miss" event to the user-space flanneld program; on receiving this event, flanneld would query etcd, obtain the "Public IP" of the node corresponding to the VTEP device, and register the information in the FDB.
Once the kernel has obtained the destination node's IP address from the FDB and the MAC address via ARP, the outer vxlan encapsulation can be completed.
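
The outer encapsulation can be illustrated by packing the 8-byte VXLAN header itself. The header layout is from RFC 7348; flannel's vxlan backend uses VNI 1 and UDP port 8472 by default:

```python
import struct

def vxlan_header(vni):
    """Pack the 8-byte VXLAN header per RFC 7348:
    flags word with the I bit set, then the 24-bit VNI shifted left 8."""
    flags = 0x08000000          # "I" flag: the VNI field is valid
    return struct.pack("!II", flags, vni << 8)

hdr = vxlan_header(1)           # flannel's vxlan backend uses VNI 1
print(hdr.hex())                # 0800000000000100
# The full outer packet is then: outer Ethernet + outer IP + outer UDP
# (dport 8472 for flannel) + this header + the inner Ethernet frame.
```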

This can be analyzed in detail with a wireshark packet capture: when the eth0 network card of the destination node receives the vxlan packet, the kernel recognizes it as a vxlan packet, decapsulates it, and hands the inner packet to the node's flannel.1 device. In this way the packet travels from the sending node to the destination node, where the flannel.1 device receives the inner packet.

The packet arrives at the flannel.1 device, which looks up its own routing table and completes forwarding according to it: flannel.1 forwards the traffic to cni0.

Check the cni0 bridge information. The cni0 bridge enables communication by connecting the Pod's network card and the host side through veth pairs:

#brctl show
bridge name     bridge id               STP enabled     interfaces
cni0            8000.a656432b14cf       no              veth1f7db117
docker0         8000.024216a031b6       no

From the link information inside the Pod, the Pod's network card corresponds to link-netnsid 0.

On the host, the veth peer of the Pod's network card is vethf4995a29.

Therefore, the veth pair of this Pod attached to the cni0 bridge is eth0@if21 (inside the Pod) and vethf4995a29@if3 (on the host). This is how traffic is injected into the Pod's eth0 network card.

This article is reproduced from: https://mp.weixin.qq.com/s/68QMlmGVJTZO5nkrpc-uMg

Tags: Kubernetes

Posted by suneel on Thu, 05 May 2022 07:24:46 +0300