Container Orchestration System K8s: The flannel Network Model

Posted by 1874 on 2021-01-03

  In the previous post we covered installing the web UI on k8s and the related user authorization; for a recap see https://www.cnblogs.com/qiuhom-1874/p/14222930.html. Today we turn to networking on k8s.

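  Before talking about k8s networking, let us review how networking works in docker. Docker has four types of container networking. The first is the closed container: it has only a lo interface inside and cannot communicate with any external network. The second is the bridged container, which is the default type: the container's virtual NIC is attached to the docker0 bridge on the host (through a veth pair), which lets the container communicate with external networks. The third is the joined container, a shared-network container: when starting it, we specify which existing container's network namespace it should join. This model is still essentially a form of bridging; unlike a bridged container, several joined containers can share one network namespace. For example, if container a needs to talk to container b, container a can simply join container b's network namespace so that both live in the same one. The fourth is the open container, which directly shares the host's network namespace; as shown in the figure below.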

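  Note: as the description above suggests, docker's container network models are largely variations built around bridging; they differ mainly in which network namespace the container ends up using.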

  How does docker handle container communication across hosts?

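  As described above, for a docker container to communicate with the outside, it must first be attached to a bridge that can reach the outside. By default docker attaches the container's virtual NIC to docker0, a virtual bridge on the host that acts as a NAT bridge. Containers behind it can reach external networks because SNAT rules in the host's iptables rewrite the source address. By the same logic, for a container on docker0 to be reachable from outside, DNAT rules in the host's iptables are needed: external clients access the host's IP address, and DNAT forwards the request to the container, which then responds; as shown in the figure below.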

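  Note: containers on the same host can communicate directly over the docker0 bridge. Communication between containers on different hosts has to go through the hosts' iptables rules, which use SNAT to send a container's requests out and DNAT to forward external requests to the container that answers them.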
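
  The address rewriting described above can be modelled in a few lines of Python. This is only a toy sketch of the idea, not how iptables is implemented; every address and port in it (the host IP, the container IP, the published port 8080, the external peers) is a hypothetical value chosen for illustration.

```python
# Toy model of the SNAT/DNAT rewriting done on a docker host.
# All addresses and ports below are hypothetical illustration values.
HOST_IP = "172.16.11.4"
CONTAINER_IP = "172.17.0.2"


def snat(packet, host_ip=HOST_IP):
    """Outbound: replace the container's source address with the host's."""
    return {**packet, "src": host_ip}


def dnat(packet, rules):
    """Inbound: rewrite host_ip:port to the container it is published for."""
    target = rules.get((packet["dst"], packet["dport"]))
    if target is None:
        return packet  # no matching rule, leave the packet untouched
    container_ip, container_port = target
    return {**packet, "dst": container_ip, "dport": container_port}


# Host port 8080 is published to port 80 of the container.
dnat_rules = {(HOST_IP, 8080): (CONTAINER_IP, 80)}

outbound = snat({"src": CONTAINER_IP, "dst": "203.0.113.10", "dport": 443})
inbound = dnat({"src": "203.0.113.20", "dst": HOST_IP, "dport": 8080}, dnat_rules)

print(outbound)
print(inbound)
```

  The external peer only ever sees the host address, which is why two containers on different hosts can both be 172.17.0.2 without the peer noticing.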

  Networks on k8s

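  As we know, there are three kinds of network on k8s. The first is the host (node) network, which is outside the scope of container orchestration and is maintained by the cluster administrator. The second is the service network, also called the cluster network. It does not exist on any network interface; it is realized by the iptables or ipvs rules that kube-proxy generates on each node. It mainly provides load balancing for access to pods, and it is the network that services use to reach each other. The third is the pod network, used for pod-to-pod communication; as shown in the figure below.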

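  Note: pod-to-pod communication on k8s does not rely on SNAT or DNAT in iptables, nor does it go through the docker0 bridge; it is provided by an external network plugin. There are many such plugins, the best known being flannel and calico. Both let pods communicate without SNAT or DNAT in iptables; the difference is that flannel does not support network policy while calico does.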

  How does flannel implement pod-to-pod communication?

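  On k8s, pod IP addresses depend on the network plugin in use. With flannel, we specify the pod network (10.244.0.0/16) when initializing the cluster; with calico, we specify 192.168.0.0/16. The cluster works once the pod network matches the plugin because 10.244.0.0/16 is the range flannel uses by default and 192.168.0.0/16 is the range calico uses by default. Both defaults can be changed.

  Taking flannel as the example, how does it let pods talk to each other directly? In a plain docker environment, containers on different nodes may end up with the same address, which is why cross-node container traffic has to rely on SNAT or DNAT. For a k8s network plugin to offer direct pod-to-pod communication, it first has to guarantee that pod IPs never conflict. flannel solves this by dividing the 10.244.0.0/16 network into multiple subnets and giving each node its own subnet, so pods on different nodes can never receive the same IP. For example, the /16 is divided into 256 /24 subnets: pods on the first node use addresses from 10.244.0.0/24, pods on the second node use 10.244.1.0/24, and so on for the third and fourth nodes. Strictly speaking, this subnet allocation is done by flannel itself; vxlan is the tunnelling mechanism flannel uses by default to carry traffic between those subnets.

  With address conflicts out of the way, how do pods actually reach each other? Option one follows the docker approach: bridge each container's virtual NIC directly onto the host's physical NIC, so every pod communicates through the host NIC. The drawback is that as the number of pods grows, the ARP broadcasts on the host network can multiply into an ARP flood, congesting the network until it is unusable. Some other mechanism is then needed, for example assigning each node's subnet to its own VLAN and exchanging data between nodes over VLANs. But then the VLANs have to be managed by hand, which is clearly not what we want.

  Option two: instead of bridging the container's virtual NIC onto the host NIC, attach it to a virtual bridge. The hosts are then connected to one another by a tunnel, and matching routes are generated. Pods on the same node communicate directly over the virtual bridge. Traffic for a pod on another node is routed from the virtual bridge to the tunnel interface, encapsulated with the tunnel protocol, and sent out through the node's physical NIC. The receiving host strips the encapsulation layer by layer and delivers the packet to the target pod, so the two pods communicate directly. The pods are unaware of all this, because the packet that finally reaches a pod always has pod IPs as both its source and destination.

  This raises a question: how does the tunnel interface know the address of the peer's virtual bridge, and which host a given packet should be sent to? That depends on the network plugin. flannel keeps this network information in a data store, for example etcd. Once flannel is installed on k8s, it runs a daemon on every host and creates a cni0 interface on each node; this is the virtual bridge mentioned above. It also creates a flannel.1 interface, which is the tunnel interface. flannel then divides the 10.244.0.0/16 network into subnets and maps each subnet to the IP address and MAC address information of the node that owns it. For example, node 1 has subnet 10.244.0.0/24, with IP address 192.168.0.41 and MAC address xxx on its physical NIC; node 2's subnet, node IP address and MAC address are mapped in the same way, and all of this is saved in etcd. When a pod on node 1 needs to talk to a pod on node 2, this stored information is used to determine the destination node, and the packet is encapsulated accordingly.

  All of this boils down to one sentence: on k8s, the flannel network plugin gives each node its own pod subnet, which solves pod IP conflicts, and uses vxlan to build tunnels between nodes, so that pods on k8s can communicate with each other directly; as shown in the figure below.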
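
  The subnet arithmetic in this section can be checked with Python's standard `ipaddress` module. This is a minimal sketch that only illustrates how one /16 splits into per-node /24 subnets; it is not flannel's allocation code, and the strictly sequential assignment to nodes is an assumption made for the example (the node names are the ones used in this post's cluster).

```python
import ipaddress

# Default flannel pod network, as given in the text.
pod_network = ipaddress.ip_network("10.244.0.0/16")

# Split the /16 into /24 subnets: one candidate subnet per node.
node_subnets = list(pod_network.subnets(new_prefix=24))

# Assumption for illustration: subnets are handed out in order.
nodes = ["master01", "node01", "node02", "node03"]
assignment = dict(zip(nodes, node_subnets))

for name, subnet in assignment.items():
    print(f"{name}: {subnet}")

print(f"subnets available: {len(node_subnets)}")
```

  Because the /24 subnets are disjoint slices of the same /16, two pods on different nodes can never be given the same address.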

  Diagram: packet flow for cross-node pod communication on k8s with the flannel network plugin

 

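  Note: put simply, vxlan is a layer-2 tunnel that uses the physical NIC to carry the pod network, so pods can communicate through the tunnel without any NAT in between. vxlan is widely used elsewhere too, for example in openstack self-service networks and for container-to-container communication in docker swarm.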

  Test: run pods on k8s, then ping a pod on another node from inside a pod and observe how the traffic actually flows

[root@master01 ~]# kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
myapp-dep-5bc4d8cc74-2kjf5   1/1     Running   0          20h   10.244.2.9   node02.k8s.org   <none>           <none>
myapp-dep-5bc4d8cc74-5k8rc   1/1     Running   0          20h   10.244.1.3   node01.k8s.org   <none>           <none>
myapp-dep-5bc4d8cc74-w6gdz   1/1     Running   0          20h   10.244.3.9   node03.k8s.org   <none>           <none>
[root@master01 ~]# 

  Note: three pods are running in the default namespace, one scheduled onto each of the three nodes.

  Enter one of the pods and ping another pod from inside it

[root@master01 ~]# kubectl get pods -o wide                                
NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
myapp-dep-5bc4d8cc74-2kjf5   1/1     Running   0          20h   10.244.2.9   node02.k8s.org   <none>           <none>
myapp-dep-5bc4d8cc74-5k8rc   1/1     Running   0          20h   10.244.1.3   node01.k8s.org   <none>           <none>
myapp-dep-5bc4d8cc74-w6gdz   1/1     Running   0          20h   10.244.3.9   node03.k8s.org   <none>           <none>
[root@master01 ~]# kubectl exec -it myapp-dep-5bc4d8cc74-2kjf5 -- /bin/sh
/ # ip a 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP 
    link/ether 5a:2a:ca:ec:83:65 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.9/24 brd 10.244.2.255 scope global eth0
       valid_lft forever preferred_lft forever
/ # ping 10.244.1.3
PING 10.244.1.3 (10.244.1.3): 56 data bytes
64 bytes from 10.244.1.3: seq=0 ttl=62 time=9.944 ms
64 bytes from 10.244.1.3: seq=1 ttl=62 time=1.974 ms
64 bytes from 10.244.1.3: seq=2 ttl=62 time=2.115 ms

  Log in to node01 and check its network interfaces

[root@node01 ~]# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:0c:29:01:21:41 brd ff:ff:ff:ff:ff:ff
    inet 172.16.11.4/24 brd 172.16.11.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe01:2141/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
    link/ether 02:42:e1:a6:d7:1a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN 
    link/ether 76:12:1a:11:62:86 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 brd 10.244.1.0 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::7412:1aff:fe11:6286/64 scope link 
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP qlen 1000
    link/ether 52:6f:30:31:77:86 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 brd 10.244.1.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::506f:30ff:fe31:7786/64 scope link 
       valid_lft forever preferred_lft forever
7: vethce8e4bf2@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP 
    link/ether 9a:22:8e:d7:78:33 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::9822:8eff:fed7:7833/64 scope link 
       valid_lft forever preferred_lft forever
[root@node01 ~]# 

  Note: the node has both a cni0 interface and a flannel.1 interface.
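
  The interface listing also shows an MTU of 1450 on flannel.1 and cni0, against 1500 on eth0. A short calculation accounts for the 50-byte difference, assuming VXLAN over IPv4 with no VLAN tags; the header sizes are the standard ones for those protocols.

```python
# Per-packet overhead of VXLAN encapsulation (IPv4 underlay, no VLAN tag).
OUTER_IPV4 = 20      # outer IP header between the two nodes
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header carrying the network identifier
INNER_ETHERNET = 14  # Ethernet header of the encapsulated pod frame

physical_mtu = 1500  # eth0 MTU from the `ip a` output

overhead = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
overlay_mtu = physical_mtu - overhead

print(f"overhead: {overhead} bytes, overlay MTU: {overlay_mtu}")
```

  The result matches the mtu 1450 reported for flannel.1, cni0 and the pod's eth0.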

  Capture packets on cni0 on node01

[root@node01 ~]# tcpdump -i cni0 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:55:18.469861 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 225, length 64
18:55:18.470073 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 225, length 64
18:55:19.471439 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 226, length 64
18:55:19.471575 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 226, length 64
18:55:20.472470 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 227, length 64
18:55:20.472608 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 227, length 64
18:55:21.473084 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 228, length 64
18:55:21.473223 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 228, length 64
18:55:22.474856 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 229, length 64
18:55:22.474922 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 229, length 64
18:55:23.475499 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 230, length 64
18:55:23.475685 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 230, length 64
18:55:24.476694 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 231, length 64
18:55:24.476854 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 231, length 64
^C
14 packets captured
14 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# 

  Note: on cni0 we can see 10.244.2.9 pinging 10.244.1.3, which shows that pod-to-pod traffic passes through the cni0 interface.

  Capture ICMP packets on the flannel.1 interface on node01

[root@node01 ~]# tcpdump -i flannel.1 -nn icmp     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
18:57:03.607093 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 330, length 64
18:57:03.607273 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 330, length 64
18:57:04.607604 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 331, length 64
18:57:04.607819 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 331, length 64
18:57:05.608172 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 332, length 64
18:57:05.608369 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 332, length 64
18:57:06.609825 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 333, length 64
18:57:06.610106 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 333, length 64
18:57:07.610310 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 334, length 64
18:57:07.612417 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 334, length 64
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# 

  Note: capturing ICMP on the flannel.1 interface of node01 also shows 10.244.2.9 pinging 10.244.1.3, which means the packets pass through the flannel.1 interface.

  Capture ICMP packets on node01's physical interface: can we see them there?

[root@node01 ~]# tcpdump -i eth0 -nn icmp         
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

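  Note: capturing ICMP on node01's physical interface yields nothing. The reason is that once the packets have been encapsulated by the tunnel interface, they no longer appear as ICMP packets on the physical interface.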

  Capture packets to or from node02's IP address on node01's physical interface and see what shows up

[root@node01 ~]# tcpdump -i eth0 -nn host 172.16.11.5
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:02:36.139552 IP 172.16.11.5.46521 > 172.16.11.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 662, length 64
19:02:36.139935 IP 172.16.11.4.57232 > 172.16.11.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 662, length 64
19:02:37.143339 IP 172.16.11.5.46521 > 172.16.11.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 663, length 64
19:02:37.143587 IP 172.16.11.4.57232 > 172.16.11.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 663, length 64
19:02:38.144569 IP 172.16.11.5.46521 > 172.16.11.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 664, length 64
19:02:38.145276 IP 172.16.11.4.57232 > 172.16.11.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 664, length 64
19:02:39.144889 IP 172.16.11.5.46521 > 172.16.11.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 665, length 64
19:02:39.145126 IP 172.16.11.4.57232 > 172.16.11.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 665, length 64
19:02:40.145727 IP 172.16.11.5.46521 > 172.16.11.4.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 13568, seq 666, length 64
19:02:40.145976 IP 172.16.11.4.57232 > 172.16.11.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 13568, seq 666, length 64
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# 

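  Note: the capture shows that node01's physical interface receives packets from node02 in which the outer header carries the two node IP addresses and the payload carries the pod IPs. (tcpdump labels these packets OTV because of how it decodes UDP port 8472; that port is the Linux kernel's default for vxlan devices, and instance 1 is the VXLAN network identifier.) This verifies that pod-to-pod communication on k8s really involves no NAT and instead goes directly through the vxlan tunnel.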
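
  The fields tcpdump prints for each encapsulated packet (flags [I] (0x08), instance 1) correspond to the 8-byte VXLAN header defined in RFC 7348. The sketch below builds and parses such a header with the standard `struct` module; the function names are made up for this example and it handles only the header, not a full packet.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag shown by tcpdump


def build_vxlan_header(vni):
    """Pack flags (8 bits), reserved (24), VNI (24), reserved (8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vxlan_header(header):
    """Return the flags byte and the VNI from the first 8 bytes."""
    word1, word2 = struct.unpack("!II", header[:8])
    return {"flags": word1 >> 24, "vni": word2 >> 8}


header = build_vxlan_header(1)  # VNI 1, as in "instance 1" above
print(header.hex())
print(parse_vxlan_header(header))
```

  On the wire this header sits after the outer UDP header (destination port 8472 in the capture) and before the pod's original Ethernet frame.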

  Switch flannel to direct routing mode, so that pod-to-pod traffic bypasses the flannel.1 interface and is sent straight to the physical interface

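  Note: in the Backend section of flannel's configuration, add "DirectRouting": true. Because this adds a second entry to the section, the Type line above it must end with a comma. Save and exit once the change is made.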
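
  For reference, the edited network configuration would look roughly like the JSON printed by the sketch below. It assumes the configuration is the `net-conf.json` entry of the flannel ConfigMap, as in the upstream kube-flannel manifest, and that the Network value is the 10.244.0.0/16 pod network used in this post; your manifest may contain additional keys.

```python
import json

# Assumed shape of flannel's net-conf.json after the change.
net_conf = {
    "Network": "10.244.0.0/16",
    "Backend": {
        "Type": "vxlan",
        "DirectRouting": True,  # serialized as JSON true
    },
}

rendered = json.dumps(net_conf, indent=2)
print(rendered)

# Round-trip check: the text is valid JSON, so no comma is missing.
assert json.loads(rendered) == net_conf
```

  Generating or validating the JSON this way catches the missing-comma mistake before the flannel pods are restarted.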

  Delete the existing flannel pods so that they are recreated automatically and pick up the new configuration

  Check the node's routing table before deleting

[root@master01 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.11.2     0.0.0.0         UG    0      0        0 eth0
0.0.0.0         172.16.11.2     0.0.0.0         UG    100    0        0 eth0
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.3.0      10.244.3.0      255.255.255.0   UG    0      0        0 flannel.1
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
172.16.11.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
[root@master01 ~]# 

  Note: before the existing flannel pods are deleted, the node's routes for 10.244.1.0/24, 10.244.2.0/24 and 10.244.3.0/24 all point to the flannel.1 interface.

  Delete the existing flannel pods

[root@master01 ~]# kubectl get pods -n kube-system --show-labels
NAME                                       READY   STATUS    RESTARTS   AGE   LABELS
coredns-7f89b7bc75-9s8wr                   1/1     Running   0          11d   k8s-app=kube-dns,pod-template-hash=7f89b7bc75
coredns-7f89b7bc75-ck8gl                   1/1     Running   0          11d   k8s-app=kube-dns,pod-template-hash=7f89b7bc75
etcd-master01.k8s.org                      1/1     Running   1          11d   component=etcd,tier=control-plane
kube-apiserver-master01.k8s.org            1/1     Running   1          11d   component=kube-apiserver,tier=control-plane
kube-controller-manager-master01.k8s.org   1/1     Running   3          11d   component=kube-controller-manager,tier=control-plane
kube-flannel-ds-2z7sk                      1/1     Running   2          11d   app=flannel,controller-revision-hash=64465d999,pod-template-generation=1,tier=node
kube-flannel-ds-57fng                      1/1     Running   0          11d   app=flannel,controller-revision-hash=64465d999,pod-template-generation=1,tier=node
kube-flannel-ds-vt2jv                      1/1     Running   0          11d   app=flannel,controller-revision-hash=64465d999,pod-template-generation=1,tier=node
kube-flannel-ds-wk52c                      1/1     Running   2          11d   app=flannel,controller-revision-hash=64465d999,pod-template-generation=1,tier=node
kube-proxy-2hcd9                           1/1     Running   0          11d   controller-revision-hash=c449f5b75,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-m9s45                           1/1     Running   0          11d   controller-revision-hash=c449f5b75,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-mh9nx                           1/1     Running   0          11d   controller-revision-hash=c449f5b75,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-t57x8                           1/1     Running   0          11d   controller-revision-hash=c449f5b75,k8s-app=kube-proxy,pod-template-generation=1
kube-scheduler-master01.k8s.org            1/1     Running   3          11d   component=kube-scheduler,tier=control-plane
[root@master01 ~]# kubectl delete pod -l app=flannel -n kube-system
pod "kube-flannel-ds-2z7sk" deleted
pod "kube-flannel-ds-57fng" deleted
pod "kube-flannel-ds-vt2jv" deleted
pod "kube-flannel-ds-wk52c" deleted
[root@master01 ~]# kubectl get pods -n kube-system 
NAME                                       READY   STATUS    RESTARTS   AGE
coredns-7f89b7bc75-9s8wr                   1/1     Running   0          11d
coredns-7f89b7bc75-ck8gl                   1/1     Running   0          11d
etcd-master01.k8s.org                      1/1     Running   1          11d
kube-apiserver-master01.k8s.org            1/1     Running   1          11d
kube-controller-manager-master01.k8s.org   1/1     Running   3          11d
kube-flannel-ds-9ww8d                      1/1     Running   0          39s
kube-flannel-ds-gd45l                      1/1     Running   0          79s
kube-flannel-ds-ps6c5                      1/1     Running   0          27s
kube-flannel-ds-x642z                      1/1     Running   0          70s
kube-proxy-2hcd9                           1/1     Running   0          11d
kube-proxy-m9s45                           1/1     Running   0          11d
kube-proxy-mh9nx                           1/1     Running   0          11d
kube-proxy-t57x8                           1/1     Running   0          11d
kube-scheduler-master01.k8s.org            1/1     Running   3          11d
[root@master01 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.11.2     0.0.0.0         UG    0      0        0 eth0
0.0.0.0         172.16.11.2     0.0.0.0         UG    100    0        0 eth0
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.1.0      172.16.11.4     255.255.255.0   UG    0      0        0 eth0
10.244.2.0      172.16.11.5     255.255.255.0   UG    0      0        0 eth0
10.244.3.0      172.16.11.6     255.255.255.0   UG    0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
172.16.11.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
[root@master01 ~]# 

  Note: after the flannel pods are recreated the routing table has changed; no route goes through the flannel.1 interface any more.
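
  To see what the change means for a single packet, the two routing tables can be replayed with a longest-prefix-match lookup. This is a simplified sketch of how a route is selected (it ignores metrics and flags); the entries are copied from the two `route -n` outputs on master01, and the `lookup` helper is written just for this example.

```python
import ipaddress

# (destination, gateway, interface) before the change (vxlan only).
ROUTES_VXLAN = [
    ("0.0.0.0/0", "172.16.11.2", "eth0"),
    ("10.244.0.0/24", "0.0.0.0", "cni0"),
    ("10.244.1.0/24", "10.244.1.0", "flannel.1"),
    ("10.244.2.0/24", "10.244.2.0", "flannel.1"),
    ("10.244.3.0/24", "10.244.3.0", "flannel.1"),
    ("172.16.11.0/24", "0.0.0.0", "eth0"),
]

# The same table after DirectRouting is enabled.
ROUTES_DIRECT = [
    ("0.0.0.0/0", "172.16.11.2", "eth0"),
    ("10.244.0.0/24", "0.0.0.0", "cni0"),
    ("10.244.1.0/24", "172.16.11.4", "eth0"),
    ("10.244.2.0/24", "172.16.11.5", "eth0"),
    ("10.244.3.0/24", "172.16.11.6", "eth0"),
    ("172.16.11.0/24", "0.0.0.0", "eth0"),
]


def lookup(routes, destination):
    """Return (network, gateway, interface) of the most specific match."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), gateway, iface)
        for net, gateway, iface in routes
        if dest in ipaddress.ip_network(net)
    ]
    return max(matches, key=lambda route: route[0].prefixlen)


for name, table in (("vxlan", ROUTES_VXLAN), ("direct", ROUTES_DIRECT)):
    network, gateway, iface = lookup(table, "10.244.1.3")
    print(f"{name}: 10.244.1.3 via {gateway} dev {iface} ({network})")
```

  Before the change the pod address resolves to the flannel.1 tunnel interface; afterwards it resolves to node01's own address (172.16.11.4) on eth0, so no encapsulation is involved.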

  Verify: from inside one pod, ping another pod's IP and watch the path the packets take

  Capture packets on node01

[root@node01 ~]# tcpdump -i cni0 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:45:55.693118 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 32, length 64
19:45:55.693285 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 32, length 64
19:45:56.693771 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 33, length 64
19:45:56.693941 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 33, length 64
19:45:57.695549 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 34, length 64
19:45:57.695905 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 34, length 64
19:45:58.696517 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 35, length 64
19:45:58.697035 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 35, length 64
^C
8 packets captured
8 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# tcpdump -i flannel.1 -nn icmp        
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# tcpdump -i eth0 -nn icmp         
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:46:24.737002 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 61, length 64
19:46:24.737350 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 61, length 64
19:46:25.737664 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 62, length 64
19:46:25.737987 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 62, length 64
19:46:26.739459 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 63, length 64
19:46:26.739705 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 63, length 64
19:46:27.739800 IP 10.244.2.9 > 10.244.1.3: ICMP echo request, id 18944, seq 64, length 64
19:46:27.740026 IP 10.244.1.3 > 10.244.2.9: ICMP echo reply, id 18944, seq 64, length 64
^C
8 packets captured
8 packets received by filter
0 packets dropped by kernel
[root@node01 ~]# 

  Note: ICMP packets can no longer be captured on node01's flannel.1 interface, while they can be captured on the physical interface, and the capture again shows 10.244.2.9 pinging 10.244.1.3.
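
  Direct routes work here because all the nodes sit on the same 172.16.11.0/24 network. As the flannel documentation describes the DirectRouting option, direct routes are used only between hosts on the same subnet, and vxlan is still used for hosts on different subnets. The check below sketches that decision; `choose_path` is a hypothetical helper for illustration, not flannel code, and 172.16.12.5 is an invented address standing in for a node on another subnet.

```python
import ipaddress


def choose_path(local_interface, remote_node_ip):
    """Pick the forwarding mode for traffic to a remote node."""
    local = ipaddress.ip_interface(local_interface)
    if ipaddress.ip_address(remote_node_ip) in local.network:
        return "direct route"
    return "vxlan tunnel"


# node01 (172.16.11.4/24) to node02 from this post: same subnet.
print(choose_path("172.16.11.4/24", "172.16.11.5"))

# Hypothetical node on another subnet: would still be tunnelled.
print(choose_path("172.16.11.4/24", "172.16.12.5"))
```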
