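By default, every pod is assigned a private IP, and pods inside the VPC talk to each other directly using these private IPs. When a pod communicates with an address outside the VPC's CIDR blocks, the VPC CNI plugin translates the source IP to the primary IP of the node's primary network interface (SNAT).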
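The rule above can be sketched as a small decision function. This is only a simplified model of the default behavior, not the plugin's code: the function name and parameters are made up for illustration, the VPC CIDR 192.168.0.0/16 and the node IP 192.168.22.226 are taken from the tests and iptables rules in these notes, and details such as excluded CIDRs are ignored.

```python
import ipaddress


def source_ip_seen_by_destination(pod_ip, dst_ip, node_primary_ip,
                                  vpc_cidrs=("192.168.0.0/16",),
                                  external_snat=False):
    """Return the source IP the destination is expected to see.

    Simplified model (an assumption, not the CNI implementation):
    traffic to a destination inside the VPC CIDRs keeps the pod IP;
    anything else is SNATed to the node's primary IP unless external
    SNAT is enabled.
    """
    dst = ipaddress.ip_address(dst_ip)
    in_vpc = any(dst in ipaddress.ip_network(cidr) for cidr in vpc_cidrs)
    if in_vpc or external_snat:
        return pod_ip
    return node_primary_ip


if __name__ == "__main__":
    node = "192.168.22.226"
    # Pod -> EC2 instance in another VPC: expected to appear as the node IP.
    print(source_ip_seen_by_destination("192.168.26.75", "172.31.18.4", node))
    # Pod -> pod inside the VPC: expected to keep the pod IP.
    print(source_ip_seen_by_destination("192.168.26.75", "192.168.20.98", node))
```

With `external_snat=True` the same call for 172.31.18.4 returns the pod IP, which matches the capture taken on the EC2 instance later in these notes.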
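Given the behavior above, the following tests were run.
(1) pod -> ec2
Start nginx on the EC2 instance (172.31.18.4). The pod runs on the node (192.168.22.226). Access the EC2 instance from the pod (192.168.20.98).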
192.168.22.226 - - [14/Mar/2023:11:36:35 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"
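The access log shows that a request from the pod reaches the EC2 instance with the node's IP as its source.
(2) pod -> pod
Access the pod (192.168.20.98) from the pod (192.168.26.75).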
192.168.26.75 - - [14/Mar/2023:11:42:56 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"
The access log shows that pod-to-pod traffic uses the pod's private IP directly.
(3) ec2 -> pod
Access the pod (192.168.20.98) from the EC2 instance (172.31.18.4).
172.31.18.4 - - [14/Mar/2023:11:39:07 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.79.1" "-"
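The EC2 instance receives the response. The pod's access log shows the EC2 instance's IP (172.31.18.4) as the client, so the EC2 instance reaches the pod directly at its private IP.
(1) pod -> ec2
Start nginx on the EC2 instance (172.31.18.4). The pod runs on the node (192.168.22.226). Access the EC2 instance from the pod (192.168.26.75).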
192.168.22.226 - - [14/Mar/2023:12:00:21 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"
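Again, the request from the pod reaches the EC2 instance with the node's IP as its source.
(2) pod -> pod
Access the pod on a non-primary ENI (192.168.26.159) from the pod (192.168.26.75).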
192.168.26.75 - - [14/Mar/2023:11:56:22 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"
Again, pod-to-pod traffic uses the pod's private IP directly.
(3) ec2 -> pod
Access the pod (192.168.26.159) from the EC2 instance (172.31.18.4).
172.31.18.4 - - [14/Mar/2023:12:00:59 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.79.1" "-"
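The EC2 instance receives the response. The pod's access log shows the EC2 instance's IP (172.31.18.4) as the client, so the EC2 instance reaches the pod directly at its private IP.
The results are no different from the first round. This is odd: in theory, when the EC2 instance accesses a pod on a non-primary ENI, the request and the reply should end up with mismatched IPs.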
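To compare the access-log lines from the two rounds more systematically, the client IP in each line can be classified against the VPC CIDR and the node IP. A minimal sketch, assuming the default nginx log format (client IP is the first field) and using the VPC CIDR 192.168.0.0/16 and node IP 192.168.22.226 seen elsewhere in these notes; the function name and labels are illustrative.

```python
import ipaddress

# Values taken from this test environment; adjust for other clusters.
VPC_CIDR = ipaddress.ip_network("192.168.0.0/16")
NODE_PRIMARY_IP = ipaddress.ip_address("192.168.22.226")


def classify_client(log_line):
    """Classify the client IP (first field) of an nginx access-log line."""
    client = ipaddress.ip_address(log_line.split()[0])
    if client == NODE_PRIMARY_IP:
        return "node primary IP"
    if client in VPC_CIDR:
        return "in-VPC private IP"
    return "outside the VPC CIDR"


if __name__ == "__main__":
    samples = [
        '192.168.22.226 - - [14/Mar/2023:11:36:35 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"',
        '192.168.26.75 - - [14/Mar/2023:11:42:56 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.74.0" "-"',
        '172.31.18.4 - - [14/Mar/2023:11:39:07 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/7.79.1" "-"',
    ]
    for line in samples:
        print(line.split()[0], "->", classify_client(line))
```

A "node primary IP" client on the EC2 side indicates the request was SNATed, while an "in-VPC private IP" client indicates the pod IP was preserved.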
Pod accessing the EC2 instance, captured on the node: the traffic actually goes out with the primary ENI's IP.
$ sudo tcpdump -n \(host 172.31.18.4 or host 192.168.26.75 \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:39:43.748042 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [S], seq 2531568171, win 62727, options [mss 8961,sackOK,TS val 500176034 ecr 0,nop,wscale 7], length 0
12:39:43.748344 IP 172.31.18.4.8090 > 192.168.22.226.63690: Flags [S.], seq 1461875833, ack 2531568172, win 65160, options [mss 1460,sackOK,TS val 2880838298 ecr 500176034,nop,wscale 7], length 0
12:39:43.748421 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [.], ack 1, win 491, options [nop,nop,TS val 500176035 ecr 2880838298], length 0
12:39:43.748463 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [P.], seq 1:81, ack 1, win 491, options [nop,nop,TS val 500176035 ecr 2880838298], length 80
12:39:43.748678 IP 172.31.18.4.8090 > 192.168.22.226.63690: Flags [.], ack 81, win 509, options [nop,nop,TS val 2880838298 ecr 500176035], length 0
12:39:43.748869 IP 172.31.18.4.8090 > 192.168.22.226.63690: Flags [P.], seq 1:239, ack 81, win 509, options [nop,nop,TS val 2880838298 ecr 500176035], length 238
12:39:43.748887 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [.], ack 239, win 490, options [nop,nop,TS val 500176035 ecr 2880838298], length 0
12:39:43.748909 IP 172.31.18.4.8090 > 192.168.22.226.63690: Flags [P.], seq 239:854, ack 81, win 509, options [nop,nop,TS val 2880838298 ecr 500176035], length 615
12:39:43.748930 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [.], ack 854, win 486, options [nop,nop,TS val 500176035 ecr 2880838298], length 0
12:39:43.749081 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [F.], seq 81, ack 854, win 486, options [nop,nop,TS val 500176036 ecr 2880838298], length 0
12:39:43.749318 IP 172.31.18.4.8090 > 192.168.22.226.63690: Flags [F.], seq 854, ack 82, win 509, options [nop,nop,TS val 2880838299 ecr 500176036], length 0
12:39:43.749347 IP 192.168.22.226.63690 > 172.31.18.4.8090: Flags [.], ack 855, win 486, options [nop,nop,TS val 500176036 ecr 2880838299], length 0
EC2 instance accessing the pod: the pod and the EC2 instance communicate directly.
sudo tcpdump -n \( host 192.168.26.75 or host 192.168.22.226 \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:37:53.990117 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [S], seq 111765395, win 62727, options [mss 8961,sackOK,TS val 1045667994 ecr 0,nop,wscale 7], length 0
12:37:53.990429 IP 192.168.26.75.http > 172.31.18.4.50666: Flags [S.], seq 616787041, ack 111765396, win 62643, options [mss 8961,sackOK,TS val 500066277 ecr 1045667994,nop,wscale 7], length 0
12:37:53.990450 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [.], ack 1, win 491, options [nop,nop,TS val 1045667994 ecr 500066277], length 0
12:37:53.990486 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [P.], seq 1:78, ack 1, win 491, options [nop,nop,TS val 1045667994 ecr 500066277], length 77: HTTP: GET / HTTP/1.1
12:37:53.990649 IP 192.168.26.75.http > 172.31.18.4.50666: Flags [.], ack 78, win 489, options [nop,nop,TS val 500066277 ecr 1045667994], length 0
12:37:53.990745 IP 192.168.26.75.http > 172.31.18.4.50666: Flags [P.], seq 1:239, ack 78, win 489, options [nop,nop,TS val 500066277 ecr 1045667994], length 238: HTTP: HTTP/1.1 200 OK
12:37:53.990760 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [.], ack 239, win 490, options [nop,nop,TS val 1045667994 ecr 500066277], length 0
12:37:53.990784 IP 192.168.26.75.http > 172.31.18.4.50666: Flags [P.], seq 239:854, ack 78, win 489, options [nop,nop,TS val 500066277 ecr 1045667994], length 615: HTTP
12:37:53.990789 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [.], ack 854, win 486, options [nop,nop,TS val 1045667994 ecr 500066277], length 0
12:37:53.991040 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [F.], seq 78, ack 854, win 486, options [nop,nop,TS val 1045667995 ecr 500066277], length 0
12:37:53.991210 IP 192.168.26.75.http > 172.31.18.4.50666: Flags [F.], seq 854, ack 79, win 489, options [nop,nop,TS val 500066278 ecr 1045667995], length 0
12:37:53.991221 IP 172.31.18.4.50666 > 192.168.26.75.http: Flags [.], ack 855, win 486, options [nop,nop,TS val 1045667995 ecr 500066278], length 0
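Reading the source and destination out of long tcpdump lines by eye is error-prone, so here is a rough sketch that extracts the endpoints from `tcpdump -n` TCP lines like the ones above. The regular expression and function names are illustrative and only cover the line shape seen in these captures (IPv4, TCP, `Flags [...]`); the port is kept as a string because tcpdump may print a service name such as `http`.

```python
import re

# Matches e.g. "12:39:43.748042 IP 1.2.3.4.63690 > 5.6.7.8.8090: Flags [S], ..."
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
)


def split_endpoint(endpoint):
    """Split 'a.b.c.d.port' into ('a.b.c.d', 'port')."""
    host, _, port = endpoint.rpartition(".")
    return host, port


def parse_line(line):
    """Return a dict with src/dst IP, ports and flags, or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    src_ip, src_port = split_endpoint(match["src"])
    dst_ip, dst_port = split_endpoint(match["dst"])
    return {
        "src_ip": src_ip, "src_port": src_port,
        "dst_ip": dst_ip, "dst_port": dst_port,
        "flags": match["flags"],
    }


def conversation_pairs(lines):
    """Return the sorted set of (src_ip, dst_ip) pairs seen in the capture."""
    parsed = (parse_line(line) for line in lines)
    return sorted({(p["src_ip"], p["dst_ip"]) for p in parsed if p})
```

Feeding a whole capture to `conversation_pairs` makes it easy to see whether the node IP or the pod IP appears on the wire.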
About the AWS_VPC_K8S_CNI_EXTERNALSNAT parameter
Specifies whether an external NAT gateway should be used to provide SNAT of secondary ENI IP addresses. If set to true, the SNAT iptables rule and off-VPC IP rule are not applied, and these rules are removed if they have already been applied. Disable SNAT if you need to allow inbound communication to your pods from external VPNs, direct connections, and external VPCs, and your pods do not need to access the Internet directly via an Internet Gateway. However, your nodes must be running in a private subnet and connected to the internet through an AWS NAT Gateway or another external NAT device.
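Enable this parameter by setting AWS_VPC_K8S_CNI_EXTERNALSNAT to true on the aws-node DaemonSet (for example with kubectl set env daemonset -n kube-system aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true).
Now when the pod (192.168.26.75) accesses the EC2 instance (172.31.18.4), no packets are captured on the node: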
sudo tcpdump -n \( host 192.168.26.75 or host 172.31.18.4 \)
Capturing on the EC2 instance instead, the source IP is now the pod's IP:
$ sudo tcpdump -n \( host 192.168.26.75 or host 192.168.22.226 \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:56:00.049250 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [S], seq 3156073624, win 62727, options [mss 8961,sackOK,TS val 501152336 ecr 0,nop,wscale 7], length 0
12:56:00.049345 IP 172.31.18.4.8090 > 192.168.26.75.57494: Flags [S.], seq 807675840, ack 3156073625, win 65160, options [mss 1460,sackOK,TS val 2511006629 ecr 501152336,nop,wscale 7], length 0
12:56:00.049558 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [.], ack 1, win 491, options [nop,nop,TS val 501152336 ecr 2511006629], length 0
12:56:00.049602 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [P.], seq 1:81, ack 1, win 491, options [nop,nop,TS val 501152336 ecr 2511006629], length 80
12:56:00.049617 IP 172.31.18.4.8090 > 192.168.26.75.57494: Flags [.], ack 81, win 509, options [nop,nop,TS val 2511006629 ecr 501152336], length 0
12:56:00.049716 IP 172.31.18.4.8090 > 192.168.26.75.57494: Flags [P.], seq 1:239, ack 81, win 509, options [nop,nop,TS val 2511006629 ecr 501152336], length 238
12:56:00.049758 IP 172.31.18.4.8090 > 192.168.26.75.57494: Flags [P.], seq 239:854, ack 81, win 509, options [nop,nop,TS val 2511006629 ecr 501152336], length 615
12:56:00.049869 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [.], ack 239, win 490, options [nop,nop,TS val 501152336 ecr 2511006629], length 0
12:56:00.049904 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [.], ack 854, win 486, options [nop,nop,TS val 501152336 ecr 2511006629], length 0
12:56:00.050044 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [F.], seq 81, ack 854, win 486, options [nop,nop,TS val 501152336 ecr 2511006629], length 0
12:56:00.050085 IP 172.31.18.4.8090 > 192.168.26.75.57494: Flags [F.], seq 854, ack 82, win 509, options [nop,nop,TS val 2511006630 ecr 501152336], length 0
12:56:00.050243 IP 192.168.26.75.57494 > 172.31.18.4.8090: Flags [.], ack 855, win 486, options [nop,nop,TS val 501152337 ecr 2511006630], length 0
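Because the pod's traffic is no longer SNATed to the node's IP, the pod can no longer reach the public internet. The following command hangs: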
curl cip.cc
Inspecting the node's iptables rules: with AWS_VPC_K8S_CNI_EXTERNALSNAT set to false the following rules are present, and after it is set to true they disappear.
-A AWS-CONNMARK-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS CONNMARK CHAIN, VPC CIDR" -j AWS-CONNMARK-CHAIN-1
-A AWS-CONNMARK-CHAIN-1 -m comment --comment "AWS, CONNMARK" -j CONNMARK --set-xmark 0x80/0x80
-A AWS-SNAT-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-1
-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" -m addrtype ! --dst-type LOCAL -j SNAT --to-source 192.168.22.226 --random-fully
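To check these rules before and after toggling the parameter without reading them by eye, the interesting fields can be pulled out of each rule line. A rough sketch, assuming rule lines in the format shown above; the function name and the returned keys are illustrative, and only the options that appear in these four rules are handled.

```python
import shlex


def parse_rule(line):
    """Extract chain, negated destination, jump target and SNAT source
    from one rule line in the format shown above."""
    tokens = shlex.split(line)  # keeps quoted --comment text as one token
    info = {"chain": None, "excluded_dst": None,
            "target": None, "to_source": None}
    negated = False
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "!":
            # "!" negates the option that follows it.
            negated = True
            i += 1
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == "-A":
            info["chain"] = nxt
        elif tok == "-d" and negated:
            info["excluded_dst"] = nxt
        elif tok == "-j":
            info["target"] = nxt
        elif tok == "--to-source":
            info["to_source"] = nxt
        negated = False
        i += 1
    return info


if __name__ == "__main__":
    rule = ('-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" '
            '-m addrtype ! --dst-type LOCAL -j SNAT '
            '--to-source 192.168.22.226 --random-fully')
    print(parse_rule(rule))
```

For the rules above this yields the excluded destination 192.168.0.0/16 (the VPC CIDR) and the SNAT source 192.168.22.226 (the node's primary IP), which is consistent with the source address seen on the EC2 instance while the parameter was false.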
Create a node group
eksctl create nodegroup --cluster testsnat --node-type t3.xlarge --nodes 1 --nodes-max 2 --nodes-min 0 --node-volume-size 30 --node-volume-type gp3 --ssh-access --ssh-public-key cluster-key
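In principle, the request from the EC2 instance reaches the pod without a problem, but the pod's response is SNATed to the node's IP on the way back, so the access may fail.
Repeating the tests above on the new node group still shows no problem. It is currently unclear why the access failure described in the documentation does not occur.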