I'm playing with Kubernetes inside 3 VirtualBox VMs running CentOS 7: 1 master and 2 minions. The installation manuals say that every service will be accessible from every node and every pod will see all other pods, but I don't see this happening: I can access a service only from the node where its pod runs. Please help me find out what I'm missing; I'm very new to Kubernetes.
Every VM has 2 adapters: NAT and Host-only. The second one has IPs in the 10.0.13.101-254 range.
- Master-1: 10.0.13.104
- Minion-1: 10.0.13.105
- Minion-2: 10.0.13.106
Get all pods from master:
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                               READY     STATUS    RESTARTS   AGE
default       busybox                            1/1       Running   4          37m
default       nginx-demo-2867147694-f6f9m        1/1       Running   1          52m
default       nginx-demo2-2631277934-v4ggr       1/1       Running   0          5s
kube-system   etcd-master-1                      1/1       Running   1          1h
kube-system   kube-apiserver-master-1            1/1       Running   1          1h
kube-system   kube-controller-manager-master-1   1/1       Running   1          1h
kube-system   kube-dns-2425271678-kgb7k          3/3       Running   3          1h
kube-system   kube-flannel-ds-pwsq4              2/2       Running   4          56m
kube-system   kube-flannel-ds-qswt7              2/2       Running   4          1h
kube-system   kube-flannel-ds-z0g8c              2/2       Running   12         56m
kube-system   kube-proxy-0lfw0                   1/1       Running   2          56m
kube-system   kube-proxy-6263z                   1/1       Running   2          56m
kube-system   kube-proxy-b8hc3                   1/1       Running   1          1h
kube-system   kube-scheduler-master-1            1/1       Running   1          1h
Get all services:
$ kubectl get services
NAME          CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes    10.96.0.1       <none>        443/TCP   1h
nginx-demo    10.104.34.229   <none>        80/TCP    51m
nginx-demo2   10.102.145.89   <none>        80/TCP    3s
Get Nginx pods IP info:
$ kubectl get pod nginx-demo-2867147694-f6f9m -o json | grep IP
"hostIP": "10.0.13.105",
"podIP": "10.244.1.58",
$ kubectl get pod nginx-demo2-2631277934-v4ggr -o json | grep IP
"hostIP": "10.0.13.106",
"podIP": "10.244.2.14",
As you can see, one Nginx instance runs on the first minion and the second one on the second minion.
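For reference, the same placement can be seen in a single listing that also prints each pod's IP and node:
$ kubectl get pods -o wide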
The problem: I can access nginx-demo only from node 10.0.13.105 (by both Pod IP and Service IP), with curl:
curl 10.244.1.58:80
curl 10.104.34.229:80
and nginx-demo2 only from 10.0.13.106, not vice versa and not from the master node. Busybox runs on node 10.0.13.105, so it can reach nginx-demo but not nginx-demo2.
How do I make services accessible regardless of the node? Is flannel misconfigured?
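To make the failure concrete, this is roughly the cross-node check from inside the busybox pod (a minimal sketch using the pod IPs listed above):
$ kubectl exec busybox -- wget -qO- http://10.244.1.58    # pod on the same node: returns the nginx page
$ kubectl exec busybox -- wget -qO- http://10.244.2.14    # pod on the other node: hangs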
Routing table on master:
$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.2.2        0.0.0.0         UG    100    0        0 enp0s3
10.0.2.0        0.0.0.0         255.255.255.0   U     100    0        0 enp0s3
10.0.13.0       0.0.0.0         255.255.255.0   U     100    0        0 enp0s8
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.0.0      0.0.0.0         255.255.0.0     U     0      0        0 flannel.1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
Routing table on minion-1:
# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.2.2        0.0.0.0         UG    100    0        0 enp0s3
10.0.2.0        0.0.0.0         255.255.255.0   U     100    0        0 enp0s3
10.0.13.0       0.0.0.0         255.255.255.0   U     100    0        0 enp0s8
10.244.0.0      0.0.0.0         255.255.0.0     U     0      0        0 flannel.1
10.244.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
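One extra diagnostic that fits here, run on minion-1: ask the kernel which route it would pick for the remote pod's IP (given the table above, it should choose flannel.1):
# ip route get 10.244.2.14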
Maybe the default gateway (the NAT adapter) is the problem? Another issue: from the Busybox container I also try to resolve service names via DNS, and that doesn't work either:
$ kubectl run -i --tty busybox --image=busybox --generator="run-pod/v1"
If you don't see a command prompt, try pressing enter.
/ #
/ # nslookup nginx-demo
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'nginx-demo'
/ #
/ # nslookup nginx-demo.default.svc.cluster.local
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'nginx-demo.default.svc.cluster.local'
Guest OS security has already been relaxed:
setenforce 0
systemctl stop firewalld
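For completeness, the checks I use to confirm that state; the last line is only a hunch, since Docker >= 1.13 can set the FORWARD chain policy to DROP even with firewalld stopped:
# getenforce                      # Permissive or Disabled
# systemctl is-active firewalld   # inactive
# iptables -S FORWARD | head -1   # prints the chain policy (-P FORWARD ACCEPT or DROP)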
Feel free to ask for more info if you need it.
Additional info
kube-dns logs:
$ kubectl -n kube-system logs kube-dns-2425271678-kgb7k kubedns
I0919 07:48:45.000397 1 dns.go:48] version: 1.14.3-4-gee838f6
I0919 07:48:45.114060 1 server.go:70] Using configuration read from directory: /kube-dns-config with period 10s
I0919 07:48:45.114129 1 server.go:113] FLAG: --alsologtostderr="false"
I0919 07:48:45.114144 1 server.go:113] FLAG: --config-dir="/kube-dns-config"
I0919 07:48:45.114155 1 server.go:113] FLAG: --config-map=""
I0919 07:48:45.114162 1 server.go:113] FLAG: --config-map-namespace="kube-system"
I0919 07:48:45.114169 1 server.go:113] FLAG: --config-period="10s"
I0919 07:48:45.114179 1 server.go:113] FLAG: --dns-bind-address="0.0.0.0"
I0919 07:48:45.114186 1 server.go:113] FLAG: --dns-port="10053"
I0919 07:48:45.114196 1 server.go:113] FLAG: --domain="cluster.local."
I0919 07:48:45.114206 1 server.go:113] FLAG: --federations=""
I0919 07:48:45.114215 1 server.go:113] FLAG: --healthz-port="8081"
I0919 07:48:45.114223 1 server.go:113] FLAG: --initial-sync-timeout="1m0s"
I0919 07:48:45.114230 1 server.go:113] FLAG: --kube-master-url=""
I0919 07:48:45.114238 1 server.go:113] FLAG: --kubecfg-file=""
I0919 07:48:45.114245 1 server.go:113] FLAG: --log-backtrace-at=":0"
I0919 07:48:45.114256 1 server.go:113] FLAG: --log-dir=""
I0919 07:48:45.114264 1 server.go:113] FLAG: --log-flush-frequency="5s"
I0919 07:48:45.114271 1 server.go:113] FLAG: --logtostderr="true"
I0919 07:48:45.114278 1 server.go:113] FLAG: --nameservers=""
I0919 07:48:45.114285 1 server.go:113] FLAG: --stderrthreshold="2"
I0919 07:48:45.114292 1 server.go:113] FLAG: --v="2"
I0919 07:48:45.114299 1 server.go:113] FLAG: --version="false"
I0919 07:48:45.114310 1 server.go:113] FLAG: --vmodule=""
I0919 07:48:45.116894 1 server.go:176] Starting SkyDNS server (0.0.0.0:10053)
I0919 07:48:45.117296 1 server.go:198] Skydns metrics enabled (/metrics:10055)
I0919 07:48:45.117329 1 dns.go:147] Starting endpointsController
I0919 07:48:45.117336 1 dns.go:150] Starting serviceController
I0919 07:48:45.117702 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0919 07:48:45.117716 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0919 07:48:45.620177 1 dns.go:171] Initialized services and endpoints from apiserver
I0919 07:48:45.620217 1 server.go:129] Setting up Healthz Handler (/readiness)
I0919 07:48:45.620229 1 server.go:134] Setting up cache handler (/cache)
I0919 07:48:45.620238 1 server.go:120] Status HTTP port 8081
$ kubectl -n kube-system logs kube-dns-2425271678-kgb7k dnsmasq
I0919 07:48:48.466499 1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0919 07:48:48.478353 1 nanny.go:86] Starting dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0919 07:48:48.697877 1 nanny.go:111]
W0919 07:48:48.697903 1 nanny.go:112] Got EOF from stdout
I0919 07:48:48.697925 1 nanny.go:108] dnsmasq[10]: started, version 2.76 cachesize 1000
I0919 07:48:48.697937 1 nanny.go:108] dnsmasq[10]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0919 07:48:48.697943 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0919 07:48:48.697947 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0919 07:48:48.697950 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0919 07:48:48.697955 1 nanny.go:108] dnsmasq[10]: reading /etc/resolv.conf
I0919 07:48:48.697959 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0919 07:48:48.697962 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0919 07:48:48.697965 1 nanny.go:108] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0919 07:48:48.697968 1 nanny.go:108] dnsmasq[10]: using nameserver 85.254.193.137#53
I0919 07:48:48.697971 1 nanny.go:108] dnsmasq[10]: using nameserver 92.240.64.23#53
I0919 07:48:48.697975 1 nanny.go:108] dnsmasq[10]: read /etc/hosts - 7 addresses
$ kubectl -n kube-system logs kube-dns-2425271678-kgb7k sidecar
ERROR: logging before flag.Parse: I0919 07:48:49.990468 1 main.go:48] Version v1.14.3-4-gee838f6
ERROR: logging before flag.Parse: I0919 07:48:49.994335 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
ERROR: logging before flag.Parse: I0919 07:48:49.994369 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: I0919 07:48:49.994435 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
kube-flannel logs from one pod (the other flannel pods look similar):
$ kubectl -n kube-system logs kube-flannel-ds-674mx kube-flannel
I0919 08:07:41.577954 1 main.go:446] Determining IP address of default interface
I0919 08:07:41.579363 1 main.go:459] Using interface with name enp0s3 and address 10.0.2.15
I0919 08:07:41.579408 1 main.go:476] Defaulting external address to interface address (10.0.2.15)
I0919 08:07:41.600985 1 kube.go:130] Waiting 10m0s for node controller to sync
I0919 08:07:41.601032 1 kube.go:283] Starting kube subnet manager
I0919 08:07:42.601553 1 kube.go:137] Node controller sync successful
I0919 08:07:42.601959 1 main.go:226] Created subnet manager: Kubernetes Subnet Manager - minion-1
I0919 08:07:42.601966 1 main.go:229] Installing signal handlers
I0919 08:07:42.602036 1 main.go:330] Found network config - Backend type: vxlan
I0919 08:07:42.606970 1 ipmasq.go:51] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0919 08:07:42.608380 1 ipmasq.go:51] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
I0919 08:07:42.609579 1 ipmasq.go:51] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
I0919 08:07:42.611257 1 ipmasq.go:51] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE
I0919 08:07:42.612595 1 main.go:279] Wrote subnet file to /run/flannel/subnet.env
I0919 08:07:42.612606 1 main.go:284] Finished starting backend.
I0919 08:07:42.612638 1 vxlan_network.go:56] Watching for L3 misses
I0919 08:07:42.612651 1 vxlan_network.go:64] Watching for new subnet leases
$ kubectl -n kube-system logs kube-flannel-ds-674mx install-cni
+ cp -f /etc/kube-flannel/cni-conf.json /etc/cni/net.d/10-flannel.conf
+ true
+ sleep 3600
+ true
+ sleep 3600
I've added some more services and exposed them with type NodePort (roughly as sketched below); the port scans that follow are from the host machine:
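For context, the extra services were exposed roughly like this (the service name here is made up; only the deployment name comes from the listing above):
$ kubectl expose deployment nginx-demo --name=nginx-demo-nodeport --port=80 --type=NodePort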
# nmap 10.0.13.104 -p1-50000
Starting Nmap 7.60 ( https://nmap.org ) at 2017-09-19 12:20 EEST
Nmap scan report for 10.0.13.104
Host is up (0.0014s latency).
Not shown: 49992 closed ports
PORT STATE SERVICE
22/tcp open ssh
6443/tcp open sun-sr-https
10250/tcp open unknown
10255/tcp open unknown
10256/tcp open unknown
30029/tcp filtered unknown
31844/tcp filtered unknown
32619/tcp filtered unknown
MAC Address: 08:00:27:90:26:1C (Oracle VirtualBox virtual NIC)
Nmap done: 1 IP address (1 host up) scanned in 1.96 seconds
# nmap 10.0.13.105 -p1-50000
Starting Nmap 7.60 ( https://nmap.org ) at 2017-09-19 12:20 EEST
Nmap scan report for 10.0.13.105
Host is up (0.00040s latency).
Not shown: 49993 closed ports
PORT STATE SERVICE
22/tcp open ssh
10250/tcp open unknown
10255/tcp open unknown
10256/tcp open unknown
30029/tcp open unknown
31844/tcp open unknown
32619/tcp filtered unknown
MAC Address: 08:00:27:F8:E3:71 (Oracle VirtualBox virtual NIC)
Nmap done: 1 IP address (1 host up) scanned in 1.87 seconds
# nmap 10.0.13.106 -p1-50000
Starting Nmap 7.60 ( https://nmap.org ) at 2017-09-19 12:21 EEST
Nmap scan report for 10.0.13.106
Host is up (0.00059s latency).
Not shown: 49993 closed ports
PORT STATE SERVICE
22/tcp open ssh
10250/tcp open unknown
10255/tcp open unknown
10256/tcp open unknown
30029/tcp filtered unknown
31844/tcp filtered unknown
32619/tcp open unknown
MAC Address: 08:00:27:D9:33:32 (Oracle VirtualBox virtual NIC)
Nmap done: 1 IP address (1 host up) scanned in 1.92 seconds
If we take the last service, on port 32619: it exists on every node, but it is open only on the node where its pod runs; on the other nodes it shows as filtered.
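A check I would run on each node to confirm kube-proxy actually programmed that NodePort (assuming kube-proxy is in its default iptables mode):
# iptables-save -t nat | grep 32619    # should show KUBE-NODEPORTS entries on every node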
tcpdump info on Minion-1
Connection from the host to Minion-1 with curl 10.0.13.105:30572:
# tcpdump -ni enp0s8 tcp or icmp and not port 22 and not host 10.0.13.104
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s8, link-type EN10MB (Ethernet), capture size 262144 bytes
13:11:39.043874 IP 10.0.13.1.41132 > 10.0.13.105.30572: Flags [S], seq 657506957, win 29200, options [mss 1460,sackOK,TS val 504213496 ecr 0,nop,wscale 7], length 0
13:11:39.045218 IP 10.0.13.105 > 10.0.13.1: ICMP time exceeded in-transit, length 68
On the flannel.1 interface:
# tcpdump -ni flannel.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:11:49.499148 IP 10.244.1.0.41134 > 10.244.2.38.http: Flags [S], seq 2858453268, win 29200, options [mss 1460,sackOK,TS val 504216633 ecr 0,nop,wscale 7], length 0
13:11:49.499074 IP 10.244.1.0.41134 > 10.244.2.38.http: Flags [S], seq 2858453268, win 29200, options [mss 1460,sackOK,TS val 504216633 ecr 0,nop,wscale 7], length 0
13:11:49.499239 IP 10.244.1.0.41134 > 10.244.2.38.http: Flags [S], seq 2858453268, win 29200, options [mss 1460,sackOK,TS val 504216633 ecr 0,nop,wscale 7], length 0
13:11:49.499074 IP 10.244.1.0.41134 > 10.244.2.38.http: Flags [S], seq 2858453268, win 29200, options [mss 1460,sackOK,TS val 504216633 ecr 0,nop,wscale 7], length 0
13:11:49.499247 IP 10.244.1.0.41134 > 10.244.2.38.http: Flags [S], seq 2858453268, win 29200, options [mss 1460,sackOK,TS val 504216633 ecr 0,nop,wscale 7], length 0
.. ICMP time exceeded in-transit
Only SYN packets and ICMP "time exceeded in-transit" errors, so there is no connectivity between the pod networks on different nodes, whereas curl 10.0.13.106:30572 (the node the pod actually runs on) works.
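Since the SYNs die on flannel.1, two generic VXLAN checks that might narrow things down; the remote-node IPs listed in these entries should be addresses that are actually reachable from minion-1:
# bridge fdb show dev flannel.1    # remote VTEP MAC -> node IP mappings programmed by flannel
# ip neigh show dev flannel.1      # neighbour entries for the remote flannel.1 addresses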
Minion-1 interfaces
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:35:72:ab brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
valid_lft 77769sec preferred_lft 77769sec
inet6 fe80::772d:2128:6aaa:2355/64 scope link
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:f8:e3:71 brd ff:ff:ff:ff:ff:ff
inet 10.0.13.105/24 brd 10.0.13.255 scope global dynamic enp0s8
valid_lft 1089sec preferred_lft 1089sec
inet6 fe80::1fe0:dba7:110d:d673/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::f04f:5413:2d27:ab55/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 02:42:59:53:d7:fd brd ff:ff:ff:ff:ff:ff
inet 10.244.1.2/24 scope global docker0
valid_lft forever preferred_lft forever
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN
link/ether fa:d3:3e:3e:77:19 brd ff:ff:ff:ff:ff:ff
inet 10.244.1.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::f8d3:3eff:fe3e:7719/64 scope link
valid_lft forever preferred_lft forever
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP qlen 1000
link/ether 0a:58:0a:f4:01:01 brd ff:ff:ff:ff:ff:ff
inet 10.244.1.1/24 scope global cni0
valid_lft forever preferred_lft forever
inet6 fe80::c4f9:96ff:fed8:8cb6/64 scope link
valid_lft forever preferred_lft forever
13: veth5e2971fe@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP
link/ether 1e:70:5d:6c:55:33 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::1c70:5dff:fe6c:5533/64 scope link
valid_lft forever preferred_lft forever
14: veth8f004069@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP
link/ether ca:39:96:59:e6:63 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::c839:96ff:fe59:e663/64 scope link
valid_lft forever preferred_lft forever
15: veth5742dc0d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP
link/ether c2:48:fa:41:5d:67 brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::c048:faff:fe41:5d67/64 scope link
valid_lft forever preferred_lft forever
It works either by disabling the firewall or by running the command shown below.
I found this open bug while searching; it looks like it is related to docker >= 1.13 and flannel.
Refer: https://github.com/coreos/flannel/issues/799
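The command in question is presumably the workaround discussed in that issue: Docker >= 1.13 switches the default iptables FORWARD policy to DROP, which blocks forwarded pod traffic. Treat this as my reading of the issue rather than a verified fix:
# iptables -P FORWARD ACCEPT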
I am not great with networking, but we were in the same situation as you: we set up four virtual machines, one as master and the rest as worker nodes. I tried to nslookup a service from a container in a pod, but it failed to resolve, stuck waiting for a response from the Kubernetes DNS. I realized that the DNS configuration or the network component was not right, so I looked into the logs of canal (the CNI we use to establish the Kubernetes network) and found that it was initialized with the default interface, which is the one used by NAT rather than the host-only one, as in the manifest below. We then rectified it, and it works now.
https://raw.githubusercontent.com/projectcalico/canal/master/k8s-install/1.7/canal.yaml
Not sure which CNI you use, but I hope this helps you check yours.
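In case it helps, here is a sketch of what fixing the interface selection can look like with flannel; the DaemonSet name and args below match the stock kube-flannel manifest of that era (verify against your own), and enp0s8 is the host-only adapter from this question:
$ kubectl -n kube-system edit daemonset kube-flannel-ds
# in the kube-flannel container, add an explicit interface to the args:
#   args:
#   - --ip-masq
#   - --kube-subnet-mgr
#   - --iface=enp0s8        # use the host-only adapter instead of the default (NAT) one
$ kubectl -n kube-system delete pod -l app=flannel    # recreate the pods so they pick up the flag
For canal, I believe the equivalent knob is the canal_iface value in its ConfigMap.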