So I've got a kubernets cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As with the guide, I've setup a service network 10.3.0.0/16
and a pod network 10.2.0.0/16
. Pod network seems fine as busybox and curl containers get IPs. But the services network has problems. Originally, I've encountered this when deploying kube-dns
: the service IP 10.3.0.1
couldn't be reached, so kube-dns couldn't start all containers and DNS was ultimately not working.
From within the curl pod, I can reproduce the issue:
[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems ok that there's only a default route in the container. As I understood it, the request (to default route) should be intercepted by the kube-proxy
on the worker node, forwarded to the the proxy on the master node where the IP is translated via iptables to the masters public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a real hard time to troubleshoot, as I lack the understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow and how to best debug this.
So here're the questions I have:
- What is the 1st IP of the service network (10.3.0.1 in this case) used for?
- Is above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
- What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!
I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
Service IPs and DNS started working immediately afterwards.
The Sevice network provides fixed IPs for Services. It is not a routeable network (so don't expect
ip ro
to show anything nor will ping work) but a collection iptables rules managed by kube-proxy on each node (seeiptables -L; iptables -t nat -L
on the nodes, not Pods). These virtual IPs (see the pics!) act as load balancing proxy for endpoints (kubectl get ep
), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (
kubectl describe svc kubernetes
).Troubleshooting is different on each network/cluster setup. I would generally check:
/etc/kubernetes/manifests/kube-proxy.yaml
userspace
mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append--proxy-mode=userspace
as a parameter on each nodeIf you leave comments I will get back to you..
I had the same issue, turned out to be a configuration issue in kube-proxy.yaml For the "master" parameter I had the ip address as in - --master=192.168.3.240 but it actually required to be a url like - --master=https://192.168.3.240
FYI my kube-proxy sucessfully uses --proxy-mode=iptables (v1.6.x)