I deployed a few VMs using Vagrant to test Kubernetes:
master: 4 CPUs, 4GB RAM
node-1: 4 CPUs, 8GB RAM
Base image: centos/7.
Networking: Bridged.
Host OS: CentOS 7.2
I deployed Kubernetes with kubeadm by following the kubeadm getting started guide. After adding the node to the cluster and installing Weave Net, I cannot get kube-dns up and running; it stays in the ContainerCreating state:
[vagrant@master ~]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY   STATUS              RESTARTS   AGE
kube-system   etcd-master                      1/1     Running             0          1h
kube-system   kube-apiserver-master            1/1     Running             0          1h
kube-system   kube-controller-manager-master   1/1     Running             0          1h
kube-system   kube-discovery-982812725-0tiiy   1/1     Running             0          1h
kube-system   kube-dns-2247936740-46rcz        0/3     ContainerCreating   0          1h
kube-system   kube-proxy-amd64-4d8s7           1/1     Running             0          1h
kube-system   kube-proxy-amd64-sqea1           1/1     Running             0          1h
kube-system   kube-scheduler-master            1/1     Running             0          1h
kube-system   weave-net-h1om2                  2/2     Running             0          1h
kube-system   weave-net-khebq                  1/2     CrashLoopBackOff    17         1h
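To make the unhealthy pods stand out in a listing like this, the output can be filtered on the READY column. The following is only a rough sketch: the function name not_ready_pods is made up for illustration, and it assumes the column order shown in the listing (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read "kubectl get pods --all-namespaces" output on stdin
# and print namespace, pod name and status for every pod whose READY column
# is not of the form n/n. The header row (first line) is skipped.
not_ready_pods() {
  awk 'NR > 1 { split($3, r, "/"); if (r[1] != r[2]) print $1, $2, $4 }'
}

# Sample input: header plus three rows copied from the listing above.
not_ready_pods <<'EOF'
NAMESPACE     NAME                        READY   STATUS              RESTARTS   AGE
kube-system   kube-dns-2247936740-46rcz   0/3     ContainerCreating   0          1h
kube-system   weave-net-h1om2             2/2     Running             0          1h
kube-system   weave-net-khebq             1/2     CrashLoopBackOff    17         1h
EOF
```

Against a live cluster the same filter would be used as `kubectl get pods --all-namespaces | not_ready_pods`. For the sample rows it prints only the kube-dns and weave-net-khebq entries.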
I assume the problem is related to the weave-net pod on node-1, which is in the CrashLoopBackOff state:
[vagrant@master ~]$ kubectl describe pods --namespace=kube-system weave-net-khebq
Name: weave-net-khebq
Namespace: kube-system
Node: node-1/10.0.2.15
Start Time: Wed, 05 Oct 2016 07:10:39 +0000
Labels: name=weave-net
Status: Running
IP: 10.0.2.15
Controllers: DaemonSet/weave-net
Containers:
weave:
Container ID: docker://4976cd0ec6f971397aaf7fbfd746ca559322ab3d8f4ee217dd6c8bd3f6ed4f76
Image: weaveworks/weave-kube:1.7.0
Image ID: docker://sha256:1ac5304168bd9dd35c0ecaeb85d77d26c13a7d077aa8629b2a1b4e354cdffa1a
Port:
Command:
/home/weave/launch.sh
Requests:
cpu: 10m
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 05 Oct 2016 08:18:51 +0000
Finished: Wed, 05 Oct 2016 08:18:51 +0000
Ready: False
Restart Count: 18
Liveness: http-get http://127.0.0.1:6784/status delay=30s timeout=1s period=10s #success=1 #failure=3
Volume Mounts:
/etc from cni-conf (rw)
/host_home from cni-bin2 (rw)
/opt from cni-bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kir36 (ro)
/weavedb from weavedb (rw)
Environment Variables:
WEAVE_VERSION: 1.7.0
weave-npc:
Container ID: docker://feef7e7436d2565182d99c9021958619f65aff591c576a0c240ac0adf9c66a0b
Image: weaveworks/weave-npc:1.7.0
Image ID: docker://sha256:4d7f0bd7c0e63517a675e352146af7687a206153e66bdb3d8c7caeb54802b16a
Port:
Requests:
cpu: 10m
State: Running
Started: Wed, 05 Oct 2016 07:11:04 +0000
Ready: True
Restart Count: 0
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kir36 (ro)
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
weavedb:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
cni-bin:
Type: HostPath (bare host directory volume)
Path: /opt
cni-bin2:
Type: HostPath (bare host directory volume)
Path: /home
cni-conf:
Type: HostPath (bare host directory volume)
Path: /etc
default-token-kir36:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-kir36
QoS Class: Burstable
Tolerations: dedicated=master:Equal:NoSchedule
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 3m 19 {kubelet node-1} spec.containers{weave} Normal Pulling pulling image "weaveworks/weave-kube:1.7.0"
1h 3m 19 {kubelet node-1} spec.containers{weave} Normal Pulled Successfully pulled image "weaveworks/weave-kube:1.7.0"
55m 3m 11 {kubelet node-1} spec.containers{weave} Normal Created (events with common reason combined)
55m 3m 11 {kubelet node-1} spec.containers{weave} Normal Started (events with common reason combined)
1h 14s 328 {kubelet node-1} spec.containers{weave} Warning BackOff Back-off restarting failed docker container
1h 14s 300 {kubelet node-1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "weave" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=weave pod=weave-net-khebq_kube-system(d1feb9c1-8aca-11e6-8d4f-525400c583ad)"
Listing the containers running on node-1 gives:
[vagrant@node-1 ~]$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
feef7e7436d2 weaveworks/weave-npc:1.7.0 "/usr/bin/weave-npc" About an hour ago Up About an hour k8s_weave-npc.e6299282_weave-net-khebq_kube-system_d1feb9c1-8aca-11e6-8d4f-525400c583ad_0f0517cf
762cd80d491e gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Up About an hour k8s_POD.d8dbe16c_weave-net-khebq_kube-system_d1feb9c1-8aca-11e6-8d4f-525400c583ad_cda766ac
8c3395959d0e gcr.io/google_containers/kube-proxy-amd64:v1.4.0 "/usr/local/bin/kube-" About an hour ago Up About an hour k8s_kube-proxy.64a0bb96_kube-proxy-amd64-4d8s7_kube-system_909e6ae1-8aca-11e6-8d4f-525400c583ad_48e7eb9a
d0fbb716bbf3 gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Up About an hour k8s_POD.d8dbe16c_kube-proxy-amd64-4d8s7_kube-system_909e6ae1-8aca-11e6-8d4f-525400c583ad_d6b232ea
The logs for the first container in that list (feef7e7436d2, weave-npc) show some connection errors:
[vagrant@node-1 ~]$ sudo docker logs feef7e7436d2
E1005 08:46:06.368703 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:154: Failed to list *api.Pod: Get https://100.64.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:06.370119 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:155: Failed to list *extensions.NetworkPolicy: Get https://100.64.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:06.473779 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:153: Failed to list *api.Namespace: Get https://100.64.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.370451 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:154: Failed to list *api.Pod: Get https://100.64.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.371308 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:155: Failed to list *extensions.NetworkPolicy: Get https://100.64.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E1005 08:46:07.474991 1 reflector.go:214] /home/awh/workspace/weave-npc/cmd/weave-npc/main.go:153: Failed to list *api.Namespace: Get https://100.64.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
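Note that these lines come from the weave-npc container. The container that is actually in CrashLoopBackOff is weave, which is not in the docker ps listing because it is not running at that moment. A minimal sketch of how its last output could be fetched from the master, assuming the standard -c (container) and --previous options of kubectl logs; the function name previous_logs is made up for illustration:

```shell
# Hypothetical helper: print the logs of the previous (terminated) instance
# of one container in a pod.
#   $1 = namespace, $2 = pod name, $3 = container name
previous_logs() {
  kubectl logs --namespace="$1" "$2" -c "$3" --previous
}

# Usage for the crashing container described above:
#   previous_logs kube-system weave-net-khebq weave
```

On node-1 itself, `sudo docker ps -a` should also list the exited weave containers, whose IDs can then be passed to `sudo docker logs`.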
I lack the experience with Kubernetes and container networking to troubleshoot these issues further, so any hints are very much appreciated.
Observation: all pods and nodes report their IP as 10.0.2.15, which is the local Vagrant NAT address, not the actual IP address of the VMs.
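The node address that each pod reports can be pulled out of the describe output to check this per pod. This is only a sketch: the function name pod_nodes is made up for illustration, and it assumes the "Name:" and "Node:" field labels shown in the describe output earlier in this post.

```shell
# Hypothetical helper: read "kubectl describe pods" output on stdin and print
# each pod name together with the node/IP it reports.
pod_nodes() {
  awk '$1 == "Name:" { name = $2 } $1 == "Node:" { print name, $2 }'
}

# Sample input: the first three lines of the describe output shown earlier.
pod_nodes <<'EOF'
Name: weave-net-khebq
Namespace: kube-system
Node: node-1/10.0.2.15
EOF
```

Against the cluster this would be run as `kubectl describe pods --namespace=kube-system | pod_nodes`.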