I have a 4-node Kubernetes cluster. My application runs with 2 replica instances, managed by a Deployment resource (which in turn creates a ReplicaSet). According to the documentation, a ReplicaSet always ensures that the specified number of application instances is running. If I delete one pod instance, it is restarted on the same or on a different node. But when I simulated the failure of a pod instance by stopping the Docker engine on one node, kubectl showed the status of that pod as Error and did not restart it on another node. Is this the expected behaviour, or am I missing something?
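For reference, a minimal sketch of the kind of setup described above. The names and image are placeholders, not taken from the question, and the apps/v1 API is assumed (older clusters from the era of this question used extensions/v1beta1):

```sh
# Create a Deployment with 2 replicas; the ReplicaSet it owns keeps
# 2 pods running across the cluster's eligible nodes.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx:1.25  # placeholder image
        ports:
        - containerPort: 80
EOF

# Check which nodes the two pods landed on.
kubectl get pods -l app=my-app -o wide
```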
Tags:
kubernetes
Just wait for about 5 minutes after bringing down the node or the Docker engine on it. Kubernetes marks the status of all the pods that were running on that node as Unknown and brings them up on the remaining active, eligible nodes. Once the failed node comes back up, the pods on that node are deleted if Kubernetes has already replaced them on other node(s).
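A rough way to observe this, and where the roughly 5-minute delay comes from. The timings below are the common defaults for clusters of that era; the flag names and defaults are set on the control plane and can vary by version, so treat them as assumptions to verify against your own setup:

```sh
# Watch the pods migrate while the node is down.
kubectl get nodes
kubectl get pods -o wide --watch

# The delay is governed by the node controller in kube-controller-manager,
# typically via flags like these (defaults shown; may differ per version):
#   --node-monitor-grace-period=40s   # node marked NotReady after this
#   --pod-eviction-timeout=5m0s       # pods on a NotReady node evicted after this
```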
AFAIS, Kubernetes changed that behaviour with version 1.5. If I interpret the docs correctly, the Pods of the failed node are still registered in the apiserver, since the node died abruptly and wasn't able to unregister them. Because the Pod is still registered, the ReplicaSet doesn't replace it.
The reason for this is that Kubernetes cannot tell whether it is a network error (e.g. a split-brain) or a node failure. With the introduction of StatefulSets, Kubernetes needs to make sure that no Pod is running more than once at a time.
This may sound like a bug, but with a properly configured cloud provider (e.g. for GCE or AWS), Kubernetes can check whether that node is still running. If you shut that node down, the controller unregisters the Node and its Pods, and a new Pod is then created on another node. Together with a node health check and node replacement, the cluster is able to heal itself.
How the cloud provider is configured depends heavily on your Kubernetes setup.
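As a minimal sketch, assuming a self-managed (e.g. kubeadm-style) cluster on AWS or GCE using the legacy in-tree cloud provider; managed offerings such as GKE or EKS already have this wired up, and the exact flags and file locations on your installation may differ:

```sh
# Enable the in-tree cloud provider on the relevant components
# (shown here as the flags they are started with):
#   kubelet ... --cloud-provider=aws                  # or gce
#   kube-controller-manager ... --cloud-provider=aws  # or gce

# With the cloud provider active, each Node gets a ProviderID, and the node
# controller can remove Nodes whose instances no longer exist in the cloud:
kubectl get nodes --watch
kubectl describe node <node-name> | grep ProviderID   # <node-name> is a placeholder
```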