Pods stuck in PodInitializing state indefinitely

2019-08-28 12:51发布

I've got a k8s cronjob that consists of an init container and a one pod container. If the init container fails, the Pod in the main container never gets started, and stays in "PodInitializing" indefinitely.

My intent is for the job to fail if the init container fails.

---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: job-name
  namespace: default
  labels:
    run: job-name
spec:
  schedule: "15 23 * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: "Forbid"
  successfulJobsHistoryLimit: 30
  failedJobsHistoryLimit: 10
  jobTemplate:
    spec:
      # only try twice
      backoffLimit: 2
      activeDeadlineSeconds: 60
      template:
        spec:
          initContainers:
          - name: init-name
            image: init-image:1.0
          restartPolicy: Never
          containers:
          - name: some-name
            image: someimage:1.0
          restartPolicy: Never

a kubectl on the pod that's stuck results in:

Name:               job-name-1542237120-rgvzl
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               my-node-98afffbf-0psc/10.0.0.0
Start Time:         Wed, 14 Nov 2018 23:12:16 +0000
Labels:             controller-uid=ID
                    job-name=job-name-1542237120
Annotations:        kubernetes.io/limit-ranger:
                      LimitRanger plugin set: cpu request for container elasticsearch-metrics; cpu request for init container elasticsearch-repo-setup; cpu requ...
Status:             Failed
IP:                 10.0.0.0
Controlled By:      Job/job-1542237120
Init Containers:
init-container-name:
    Container ID:  docker://ID
    Image:         init-image:1.0
    Image ID:      init-imageID
    Port:          <none>
    Host Port:     <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 14 Nov 2018 23:12:21 +0000
      Finished:     Wed, 14 Nov 2018 23:12:32 +0000
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wwl5n (ro)
Containers:
  some-name:
    Container ID:  
    Image:         someimage:1.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wwl5n (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True

3条回答
Summer. ? 凉城
2楼-- · 2019-08-28 13:15

To try and figure this out I would run the command:

kubectl get pods - Add the namespace param if required.

Then copy the pod name and run:

kubectl describe pod {POD_NAME}

That should give you some information as to why it's stuck in the initializing state.

查看更多
SAY GOODBYE
3楼-- · 2019-08-28 13:19

I think that you could miss that it is the expected behavior of the init containers. The rule is that in case of initContainers failure a Pod will not restart if restartPolicy is set to Never otherwise the Kubernetes will keep restarting it until it succeeds.

Also:

If the init container fails, the Pod in the main container never gets started, and stays in "PodInitializing" indefinitely.

According to documentation:

A Pod cannot be Ready until all Init Containers have succeeded. The ports on an Init Container are not aggregated under a service. A Pod that is initializing is in the Pending state but should have a condition Initializing set to true.

*I can see that you tried to change this behavior, but I am not sure if you can do that with CronJob, I saw examples with Jobs. But I am just theorizing, and if this post did not help you solve your issue I can try to recreate it in lab environment.

查看更多
我欲成王,谁敢阻挡
4楼-- · 2019-08-28 13:19

Since you have already figured out that initcontainers are meant to run to completion, successfully. If you can't get rid of init containers, what i would do in this case is to make sure that the init container ends successfully all the time. The result of the init container can be written in an emptydir volume, something like a status file, shared by both your init container and your work container. I would delegate to the work container the responsibility of deciding what to do in case the init container ends unsuccessfully.

查看更多
登录 后发表回答