Kubernetes deployment cannot mount volume despite

Posted 2019-09-11 07:04

Question:

I have a Kubernetes deployment where a pod should mount a PD.

Under spec.template.spec.containers[*] I have this:

    volumeMounts:
      - name: app-volume
        mountPath: /mnt/disk/app-pd

and this under spec.template.spec:

    volumes:
      - name: app-volume
        gcePersistentDisk:
          pdName: app-pd
          fsType: ext4
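
For reference, here is a minimal sketch of how the two fragments above fit together in a single Deployment manifest; the apiVersion, names, labels, and image are placeholders of my own, not taken from the original setup. A gcePersistentDisk volume can only be mounted read-write by a single node, so a single replica is assumed:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: app
    spec:
      replicas: 1                  # a PD can be mounted read-write by one node only
      selector:
        matchLabels:
          app: app
      template:
        metadata:
          labels:
            app: app
        spec:
          containers:
            - name: app
              image: gcr.io/my-project/app:latest   # placeholder image
              volumeMounts:
                - name: app-volume
                  mountPath: /mnt/disk/app-pd
          volumes:
            - name: app-volume
              gcePersistentDisk:
                pdName: app-pd
                fsType: ext4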

app-pd is a GCE persistent disk that holds a single ext4 file system (hence no partitions). If I run kubectl create, kubectl describe pod shows these error messages:

Warning FailedMount Unable to mount volumes for pod "<id>": 
  timeout expired waiting for volumes to attach/mount for pod"<id>"/"default". 
  list of unattached/unmounted volumes=[app-volume]
Warning FailedSync Error syncing pod, skipping: 
  timeout expired waiting for volumes to attach/mount for pod "<id>"/"default". 
  list of unattached/unmounted volumes=[app-volume]

On the VM instance that runs the pod, /var/log/kubelet.log contains repetitions of these error messages, which are presumably related to, or even the cause of, the above:

reconciler.go:179] 
  VerifyControllerAttachedVolume operation started for volume "kubernetes.io/gce-pd/<id>"
  (spec.Name: "<id>") pod "<id>" (UID: "<id>")
goroutinemap.go:155] 
  Operation for "kubernetes.io/gce-pd/<id>" failed. 
  No retries permitted until <date> (durationBeforeRetry 2m0s). 
  error: Volume "kubernetes.io/gce-pd/<id>" (spec.Name: "<id>") pod "<id>" (UID: "<id>") 
    is not yet attached according to node status.
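
Since the kubelet claims the volume "is not yet attached according to node status", one thing worth checking (a sketch; <node-name> and <pod-name> are placeholders) is what the node object itself reports as attached:

    # List the nodes, then inspect the one hosting the pod.
    kubectl get nodes
    # What the API server records as attached to the node (this is what the kubelet compares against):
    kubectl get node <node-name> -o yaml | grep -A 5 volumesAttached
    kubectl get node <node-name> -o yaml | grep -A 5 volumesInUse
    # The pod's events repeat the FailedMount reason seen above:
    kubectl describe pod <pod-name>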

However, if I attach the PD to the VM instance that runs the pod with gcloud compute instances attach-disk and then gcloud compute ssh into it, I can see that the following device file has been created:

/dev/disk/by-id/google-persistent-disk-1

If I then mount it (the PD), I can see and work with the expected files.
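
Roughly the manual check described above, as shell commands (a sketch; <instance-name> and <zone> are placeholders, and the disk should be detached again afterwards so that Kubernetes can manage the attachment itself):

    gcloud compute instances attach-disk <instance-name> --disk app-pd --zone <zone>
    gcloud compute ssh <instance-name> --zone <zone>
    # ...on the instance:
    ls -l /dev/disk/by-id/
    sudo mkdir -p /mnt/test
    sudo mount /dev/disk/by-id/google-persistent-disk-1 /mnt/test
    ls /mnt/test
    sudo umount /mnt/test
    # ...back on the workstation, detach again:
    gcloud compute instances detach-disk <instance-name> --disk app-pd --zone <zone>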

How can I further diagnose this problem and ultimately resolve it?

Could the problem be that the device file is called /dev/disk/by-id/google-persistent-disk-1 instead of /dev/disk/by-id/google-<id>, as it would be if I had attached the disk from the Cloud Console UI?

UPDATE I've simplified the setup by formatting the disk with a single ext4 file system (hence no partitions) and edited the description above accordingly. I've also added more specific error indications from kubelet.log.

UPDATE The problem also remains if I manually add the PD (in the Cloud Console UI) before deployment to the instance VM that will host the pod. Both the PD and the instance VM are in the same zone.
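
One way to double-check the zone and attachment state of the disk (a sketch; <zone> and <instance-name> are placeholders):

    # Shows the disk's zone, status and, if attached, the instance(s) using it.
    gcloud compute disks describe app-pd --zone <zone>
    # Shows which disks the instance currently has attached.
    gcloud compute instances describe <instance-name> --zone <zone> \
        --format="value(disks[].source)"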

UPDATE The observed difference in block device names for the same persistent disk is normal according to GCE #211.

Answer 1:

I don't know why (yet), but deleting and then recreating the GKE cluster before deploying apparently solved the issue.
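
For completeness, roughly the commands this amounts to (a sketch; the cluster name, zone, and manifest file name are placeholders):

    gcloud container clusters delete my-cluster --zone <zone>
    gcloud container clusters create my-cluster --zone <zone>
    # Point kubectl at the new cluster, then redeploy.
    gcloud container clusters get-credentials my-cluster --zone <zone>
    kubectl create -f deployment.yaml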