I run helm upgrade --install to modify the state of my Kubernetes cluster, and I sometimes get an error like this:
22:24:34 StdErr: E0126 17:24:28.472048 48084 portforward.go:178] lost connection to pod
22:24:34 Error: UPGRADE FAILED: transport is closing
I am not the only one, and it happens with many different helm commands. All of these GitHub issues have descriptions or comments mentioning "lost connection to pod" or "transport is closing" errors (usually both):
- https://github.com/kubernetes/helm/issues/1183
- https://github.com/kubernetes/helm/issues/2003
- https://github.com/kubernetes/helm/issues/2025
- https://github.com/kubernetes/helm/issues/2288
- https://github.com/kubernetes/helm/issues/2560
- https://github.com/kubernetes/helm/issues/3015
- https://github.com/kubernetes/helm/issues/3409
While it can be educational to read through hundreds of GitHub issue comments, it's usually faster to cut to the chase on Stack Overflow, and this question didn't seem to exist yet, so here it is. Hopefully the answers end up with some quick symptom fixes and, eventually, one or more root-cause diagnoses.
I was able to correct this by adding the tiller host information to the helm install command.
--host=10.111.221.14:44134
You can get your tiller IP and port this way:
$ kubectl get svc -n kube-system tiller-deploy
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tiller-deploy ClusterIP 10.111.221.14 <none> 44134/TCP 34h
Full command example:
helm install stable/grafana --name=grafana --host=10.111.221.14:44134
I know this is a bit of a workaround, but all other helm functions worked properly after installing this way. I did not have to add the host information again for upgrades or rollbacks after the initial install. Hope this helps!
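To avoid copying the IP and port by hand, the lookup above could be scripted. This is only a sketch: it assumes Helm v2 with tiller exposed as the tiller-deploy service, and tiller_host is a hypothetical helper name, not part of helm or kubectl.

```shell
# Sketch: print "<ClusterIP>:<port>" for the tiller-deploy service.
# Assumption: tiller runs in kube-system unless another namespace is passed as $1.
tiller_host() {
  ns="${1:-kube-system}"
  # Read the service's cluster IP and its first port via jsonpath.
  ip="$(kubectl get svc -n "$ns" tiller-deploy -o jsonpath='{.spec.clusterIP}')" || return 1
  port="$(kubectl get svc -n "$ns" tiller-deploy -o jsonpath='{.spec.ports[0].port}')" || return 1
  # Fail if either lookup came back empty.
  [ -n "$ip" ] && [ -n "$port" ] || return 1
  printf '%s:%s\n' "$ip" "$port"
}

# Usage (not executed here):
#   helm install stable/grafana --name=grafana --host="$(tiller_host)"
```

With the service shown above, this should produce the same address as the one typed manually.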
Memory limits were causing this error for me. The following fixed it:
kubectl set resources deployment tiller-deploy -n kube-system --limits=memory=200Mi
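Before changing the limit, it may be worth checking whether tiller was actually being killed for exceeding its memory. A rough sketch, assuming the tiller pods carry the labels app=helm,name=tiller (verify this on your cluster) and live in kube-system; tiller_last_exit_reason is a hypothetical helper name.

```shell
# Sketch: print why tiller's container(s) last terminated, if they did.
# Assumption: pod labels app=helm,name=tiller in the kube-system namespace.
tiller_last_exit_reason() {
  kubectl get pods -n kube-system -l app=helm,name=tiller \
    -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}'
}

# Usage (not executed here):
#   tiller_last_exit_reason
```

An output of OOMKilled would suggest the memory limit is the culprit; an empty output means no previous termination was recorded for the current pods.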
Deleting the tiller deployment and recreating it is the only fix I've seen on GitHub (here and here). It has helped most when the same helm command fails repeatedly; it is less clearly useful for intermittent failures, though you could still try it.
Delete tiller (helm's server-side component):
kubectl delete deployment -n kube-system tiller-deploy
# deployment "tiller-deploy" deleted
and recreate it:
helm init --upgrade
# $HELM_HOME has been configured at /root/.helm.
# Tiller (the helm server side component) has been upgraded to the current version.
# Happy Helming!
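The two steps above could be wrapped in one function that also waits for the new tiller deployment to become ready before helm is used again. This is a sketch under the same assumptions as the commands above (Helm v2, tiller in kube-system); bounce_tiller is a hypothetical name, and the rollout wait is an addition not mentioned in the GitHub threads.

```shell
# Sketch: delete the tiller deployment, recreate it, then wait for it to be ready.
bounce_tiller() {
  # Remove the existing tiller deployment.
  kubectl delete deployment -n kube-system tiller-deploy || return 1
  # Recreate tiller using the local helm client's version.
  helm init --upgrade || return 1
  # Block until the new deployment has finished rolling out.
  kubectl rollout status -n kube-system deployment/tiller-deploy
}

# Usage (not executed here):
#   bounce_tiller && helm upgrade --install ...
```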
Bouncing tiller obviously won't fix the root cause. Hopefully a better answer is forthcoming, maybe from https://github.com/kubernetes/helm/issues/2025, which is the only open GitHub issue as of 13 Feb 2018.