In GKE, I have a pod with two containers. They use the same image; the only difference is that I pass them slightly different flags. One runs fine, the other goes into a crash loop. How can I debug the reason for the failure?
My pod definition is:
apiVersion: v1
kind: ReplicationController
metadata:
  name: doorman-client
spec:
  replicas: 10
  selector:
    app: doorman-client
  template:
    metadata:
      name: doorman-client
      labels:
        app: doorman-client
    spec:
      containers:
      - name: doorman-client-proportional
        resources:
          limits:
            cpu: 10m
        image: gcr.io/google.com/doorman/doorman-client:v0.1.1
        command:
        - client
        - -port=80
        - -count=50
        - -initial_capacity=15
        - -min_capacity=5
        - -max_capacity=2000
        - -increase_chance=0.1
        - -decrease_chance=0.05
        - -step=5
        - -resource=proportional
        - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
        - -vmodule=doorman_client=2
        - --logtostderr
        ports:
        - containerPort: 80
          name: http
      - name: doorman-client-fair
        resources:
          limits:
            cpu: 10m
        image: gcr.io/google.com/doorman/doorman-client:v0.1.1
        command:
        - client
        - -port=80
        - -count=50
        - -initial_capacity=15
        - -min_capacity=5
        - -max_capacity=2000
        - -increase_chance=0.1
        - -decrease_chance=0.05
        - -step=5
        - -resource=fair
        - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
        - -vmodule=doorman_client=2
        - --logtostderr
        ports:
        - containerPort: 80
          name: http
kubectl describe gives me the following:
6:06 [0] (szopa szopa-macbookpro):~/GOPATH/src/github.com/youtube/doorman$ kubectl describe pod doorman-client-tylba
Name: doorman-client-tylba
Namespace: default
Image(s): gcr.io/google.com/doorman/doorman-client:v0.1.1,gcr.io/google.com/doorman/doorman-client:v0.1.1
Node: gke-doorman-loadtest-d75f7d0f-node-k9g6/10.240.0.4
Start Time: Sun, 21 Feb 2016 16:05:42 +0100
Labels: app=doorman-client
Status: Running
Reason:
Message:
IP: 10.128.4.182
Replication Controllers: doorman-client (10/10 replicas created)
Containers:
  doorman-client-proportional:
    Container ID: docker://0bdcb8269c5d15a4f99ccc0b0ee04bf3e9fd0db9fd23e9c0661e06564e9105f7
    Image: gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID: docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu: Guaranteed
    Limits:
      cpu: 10m
    Requests:
      cpu: 10m
    State: Running
      Started: Sun, 21 Feb 2016 16:05:42 +0100
    Ready: True
    Restart Count: 0
    Environment Variables:
  doorman-client-fair:
    Container ID: docker://92fea92f1307b943d0ea714441417d4186c5ac6a17798650952ea726d18dba68
    Image: gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID: docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu: Guaranteed
    Limits:
      cpu: 10m
    Requests:
      cpu: 10m
    State: Running
      Started: Sun, 21 Feb 2016 16:06:03 +0100
    Last Termination State: Terminated
      Reason: Error
      Exit Code: 0
      Started: Sun, 21 Feb 2016 16:05:43 +0100
      Finished: Sun, 21 Feb 2016 16:05:44 +0100
    Ready: False
    Restart Count: 2
    Environment Variables:
Conditions:
  Type Status
  Ready False
Volumes:
  default-token-ihani:
    Type: Secret (a secret that should populate this volume)
    SecretName: default-token-ihani
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
29s 29s 1 {scheduler } Scheduled Successfully assigned doorman-client-tylba to gke-doorman-loadtest-d75f7d0f-node-k9g6
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Pulled Container image "gcr.io/google_containers/pause:0.8.0" already present on machine
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Created Created with docker id 5013851c67d9
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Started Started with docker id 5013851c67d9
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-proportional} Created Created with docker id 0bdcb8269c5d
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-proportional} Started Started with docker id 0bdcb8269c5d
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id ed0928176958
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id ed0928176958
28s 28s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id 0a73290085b6
28s 28s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id 0a73290085b6
18s 18s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Backoff Back-off restarting failed docker container
8s 8s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id 92fea92f1307
29s 8s 4 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Pulled Container image "gcr.io/google.com/doorman/doorman-client:v0.1.1" already present on machine
8s 8s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id 92fea92f1307
As you can see, the last termination state of the failing container shows exit code 0 with the reason "Error", which is not very helpful.
I tried:
changing the order of the container definitions (the first one always runs, the second one always fails);
changing the ports used so that they differ (no effect);
changing the names of the ports so that they differ (no effect).