How to get a custom healthcheck path in a GCE L7 b

Posted 2019-04-22 20:44

Question:

I'm trying to deploy a Grafana instance inside Kubernetes (server 1.6.4) on GCE. I'm using the following manifests:

Deployment (full version):

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: grafana
    spec:
      initContainers:
        …
      containers:
        - name: grafana
          image: grafana/grafana
          readinessProbe:
            httpGet:
              path: /login
              port: 3000
          …

Service:

apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  selector:
    name: grafana
  ports:
    - protocol: TCP
      port: 3000
  type: NodePort

Ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
spec:
  tls:
    - secretName: grafana.example.com
  backend:
    serviceName: grafana
    servicePort: 3000

It turns out that Grafana serves a 302 on /, but the default GCE Ingress health check expects a 200 on / (source). As you can see, there is a custom readinessProbe in the Deployment (line 22).

Once I post these resources to the kube-apiserver, everything is created without error. The Ingress gets a public IPv4 address, but the health check is set up with the default / path instead of the custom one configured in the readinessProbe. Because of this, I get a 502 when I curl the public IPv4 address of the Ingress.

The problem is fixable by manually changing the probe path to /login in the GCE console.

Answer 1:

Quoting from here:

The GLBC requires that you define the port (in your case 3000) within the Pod specification.

The solution is to declare the port used for the health check under ports, in addition to adding a custom readinessProbe:

containers:
  - name: grafana
    readinessProbe:
      httpGet:
        path: /login
        port: 3000
    ports:
      - name: grafana
        containerPort: 3000
    …


Answer 2:

Customizing Health Checks

With GLBC addon

It is not quite clear from your question, but if you're using the GCE Load-Balancer Controller (GLBC) Cluster Addon, you can customize the health check path.

Currently, all service backends must satisfy either of the following requirements to pass the HTTP(S) health checks sent to it from the GCE loadbalancer:

  • Respond with a 200 on '/'. The content does not matter.
  • Expose an arbitrary url as a readiness probe on the pods backing the Service.

The Ingress controller looks for a compatible readiness probe first, if it finds one, it adopts it as the GCE loadbalancer's HTTP(S) health check. If there's no readiness probe, or the readiness probe requires special HTTP headers, the Ingress controller points the GCE loadbalancer's HTTP health check at '/'. This is an example of an Ingress that adopts the readiness probe from the endpoints as its health check.
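To make the "special HTTP headers" case concrete, here is a sketch of a readiness probe that, going by the rule quoted above, the controller would presumably not adopt, so the health check would stay on '/'. The header name and value are hypothetical and only serve as an illustration; the rest mirrors the container from the question.

```yaml
# Sketch only: the header name and value below are made up for illustration.
containers:
  - name: grafana
    image: grafana/grafana
    ports:
      - containerPort: 3000
    readinessProbe:
      httpGet:
        path: /login
        port: 3000
        httpHeaders:                  # custom headers: the case described above as not adoptable
          - name: X-Example-Header    # hypothetical header
            value: example
```

Removing the httpHeaders block leaves a plain HTTP GET probe on a declared container port, which is the form the quoted text describes as compatible.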

The GLBC addon page mentions this in the Limitations section:

All Kubernetes services must serve a 200 page on '/', or whatever custom value you've specified through GLBC's --health-check-path argument.
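As a rough sketch of how that argument might be passed, the fragment below adds it to the args of the GLBC controller container. Only the --health-check-path flag comes from the quoted text; the container name is an assumption, and the rest of the controller's pod spec is left out. Going by the wording of the limitation, the value would apply to all Services behind the controller, not just to one.

```yaml
# Hypothetical excerpt from the GLBC controller's pod spec.
# Only the --health-check-path flag is taken from the quoted limitation.
containers:
  - name: l7-lb-controller            # assumed container name
    args:
      - --health-check-path=/login    # every Service would then need to return 200 on /login
```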

Without GLBC addon

If you're not using the addon, Kubernetes currently requires your backend to return a 200 for GET requests on the / path for health checks to succeed; otherwise the backend will not get any traffic.

There is a bit of background about this in this bug.

Google Container Engine (GKE)

If you're on Google Container Engine (GKE), the same default Kubernetes requirements for health checks apply there too.

Services exposed through an Ingress must serve a response with HTTP 200 status to the GET requests on / path. This is used for health checking. If your application does not serve HTTP 200 on /, the backend will be marked unhealthy and will not get traffic.

Answer to your real issue

Having said all of this, as you (@mmoya) point out in your answer, adding the port used by the readiness probe to the ports declared in the pod fixes the issue in your case, since the port is otherwise not exposed outside the pod. Because the port was not declared, Kubernetes relied on the default health check on / instead.