How to setup error reporting in Stackdriver from k

2020-03-11 06:59发布

问题:

I'm a bit confused at how to setup error reporting in kubernetes, so errors are visible in Google Cloud Console / Stackdriver "Error Reporting"?

According to documentation https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine we need to enable fluentd' "forward input plugin" and then send exception data from our apps. I think this approach would have worked if we had setup fluentd ourselves, but it's already pre-installed on every node in a pod that just runs gcr.io/google_containers/fluentd-gcp docker image.

How do we enable forward input on those pods and make sure that http port available to every pod on the nodes? We also need to make sure this config is used by default when we add more nodes to our cluster.

Any help would be appreciated, may be I'm looking at all this from a wrong point?

回答1:

The basic idea is to start a separate pod that receives structured logs over TCP and forwards it to Cloud Logging, similar to a locally-running fluentd agent. See below for the steps I used.

(Unfortunately, the logging support that is built into Docker and Kubernetes cannot be used - it just forwards individual lines of text from stdout/stderr as separate log entries which prevents Error Reporting from seeing complete stack traces.)

Create a docker image for a fluentd forwarder using a Dockerfile as follows:

FROM gcr.io/google_containers/fluentd-gcp:1.18

COPY fluentd-forwarder.conf /etc/google-fluentd/google-fluentd.conf

Where fluentd-forwarder.conf contains the following:

<source>
  type forward
  port 24224
</source>

<match **>
  type google_cloud
  buffer_chunk_limit 2M
  buffer_queue_limit 24
  flush_interval 5s
  max_retry_wait 30
  disable_retry_limit
</match>

Then build and push the image:

$ docker build -t gcr.io/###your project id###/fluentd-forwarder:v1 .
$ gcloud docker push gcr.io/###your project id###/fluentd-forwarder:v1

You need a replication controller (fluentd-forwarder-controller.yaml):

apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/###your project id###/fluentd-forwarder:v1
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224

You also need a service (fluentd-forwarder-service.yaml):

apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224

Then create the replication controller and service:

$ kubectl create -f fluentd-forwarder-controller.yaml
$ kubectl create -f fluentd-forwarder-service.yaml

Finally, in your application, instead of using 'localhost' and 24224 to connect to the fluentd agent as described on https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine, use the values of evironment variables FLUENTD_FORWARDER_SERVICE_HOST and FLUENTD_FORWARDER_SERVICE_PORT.



回答2:

To add to Boris' answer: As long as errors are logged in the right format (see https://cloud.google.com/error-reporting/docs/troubleshooting) and Cloud Logging is enabled (you can see the errors in https://console.cloud.google.com/logs/viewer) then errors will make it to Error Reporting without any further setup.



回答3:

Boris' answer was great but was a lot more complicated then it really needed to be (no need to build a docker image). If you have kubectl configured on your local box (or you can use the Google Cloud Shell), copy and paste the following and it will install the forwarder in your cluster (I updated the version of fluent-gcp from the above answer). My solution uses a ConfigMap to store the file so it can be changed easily without rebuilding.

cat << EOF | kubectl create -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-forwarder
data:
  google-fluentd.conf: |+
    <source>
      type forward
      port 24224
    </source>

    <match **>
      type google_cloud
      buffer_chunk_limit 2M
      buffer_queue_limit 24
      flush_interval 5s
      max_retry_wait 30
      disable_retry_limit
    </match>

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/google_containers/fluentd-gcp:2.0.18
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224
        volumeMounts:
        - name: config-vol
          mountPath: /etc/google-fluentd
      volumes:
        - name: config-vol
          configMap:
            name: fluentd-forwarder
---
apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224
EOF