Google App Engine health checks spamming app

Posted 2019-01-11 06:28

Question:

I've deployed a Node.js app on the Google App Engine Flex runtime using the following app.yaml configuration:

runtime: nodejs
env: flex
health_check:
  enable_health_check: True
  check_interval_sec: 20
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2

According to the health check documentation, the health checks should hit the /_ah/health endpoint every 20 seconds. However, I noticed that my app is getting spammed with these health checks multiple times per second, even though the app responds with a 200 status code.

Any idea why this is happening?

Answer 1:

Unfortunately, it does seem that we have a bug in our docs. Today, apps do indeed get health checked quite frequently.

There are several reasons, but in general each VM is hit by 3 * 2 different health checks at the interval you specify (by default, a very aggressive 1 second). That is because there are 2 types of health check (autohealer and load balancer), and 3 of each run for availability reasons.
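To make the arithmetic concrete, here is a minimal sketch of the per-VM request rate implied by the 3 * 2 figure above. The function and constant names are hypothetical, and the model assumes every checker fires exactly once per interval, which is a simplification.

```javascript
// Illustrative estimate only: names are hypothetical, and the model assumes
// each checker sends exactly one request per interval.
const CHECK_TYPES = 2;        // autohealer and load balancer
const CHECKERS_PER_TYPE = 3;  // three of each, for availability

// Expected health-check requests per second reaching a single VM.
function expectedChecksPerSecond(checkIntervalSec) {
  if (!(checkIntervalSec > 0)) {
    throw new RangeError('checkIntervalSec must be a positive number');
  }
  return (CHECK_TYPES * CHECKERS_PER_TYPE) / checkIntervalSec;
}

console.log(expectedChecksPerSecond(20)); // 0.3 (check_interval_sec: 20)
console.log(expectedChecksPerSecond(1));  // 6   (the 1-second interval mentioned above)
```

Under this model, a 20-second interval should give roughly one request every 3 to 4 seconds per VM. A rate of several per second is what the 1-second interval would produce.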

That being said, we are currently working on a new shape of health checks that will be released pretty soon and should fix this and other problems with the existing health checking behavior (at the very least, it will make the defaults more manageable and give users more tuning options).

Stay tuned!



Answer 2:

I don't have a solution to the root problem. But if the spamming makes it impossible to use the log for its intended purpose, as it does for me, here is a workaround:

  1. Enable the 'Advanced Log Filters' (the tiny down arrow next to the search field in Stackdriver Logging)

  2. Add this to the search query:

    NOT textPayload : (health)



Answer 3:

I also run Node.js in the GAE Flex environment, and health checks were spamming my server log too. The following helped me reduce them:

  1. Although the Google documentation (https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml#health_checks) says the health check config is not required, I set it explicitly anyway to lower the frequency of the health check calls.
  2. Use the "Advanced Log Filter" to hide the health check log entries if they are too distracting.
  3. The Google documentation (https://cloud.google.com/appengine/docs/flexible/nodejs/how-instances-are-managed) says it is not required to implement a handler for health checks, but I implemented one explicitly anyway. I added a handler for the "/_ah/healthcheck" endpoint in the express.js server and placed the route at the top of the app.js file, so health check requests are answered right away. This reduced some of the noise caused by health check requests passing through the rest of the Express app logic.
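Item 3 above could be sketched roughly as follows. This is an assumption-laden illustration, not the answerer's actual code: `healthCheck` and `HEALTH_PATHS` are hypothetical names, and both "/_ah/health" (from the question) and "/_ah/healthcheck" (from this answer) are listed because the thread mentions both paths. The function follows the Express middleware signature and is written so it can be read without the express package installed.

```javascript
// Hypothetical middleware: replies to health-check probes before any other
// app logic runs. HEALTH_PATHS is an assumption; keep only the path your
// probes actually request.
const HEALTH_PATHS = new Set(['/_ah/health', '/_ah/healthcheck']);

function healthCheck(req, res, next) {
  if (req.method === 'GET' && HEALTH_PATHS.has(req.path)) {
    // Short-circuit: respond immediately and skip all later middleware.
    res.status(200).send('ok');
    return;
  }
  // Not a health check: hand the request on to the rest of the app.
  next();
}

// Mounting sketch (needs the express package; the port is a placeholder):
//   const express = require('express');
//   const app = express();
//   app.use(healthCheck); // register first, before loggers and other routes
//   app.listen(8080);
```

Registering the middleware before any request logger means health check requests never reach the application's own logging, though entries written by the platform itself (for example the nginx request log) are unaffected.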


Answer 4:

Use the advanced filter and say "NOT _ah/health".

Removing the nginx.request log will help as well.