GAE: What's the difference between <min-pending-latency> and <max-pending-latency>?

Posted 2019-04-29 01:14

As far as I can tell from the docs, both settings do the same thing: start a new instance when a request has spent longer in the pending queue than the setting allows.

<max-pending-latency> The maximum amount of time that App Engine should allow a request to wait in the pending queue before starting a new instance to handle it. Default: "30ms".

  • A low maximum means App Engine will start new instances sooner for pending requests, improving performance but raising running costs.
  • A high maximum means users might wait longer for their requests to be served, if there are pending requests and no idle instances to serve them, but your application will cost less to run.

<min-pending-latency> The minimum amount of time that App Engine should allow a request to wait in the pending queue before starting a new instance to handle it.

  • A low minimum means requests must spend less time in the pending queue when all existing instances are active. This improves performance but increases the cost of running your application.
  • A high minimum means requests will remain pending longer if all existing instances are active. This lowers running costs but increases the time users must wait for their requests to be served.

Source: https://cloud.google.com/appengine/docs/java/config/appref

What's the difference between min and max then?

1 answer
别忘想泡老子
#2 · 2019-04-29 02:07

The piece of information you might be missing to understand these settings is that App Engine can choose to create an instance at any time between min-pending-latency and max-pending-latency.

This means an instance will never be created to serve a pending request before min-pending-latency has elapsed, and one will always be created once max-pending-latency has been reached.

I believe the best way to understand this is to look at the timeline of events when a request enters the pending queue:

  1. A request reaches the application but no instance is available to serve it, so it is placed in the pending requests queue.
  2. Until the min-pending-latency is reached: App Engine tries to find an available instance to serve the request and will not create a new instance.
  3. After the min-pending-latency is reached and until max-pending-latency is reached: App Engine tries to find an available instance to serve the request and can choose to create a new instance.
  4. After the max-pending-latency is reached: App Engine stops searching for an available instance to serve the request and creates a new instance.
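
The timeline above can be sketched as a small decision function. This is only an illustrative model of the three windows, not App Engine's actual scheduler: the names `Action` and `scheduler_action` are hypothetical, and the handling of the exact boundary values is an assumption.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the three windows in the timeline."""
    WAIT_FOR_EXISTING = "keep looking for an available instance only"
    MAY_START_NEW = "keep looking, and may create a new instance"
    START_NEW = "create a new instance"


def scheduler_action(waited_ms: float, min_pending_ms: float, max_pending_ms: float) -> Action:
    """Return what the scheduler is allowed to do for a request that has
    spent `waited_ms` in the pending queue.

    Assumption: the min boundary opens the middle window and the max
    boundary closes it; the real scheduler may treat the edges differently.
    """
    if min_pending_ms > max_pending_ms:
        raise ValueError("min-pending-latency must not exceed max-pending-latency")
    if waited_ms < min_pending_ms:
        # Step 2: before min-pending-latency, no new instance is created.
        return Action.WAIT_FOR_EXISTING
    if waited_ms < max_pending_ms:
        # Step 3: between min and max, creating an instance is optional.
        return Action.MAY_START_NEW
    # Step 4: once max-pending-latency is reached, an instance is created.
    return Action.START_NEW
```

For example, with made-up settings of 30 ms and 100 ms, `scheduler_action(50, 30, 100)` falls in the middle window, where App Engine is free to either keep waiting for an existing instance or start a new one.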