We have 4 instances in Google App Engine, but only one of them is handling most of the requests. How can we scale so that all instances handle an equal number of requests?
I would also ask whether you're using resident instances or dynamic instances.
For example, if you have configured scaling overrides in your application's YAML file, you may see some instances just "sitting there". Resident instances are always on and can absorb peak or overflow traffic, but they may not be serving traffic at any given moment.
Evenly balancing the load across the running instances doesn't actually mean scaling. As long as one instance is capable of handling the incoming requests with acceptable performance, you're not looking at a scaling issue.
If you're using automatic or basic scaling (which you should, if you're concerned with scalability), the uneven load spread across the running instances can actually be essential for controlling on-demand instance spinup (when load exceeds a certain threshold) and shutdown (when instances are idling).
For example, if a load that 1-2 instances could easily handle were evenly distributed across 4 running instances, none of the 4 instances would be idle long enough to be shut down.
Having a single instance as the "preferred" one to run traffic on, with the others just picking up overflow or peak load, makes the algorithm for controlling instance spinup and shutdown a lot simpler (and, I think, more precise as well): the threshold comparison logic only needs to be applied to one (or just a few) running instances, not to all of them.
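To make the idle-time argument concrete, here is a small toy simulation. It is only a sketch and not App Engine's actual scheduler: the function name `idle_ticks`, the policy names, the per-instance capacity of 10 requests per tick, and the load figures are all made up for illustration. It counts, for each instance, how many time ticks pass with no requests under an even spread versus a "fill the preferred instance first" spread.

```python
from typing import List


def idle_ticks(load_per_tick: List[int], instances: int,
               capacity: int, policy: str) -> List[int]:
    """Return, per instance, the number of ticks in which it got no requests.

    Hypothetical toy model: `capacity` is the number of requests one
    instance can serve per tick. Only the "fill_first" policy uses it.
    """
    idle = [0] * instances
    for load in load_per_tick:
        assigned = [0] * instances
        if policy == "even":
            # Spread requests one at a time across all instances.
            for i in range(load):
                assigned[i % instances] += 1
        elif policy == "fill_first":
            # Saturate the preferred instance, then overflow to the next.
            # Load beyond the total capacity is simply dropped in this toy.
            remaining = load
            for i in range(instances):
                take = min(capacity, remaining)
                assigned[i] = take
                remaining -= take
        else:
            raise ValueError(f"unknown policy: {policy}")
        for i, count in enumerate(assigned):
            if count == 0:
                idle[i] += 1
    return idle


if __name__ == "__main__":
    # A load that one or two instances (capacity 10 each) can handle.
    load = [12, 15, 9, 14, 11]
    print(idle_ticks(load, 4, 10, "even"))        # [0, 0, 0, 0]
    print(idle_ticks(load, 4, 10, "fill_first"))  # [0, 1, 5, 5]
```

With the even spread no instance is ever idle, so an idle-timeout rule would never find one to shut down. With the fill-first spread the last two instances are idle on every tick and are obvious shutdown candidates, while a spinup decision only needs to look at how full the first instance or two are.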