So I've been using app engine for quite some time now with no issues. I'm aware that if the app hasn't been hit by a visitor for a while then the instance will shut down, and the first visitor to hit the site will have a few second delay while a new instance fires up.
However, recently it seems that the instances only stay alive for a very short period of time (sometimes less than a minute), and if I have 1 instance already up and running, and I refresh an app webpage, it still fires up another instance (and the page it starts is minimal homepage HTML, shouldn't require much CPU/memory). Looking at my logs its constantly starting up new instances, which was never the case previously.
Any tips on what I should be looking at, or any ideas of why this is happening?
Also, I'm using Python 2.7, threadsafe, python_precompiled, warmup inbound services, NDB.
Update:
So I changed my app to have at least 1 idle instance, hoping that this would solve the problem, but it is still firing up new instances even though one resident instance is already running. So when there is just the 1 resident instance (and I'm not getting any traffic except me), and I go to another page on my app, it is still starting up a new instance.
Additionally, I changed the Pending Latency to 1.5s as koma pointed out, but that doesn't seem to be helping.
The memory usage of the instances is always around 53MB, which is surprising when the pages being called aren't doing much. I'm using the F1 Frontend Instance Class and that has a limit of 128, but either way 53MB seems high for what it should be doing. Is that an acceptable size when it first starts up?
Update 2: I just noticed in the dashboard that in the last 14 hours, the request /_ah/warmup responded with 24 404 errors. Could this be related? Why would they be responding with a 404 response status?
Main question: Why would it constantly be starting up new instances (even with no traffic)? Especially where there are already existing instances, and why do they shut down so quickly?
My solution to this was to increase the Pending Latency time.
If a webpage fires 3 ajax requests at once, AppEngine was launching new instances for the additional requests. After configuring the Minimum Pending Latency time - setting it to 2.5 secs, the same instance was processing all three requests and throughput was acceptable.
My project still has little load/traffic... so in addition to raising the Pending Latency, I openend an account at Pingdom and configured it to ping my Appengine project every minute.
The combination of both, makes that I have one instance that stays alive and is serving up all requests most of the time. It will scale to new instances when really necessary.
I only started having this type of issue on Monday February 4 around 10 pm EST, and is continuing until now. I first started noticing then that instances kept firing up and shutting down, and latency increased dramatically. It seemed that the instance scheduler was turning off idle instances too rapidly, and causing subsequent thrashing.
I set minimum idle instances to 1 to stabilize latency, which worked. However, there is still thrashing of new instances. I tried the recommendations in this thread to only set minimum pending latency, but that does not help. Ultimately, idle instances are being turned off too quickly. Then when they're needed, the latency shoots up while trying to fire up new instances.
I'm not sure why you saw this a couple weeks ago, and it only started for me a couple days ago. Maybe they phased in their new instance scheduler to customers gradually? Are you not still seeing instances shutting down quickly?
1 idle instance means that app-engine will always fire up an extra instance for the next user that comes along - that's why you are seeing an extra instance fired up with that setting.
If you remove the idle instance setting (or use the default) and just increase pending latency it should "wait" before firing the extra instance.
With regards to the main question I think @koma might be onto something in saying that with default settings app-engine will tend to fire extra instances even if the requests are coming from the same session.
In my experience app-engine is great under heavy traffic but difficult (and sometimes frustrating) to work with under low traffic conditions. In particular it is very difficult to figure out the nuances of what the criteria for firing up new instances actually are.
Personally, I have a "wake-up" cron-job to bring up an instance every couple of minutes to make sure that if someone comes on the site an instance is ready to serve. This is not ideal because it will eat at my quote, but it works most of the time because traffic on my app is reasonably high.