Google Datastore - What happens when you exceed th

Posted 2019-07-30 00:29

I'm trying to create about 100,000 new entities (representing users) that all have the same parent. I read that there is a limit of one entity write per second per entity group. I thought the request might time out, so I decided to use a Push Queue task to extend the time I had to ten minutes. I tried calling put() in a for loop inside the task, but it still timed out (it only got to write about 8,900 entities).

I'm confused as to why I didn't get an error, since I was doing many writes to the same entity group. The task timed out at 10 minutes, which means I got about 890 writes per minute, or roughly 15 writes per second. This is way over one write per second. I read the answers to "Google App Engine HRD - what if I exceed the 1 write per second limit for writing to the entity group?" and "Google Datastore - Not Seeing 1 Write per Second per Entity Group Limitation", but to my understanding they just say that it's possible for the Datastore to write 5-10 entities per second. The rate I got was higher than that, though.
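
As a quick check of the arithmetic above, here is a minimal sketch that uses only the figures quoted in the question (about 8,900 entities written before the 10-minute task deadline). The variable names are illustrative.

```python
# Figures taken from the question: ~8,900 entities written in a 10-minute task.
entities_written = 8900
task_minutes = 10

writes_per_minute = entities_written / task_minutes
writes_per_second = entities_written / (task_minutes * 60)

print(f"{writes_per_minute:.0f} writes/minute, {writes_per_second:.2f} writes/second")
```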

I also read here that:

"Datastore contention occurs when a single entity or entity group is updated too rapidly. The datastore will queue concurrent requests to wait their turn. Requests waiting in the queue past the timeout period will throw a concurrency exception."

Does this mean that an error won't be thrown for trying to exceed 1 write/sec? Will the writes just get placed in a queue, so that I only get an error when the request times out (in this case, after 10 minutes for the Task Queue)?

1 answer
Animai°情兽
#2 · 2019-07-30 00:47
  1. You are bumping into the 10-minute limit on tasks that run on instances with automatic scaling. You can split your 100,000 users into smaller batches and process each batch in a separate task.

  2. You can use batch calls to the datastore, saving up to 500 entities in a single call, which is much faster than saving each entity individually.

  3. There is absolutely no reason to have all users in the same entity group. This data model will have negative performance implications; the write limit is there for a reason. Entity groups are designed for something like a user with 3 addresses or 10 photo albums, and even then I almost always avoid parent-child relationships, as they rarely add any value but make the code more complex (you always have to know the parent to retrieve or save an entity).
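
Points 1 and 2 can be sketched together as follows. This is a rough illustration in plain Python rather than the answerer's own code: `chunked`, `plan_tasks`, `run_task`, and `BATCHES_PER_TASK` are hypothetical names, and `put_multi` is passed in as a parameter so the sketch runs without any Datastore client. In a Python App Engine app using NDB, `ndb.put_multi` would be a natural candidate for that parameter, and each planned task would be enqueued on the push queue instead of being run inline.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")

BATCH_SIZE = 500        # per-call limit cited in point 2 above
BATCHES_PER_TASK = 20   # hypothetical choice: 10,000 entities per task


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def plan_tasks(users: Sequence[T]) -> List[List[List[T]]]:
    """Split users into batches, then group the batches into tasks."""
    batches = list(chunked(users, BATCH_SIZE))
    return list(chunked(batches, BATCHES_PER_TASK))


def run_task(batches: Sequence[List[T]],
             put_multi: Callable[[List[T]], object]) -> int:
    """Body of one task: one batch call per batch instead of one put() per entity."""
    written = 0
    for batch in batches:
        put_multi(batch)  # assumed to save the whole batch in a single call
        written += len(batch)
    return written
```

With 100,000 users this plan yields 200 batches spread over 10 tasks, so each task issues 20 batch calls instead of 10,000 individual writes. Following point 3, the entities passed in would ideally be created without a shared parent, so that the writes are not all aimed at a single entity group.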
