Using fewer Datastore Small Operations on App Engine

Posted 2020-02-29 23:03

I'm putting together a basic photo album on App Engine using Python 2.7. I have written the following method to retrieve image details from the datastore for a particular "adventure". I'm using limits and offsets for pagination, but it is very inefficient: after browsing 5 pages (5 photos per page) I've already used 16% of my Datastore Small Operations quota. Interestingly, I've only used 1% of my Datastore Read Operations. How can I make this more efficient in terms of Datastore Small Operations? I'm not sure what these consist of.

def grab_images(adventure, the_offset=0, the_limit = 10):
    logging.info("grab_images")
    the_photos = None
    the_photos = PhotosModel.all().filter("adventure =", adventure)
    total_number_of_photos = the_photos.count()
    all_photos = the_photos.fetch(limit = the_limit, offset = the_offset)
    total_number_of_pages = total_number_of_photos / the_limit
    all_photo_keys = []
    for photo in all_photos:
        all_photo_keys.append(str(photo.blob_key.key()))
    return all_photo_keys, total_number_of_photos, total_number_of_pages

3 answers
淡お忘
#2 · 2020-02-29 23:21

The way you handle paging is inefficient because it has to go through every record before the offset to deliver the data. You should consider building the paging mechanism using the bookmark method described by Google (http://code.google.com/appengine/articles/paging.html).

Using this method you only go through the items you need for each page. I also urge you to cache properly, as Shay suggests; it's both faster and cheaper.
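To make the difference concrete, here is a minimal sketch in plain Python. It does not touch the datastore: PHOTO_KEYS is a hypothetical in-memory stand-in for a sorted index, and both helper functions are illustrative names, not App Engine APIs. On the real datastore the bookmark would be an indexed property value as in the article, or a query cursor (the db API offers cursor() and with_cursor() on queries).

```python
import bisect

# Hypothetical in-memory stand-in for an index sorted by key; not the datastore API.
PHOTO_KEYS = ["photo-%03d" % i for i in range(50)]


def page_by_offset(keys, offset, limit):
    """Offset paging: the skipped entries are still walked before the page starts."""
    touched = min(len(keys), offset + limit)
    return keys[offset:offset + limit], touched


def page_by_bookmark(keys, bookmark, limit):
    """Bookmark paging: seek straight to the first key after the bookmark."""
    start = 0 if bookmark is None else bisect.bisect_right(keys, bookmark)
    page = keys[start:start + limit]
    next_bookmark = page[-1] if page else None
    return page, next_bookmark, len(page)


# Fetch the fifth page (5 photos per page) both ways.
offset_page, offset_touched = page_by_offset(PHOTO_KEYS, 20, 5)

bookmark = None
bookmark_touched = 0
for _ in range(5):
    bookmark_page, bookmark, touched = page_by_bookmark(PHOTO_KEYS, bookmark, 5)
    bookmark_touched = touched  # entries walked by the last request only
```

Both approaches return the same five keys, but in this model the offset request for page five walks 25 index entries while the bookmark request walks 5, and the gap grows with every further page.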

Explosion°爆炸
#3 · 2020-02-29 23:35

You may want to consider moving to the new NDB API. Its use of futures, caches and autobatching may help you a lot. Explicit is better than implicit, but NDB's management of the details makes your code simpler and more readable.

BTW, have you tried Appstats to see how your requests are using the server resources?

劳资没心,怎么记你
#4 · 2020-02-29 23:41

A few things:

  1. You don't need to call count() each time; you can cache the result.
  2. The same goes for the query: why are you querying all the time? Cache it as well.
  3. Cache the pages too; you should not recompute the data for a page each time.
  4. You only need the blob_key, but you are loading the entire photo entity. Try to model it in a way that you won't need to load all the Photo attributes.

nitpicking: you don't need the_photos = None
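
Point 1 can be sketched roughly as follows. This is plain Python under stated assumptions: the _cache dict is a hypothetical stand-in for memcache (no expiry, no eviction), and cached_count, number_of_pages and fake_count are illustrative names that do not appear in the question. On App Engine the dict lookups would become memcache.get() and memcache.set() calls, and the cached value would need to be invalidated or refreshed when photos are added.

```python
# Hypothetical stand-in for memcache: a plain dict with no expiry or eviction.
_cache = {}


def cached_count(adventure, count_fn):
    """Return the photo count for an adventure, computing it only on a cache miss."""
    cache_key = "photo_count:%s" % adventure
    if cache_key not in _cache:
        _cache[cache_key] = count_fn(adventure)
    return _cache[cache_key]


def number_of_pages(total, per_page):
    """Ceiling division, so a partly filled last page is still counted."""
    return (total + per_page - 1) // per_page


# Demo with a fake counter that records how often the expensive count runs.
calls = []


def fake_count(adventure):
    calls.append(adventure)
    return 23


totals = [cached_count("iceland", fake_count) for _ in range(5)]
pages = number_of_pages(totals[0], 5)
```

Five page views trigger a single count here instead of five. The page helper is a side note: the integer division in the question would report 4 pages for 23 photos at 5 per page, while rounding up gives 5.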
