Set operations in App Engine datastore

Published 2019-05-15 02:33

Question:

I assume there's no good way to do so, but if you had to, how would you implement set operations on App Engine's datastore?

For example, given two collections of integers, how would you store them in the datastore to get good performance for intersect and except (all items in A that are not in B) operations?

Answer 1:

There aren't any built-in set operations in the datastore API. I see two options:

  1. For smallish sets (hundreds of items): you might get away with doing keys-only queries for both set A and set B, and computing the intersection in your application code. The precise definition of "smallish" will depend on your application.

  2. For largish sets (millions of items): if you know ahead of time which intersections you'll want, you can calculate them each time you insert a new record. For example, assume you have two sets A and B, and you know that you'll eventually want to query on (A intersects B). Whenever you insert an A, check to see if it is already in B. If it is, record this fact somewhere (either in a separate entity type, or as a boolean property on A or B itself). Of course, you'll need to do the same whenever you insert a B.

With option 1, you can have lots of different sets, but are limited by how big each set is.
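As a rough sketch of option 1, the snippet below does the set work in application code. It does not call the real datastore API: `store` is a plain dict and `fetch_keys` is a hypothetical stand-in for a keys-only query, so treat the names as assumptions rather than anything the datastore provides.

```python
# Sketch of option 1: fetch only the keys of each set, then combine in memory.
# `store` and `fetch_keys` are hypothetical stand-ins for the datastore and
# a keys-only query; swap in your own query code.

def fetch_keys(store, kind):
    """Return the keys stored under `kind` as a Python set."""
    return set(store.get(kind, ()))

def intersect(store, kind_a, kind_b):
    """Items present in both sets."""
    return fetch_keys(store, kind_a) & fetch_keys(store, kind_b)

def except_(store, kind_a, kind_b):
    """Items in the first set that are not in the second."""
    return fetch_keys(store, kind_a) - fetch_keys(store, kind_b)

store = {"A": [1, 2, 3, 4], "B": [3, 4, 5]}
both = intersect(store, "A", "B")
only_a = except_(store, "A", "B")
```

Both sets have to fit in memory and be fetched on every request, which is why this only suits smallish sets.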

With option 2, you can have sets with millions of members, but if you have more than a few sets, trying to define all the possible combinations of sets and operators in advance will get unwieldy.
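Option 2 could look something like the following minimal sketch. It keeps everything in memory for illustration; `SetStore`, `insert`, and `in_both` are hypothetical names, and in a real application the membership check would be a lookup by key and `in_both` would be a separate entity type or a boolean property, as described above.

```python
# Sketch of option 2: maintain the A/B intersection at insert time.
# In-memory only; all names here are illustrative assumptions.

class SetStore:
    def __init__(self):
        self.members = {"A": set(), "B": set()}
        self.in_both = set()  # precomputed intersection of A and B

    def insert(self, set_name, value):
        """Add `value` to one set and record it if the other set has it too."""
        other = "B" if set_name == "A" else "A"
        self.members[set_name].add(value)
        if value in self.members[other]:
            self.in_both.add(value)

    def a_except_b(self):
        """Items in A that were never flagged as also being in B."""
        return self.members["A"] - self.in_both

s = SetStore()
for v in (1, 2, 3):
    s.insert("A", v)
for v in (2, 3, 4):
    s.insert("B", v)
```

The sketch ignores deletions; removing an item from either set would also need to clear its entry in `in_both`.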