Pagination with batch queries?

Posted 2019-06-07 23:36

I am currently requesting 20 entries from the datastore and returning them to the user along with a cursor. If the user asks for more entries, I use the cursor as the new start and request the next 20 entries.

The code looks something like

q := datastore.NewQuery("Item").
    Limit(limit)

if cursor, err := datastore.DecodeCursor(cursor); err == nil {
    q = q.Start(cursor)
}

var is []Item
t := q.Run(c)
for {
    var i Item
    _, err := t.Next(&i)
    if err == datastore.Done {
        break // no more results
    }
    if err != nil {
        // handle the error, e.g. return it to the caller
        break
    }

    is = append(is, i)
}

In case it is important here is the complete code: https://github.com/koffeinsource/kaffeeshare/blob/master/data/appengine.go#L23

Using a loop with append looks like an anti-pattern, but I don't see a way to get a cursor when using GetMulti/GetAll. Am I missing something?

I do expect data to be added while users are querying the datastore, so an offset may produce duplicate results. Should I care about batching gets in this case?

1 Answer
何必那么认真
Answered 2019-06-08 00:34

Your approach is perfectly fine; in fact, it is the best way on App Engine.

Querying subsequent entities by setting a start cursor will not give you duplicate results, even if a new record is inserted that would now come first in the result order.

Why? Because the cursor encodes the key of the last returned entity, not the number of previously returned entities.

So if you set a cursor, the datastore will start listing and returning entities that come after the key encoded in the cursor. If a new entity is saved that comes after the cursor, it will be returned when the iteration reaches it.
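To make this concrete, here is a minimal sketch of the whole round trip: decode an optional incoming cursor, read up to limit entities, then call Cursor() on the iterator and hand its string form back to the client for the next page. The function name listItems and the Item kind are assumptions mirroring the question's code; the import paths assume the google.golang.org/appengine packages.

```go
package pagination

import (
	"golang.org/x/net/context"

	"google.golang.org/appengine/datastore"
)

type Item struct {
	Title string
}

// listItems returns up to limit items, starting at the optionally provided
// encoded cursor, together with an encoded cursor for the next page.
func listItems(c context.Context, startCursor string, limit int) ([]Item, string, error) {
	q := datastore.NewQuery("Item").Limit(limit)
	if cur, err := datastore.DecodeCursor(startCursor); err == nil {
		q = q.Start(cur)
	}

	var is []Item
	t := q.Run(c)
	for {
		var i Item
		_, err := t.Next(&i)
		if err == datastore.Done {
			break
		}
		if err != nil {
			return nil, "", err
		}
		is = append(is, i)
	}

	// The cursor encodes the key of the last entity returned by t.Next,
	// so the next call picks up exactly where this one stopped.
	cur, err := t.Cursor()
	if err != nil {
		return is, "", err
	}
	return is, cur.String(), nil
}
```

The client simply sends back the string from the previous response; if it is empty or invalid, DecodeCursor fails and the query starts from the beginning.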

Using the for loop with append() is also the way to go. You might optimize it a little by creating a big enough slice beforehand:

var is = make([]Item, 0, limit)

But note that I made it with 0 length and limit capacity on purpose: there is no guarantee that there will be enough entities to fill the full slice.

Another optimization would be to allocate it to be limit length:

var is = make([]Item, limit)

and when datastore.Done is reached, reslice it if it is not filled fully, for example:

for idx := 0; ; idx++ {
    var i Item
    _, err := t.Next(&i)
    if err == datastore.Done {
        if idx < len(is) {
            is = is[:idx] // Reslice, as it is not filled fully
        }
        break
    }
    if err != nil {
        // handle the error
        break
    }

    is[idx] = i
}

Batch operations

GetMulti, PutMulti and DeleteMulti are batch versions of the Get, Put and Delete functions. They take a []*Key instead of a *Key, and may return an appengine.MultiError when encountering partial failure.

Batch operations are not a replacement or alternative to queries. GetMulti, for example, requires you to already have all the keys for which you want the complete entities. As such, there is no notion of a cursor for these batch operations.

Batch operations return you all the requested information (or do all the requested operation). There is no sequence of entities or operations which would/could be terminated and continued later on.
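As a sketch of what that looks like in practice: GetMulti takes a slice of keys you already hold and fills a slice of entities in one round trip, and a partial failure surfaces as an appengine.MultiError whose i-th element is the error (or nil) for the i-th key. The function name getItems and the Item type are assumptions for illustration.

```go
package batch

import (
	"golang.org/x/net/context"

	"google.golang.org/appengine"
	"google.golang.org/appengine/datastore"
)

type Item struct {
	Title string
}

// getItems fetches the entities for keys that are already known.
// Note there is no cursor: every key is supplied up front.
func getItems(c context.Context, keys []*datastore.Key) ([]Item, error) {
	is := make([]Item, len(keys))
	err := datastore.GetMulti(c, keys, is)
	if me, ok := err.(appengine.MultiError); ok {
		// Partial failure: me[i] corresponds to keys[i].
		for i, e := range me {
			if e != nil {
				// e.g. log and skip keys[i]; is[i] is not populated
				_ = i
			}
		}
	}
	return is, err
}
```

This is why the two mechanisms don't mix: a query discovers keys in order (which is what a cursor bookmarks), while a batch get resolves keys you discovered some other way.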

Queries and batch operations are for different things. You shouldn't worry about query and cursor performance: they perform quite well, and, importantly, the Datastore scales well. A cursor will not slow down a query; a query with a cursor runs just as fast as one without, and previously returned entities do not affect query execution time. It doesn't matter whether you run a query without a cursor or with a cursor acquired after getting a million entities (which is only possible over several iterations).
