How to do paging with simpledb?

2019-01-18 12:35发布

I know how to page forward with SimpleDB data by using NextToken. However, how exactly does one handle previous pages? I'm on .NET, but I don't think that matters. I'm more interested in the general strategy.

Mike Culver's An Introduction to Amazon SimpleDB webinar mentions that breadcrumbs are used, but he doesn't implement them in the video.

EDIT: The video mentions a sample project which implements backwards paging, but the video ends before the URL for the download can be displayed. The one sample project I found didn't deal with paging.

4条回答
ゆ 、 Hurt°
2楼-- · 2019-01-18 13:10

When going to the next page you may be able to simplify the use case by only allowing a "next page" and not arbitrary paging. You can do this in SimpleDB by using the LIMIT clause:

SELECT title, summary, votecount FROM posts WHERE userid = '000022656' LIMIT 25

You already know how to handle the NextToken, but if you use this tactic, you can support "previous page" by storing the breadcrumb trail of next tokens (e.g. in the web session) and re-issuing the query with a previous NextToken rather than a subsequent one.

However, the general case for handling arbitrary pagination in SimpleDB is the same for previous and next. In the general case, the user may click on an arbitrary page number, like 5, without ever having visited page 4 or 6.

You handle this in SimpleDB by using the fact that NextToken only requires the WHERE clause to be the same to work properly. So rather than querying through every page in sequence pulling down all the intervening items, you can usually do it in two steps.

  1. Issue your query with a limit value of where the desired page should start, and SELECT count(*) instead of the actual attributes you want.
  2. Use the NextToken from step one to fetch the actual page data using the desired attributes and the page size as the LIMIT

So in pseudo code:

int targetPage, pageSize;
...
int jumpLimit = pageSize * (targetPage - 1);
String query = "SELECT %1 FROM posts WHERE userid = '000022656' LIMIT %2";
String output = "title, summary, votecount";
Result temp = sdb.select(query, "count(*)", jumpLimit);
Result data = sdb.select(query, output, pageSize, temp.getToken());

Where %1 and %2 are String substitutions and "sdb.select()" is a fictitious method that includes the String substitution code along with the SimpleDB call.

Whether or not you can accomplish this in two calls to SimpleDB (as shown in the code) will depend on the complexity of your WHERE clause and the size of your data set. The above code is simplified in that the temp result may have returned a partial count if the query took more than 5 seconds to run. You would really want to put that line in a loop until the proper count is reached. To make the code a little more realistic I'll put it within methods and get rid of the String substitutions:

private Result fetchPage(String query, int targetPage)
{
    int pageSize = extractLimitValue(query);
    int skipLimit = pageSize * (targetPage - 1);
    String token = skipAhead(query, skipLimit);
    return sdb.select(query, token);
}

private String skipAhead(String query, int skipLimit)
{
    String tempQuery = replaceClause(query, "SELECT", "count(*)");
    int accumulatedCount = 0;
    String token = "";
    do {
        int tempLimit = skipLimit - accumulatedCount;
        tempQuery = replaceClause(tempQuery , "LIMIT", tempLimit + "");
        Result tempResult = sdb.select(query, token);
        token = tempResult.getToken();
        accumulatedCount += tempResult.getCount();
    } while (accumulatedCount < skipLimit);
    return token;
}

private int extractLimitValue(String query) {...}
private String replaceClause(String query, String clause, String value){...}

This is the general idea without error handling, and works for any arbitrary page, excluding page 1.

查看更多
别忘想泡老子
3楼-- · 2019-01-18 13:15

I recall that in one of the brown bag webinars, it was mentioned in passing that the tokens could be resubmitted and you'd get the corresponding result set back.

I haven't tried it, and it is just an idea, but how about building a list of the tokens as you are paging forward? To go back, then, just traverse the list backwards and resubmit the token (and select statement).

查看更多
孤傲高冷的网名
4楼-- · 2019-01-18 13:16

i'm stuck at getting the token - is that the same thing as RequestId?

The PHP SimpleDB library that i'm using doesn't seem to return it. http://sourceforge.net/projects/php-sdb/

Found this documentation http://docs.amazonwebservices.com/AmazonSimpleDB/2009-04-15/DeveloperGuide/index.html?SDB_API_Select.html

which seems to indicate that there is a nextToken element, but in the sample response, it shows RequestId...

Figured it out - our PHP lib was indeed abstracting the nexttoken away from where we had access to it. Dug into the library and found it.

查看更多
淡お忘
5楼-- · 2019-01-18 13:21

I have created a Java version of the sampling proposed above with the official SimpleDB API, maybe this is useful for anybody.

private static Set<String> getSdbAttributes(AmazonSimpleDBClient client,
            String domainName, int sampleSize) {
        if (!client.listDomains().getDomainNames().contains(domainName)) {
        throw new IllegalArgumentException("SimpleDB domain '" + domainName
                + "' not accessible from given client instance");
    }

    int domainCount = client.domainMetadata(
            new DomainMetadataRequest(domainName)).getItemCount();
    if (domainCount < sampleSize) {
        throw new IllegalArgumentException("SimpleDB domain '" + domainName
                + "' does not have enough entries for accurate sampling.");
    }

    int avgSkipCount = domainCount / sampleSize;
    int processedCount = 0;
    String nextToken = null;
    Set<String> attributeNames = new HashSet<String>();
    Random r = new Random();
    do {
        int nextSkipCount = r.nextInt(avgSkipCount * 2) + 1;

        SelectResult countResponse = client.select(new SelectRequest(
                "select count(*) from `" + domainName + "` limit "
                        + nextSkipCount).withNextToken(nextToken));

        nextToken = countResponse.getNextToken();

        processedCount += Integer.parseInt(countResponse.getItems().get(0)
                .getAttributes().get(0).getValue());

        SelectResult getResponse = client.select(new SelectRequest(
                "select * from `" + domainName + "` limit 1")
                .withNextToken(nextToken));

        nextToken = getResponse.getNextToken();

        processedCount++;

        if (getResponse.getItems().size() > 0) {
            for (Attribute a : getResponse.getItems().get(0)
                    .getAttributes()) {
                attributeNames.add(a.getName());
            }
        }
    } while (domainCount > processedCount);
    return attributeNames;
}
查看更多
登录 后发表回答