Why isn't my TakeLimit honored by TableQuery?

2019-03-20 14:40发布

问题:

I'd like to fetch top n rows from my Azure Table with a simple TableQuery. But with the code below, all rows are fetched regardless of my limit with the Take.

What am I doing wrong?

int entryLimit = 5;

var table = GetFromHelperFunc();

TableQuery<MyEntity> query = new TableQuery<MyEntity>()
    .Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "MyPK"))
    .Take(entryLimit);

List<FeedEntry> entryList = new List<FeedEntry>();
TableQuerySegment<FeedEntry> currentSegment = null;

while (currentSegment == null || currentSegment.ContinuationToken != null)
{
    currentSegment = table.ExecuteQuerySegmented(query, this.EntryResolver, currentSegment != null ? currentSegment.ContinuationToken : null);
    entryList.AddRange(currentSegment.Results);
}


Trace.WriteLine(entryList.Count) // <-- Why does this exceed my limit?

回答1:

The Take method on the storage SDK doesn't work like it would in LINQ. Imagine you do something like this:

TableQuery<TableEntity> query = new TableQuery<TableEntity>()
                .Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "temp"))
                .Take(5);
var result = table.ExecuteQuery(query);

When you start iterating over result you'll initially get only 5 items. But underneath, if you keep iterating over the result, the SDK will keep querying the table (and proceed to the next 'page' of 5 items).

If I have 5000 items in my table, this code will output all 5000 items (and underneath the SDK will do 1000 requests and fetch 5 items per request):

TableQuery<TableEntity> query = new TableQuery<TableEntity>()
                .Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "temp"))
                .Take(5);
var result = table.ExecuteQuery(query);
foreach (var item in result)
{
    Trace.WriteLine(item.RowKey);
}

The following code will fetch exactly 5 items in 1 request and stop there:

TableQuery<TableEntity> query = new TableQuery<TableEntity>()
                .Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "temp"))
                .Take(5);
var result = table.ExecuteQuery(query);
int index = 0;
foreach (var item in result)
{
    Console.WriteLine(item.RowKey);
    index++;
    if (index == 5)
        break;
}

Actually, the Take() method sets the page size or the "take count" (TakeCount property on TableQuery). But it's still up to you to stop iterating on time if you only want 5 records.

In your example, you should modify the while loop to stop when reaching the TakeCount (which you set by calling Take):

while (entryList.Count < query.TakeCount && (currentSegment == null || currentSegment.ContinuationToken != null))
{
    currentSegment = table.ExecuteQuerySegmented(query, currentSegment != null ? currentSegment.ContinuationToken : null);
    entryList.AddRange(currentSegment.Results);
}


回答2:

AFAIK Storage Client Library 2.0 had a bug in Take implementation. It was fixed in ver 2.0.4.
Read last comments at http://blogs.msdn.com/b/windowsazurestorage/archive/2012/11/06/windows-azure-storage-client-library-2-0-tables-deep-dive.aspx