I am using Amazon DynamoDB and accessing it via the Python boto query interface. I have a very simple requirement:
I want to get 1000 entries, but I don't know the primary keys beforehand. How can I do this? I know how to use query_2, but that requires knowing the primary keys beforehand.
Afterwards I may want to get another, different 1000 entries, and so on. You can consider it sampling without replacement.
Any help is much appreciated.
Use Table.scan(max_page_size=1000). A scan reads through the table without requiring you to know any keys beforehand.
Get all the primary keys with a paginated scan:
import boto3

def get_all_primary_keys():
    ddb_client = boto3.client('dynamodb')
    primary_keys = []
    # Request only the key attribute so each page carries as little data as possible.
    scan_kwargs = {
        'TableName': 'your_TABLE',
        'Select': 'SPECIFIC_ATTRIBUTES',
        'AttributesToGet': ['your_primary_key'],
    }
    while True:
        r = ddb_client.scan(**scan_kwargs)
        for i in r['Items']:
            primary_keys.append(i['your_primary_key']['S'])
        # A single scan call stops after 1 MB of data. LastEvaluatedKey is
        # present only when there are more pages to read.
        if 'LastEvaluatedKey' not in r:
            break
        scan_kwargs['ExclusiveStartKey'] = r['LastEvaluatedKey']
    return primary_keys
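If the goal is only to walk the table in chunks of 1000 without ever seeing the same item twice, the same pagination idea can be used to yield batches directly instead of collecting keys first. The following is a minimal sketch, not a definitive implementation: `iter_batches` and its parameters are hypothetical names, and `client` is assumed to be anything with a boto3-style `scan` method (for example `boto3.client('dynamodb')`). Note that a scan returns items in the table's internal order, so this gives non-overlapping batches rather than a statistically random sample.

```python
def iter_batches(client, table_name, batch_size=1000):
    """Yield lists of up to batch_size items, reading the table once.

    `client` is assumed to expose a boto3-style scan(**kwargs) method.
    """
    # Limit caps how many items one scan call evaluates; a call may still
    # return fewer items because of the 1 MB page size cap.
    kwargs = {'TableName': table_name, 'Limit': batch_size}
    batch = []
    while True:
        page = client.scan(**kwargs)
        for item in page['Items']:
            batch.append(item)
            if len(batch) == batch_size:
                yield batch
                batch = []
        # No LastEvaluatedKey means the scan reached the end of the table.
        if 'LastEvaluatedKey' not in page:
            break
        # Resume the next call right after the last item already read.
        kwargs['ExclusiveStartKey'] = page['LastEvaluatedKey']
    # Emit whatever is left over as a final, shorter batch.
    if batch:
        yield batch
```

Assuming boto3 is installed and credentials are configured, each call to `next()` on `iter_batches(boto3.client('dynamodb'), 'your_TABLE')` would return the next 1000 items, and no item is returned twice within one pass.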