python dynamodb get 1000 entries

Posted 2019-08-05 15:37

Question:

I am using Amazon DynamoDB and accessing it through the Python boto query interface. I have a very simple requirement:

  1. I want to get 1000 entries, but I don't know the primary keys beforehand. Any 1000 entries will do. How can I do this? I know how to use query_2, but that requires knowing the primary keys beforehand.

  2. Afterwards I may want to get another, different 1000 entries, and so on. You can think of it as sampling without replacement. How can I do this?

Any help is much appreciated.

Answer 1:

Use Table.scan(max_page_size=1000)
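For readers on boto3 rather than the older boto Table API, the same idea can be sketched roughly as below. This is only an illustration, not the answerer's code: iter_batches is a hypothetical helper name, and it takes the scan callable as an argument (for example boto3.client('dynamodb').scan) so the paging logic does not depend on a live table. It relies on the scan parameters Limit and ExclusiveStartKey and on the LastEvaluatedKey field of the response. A single scan call can return fewer items than Limit, so items are accumulated until a full batch is ready.

```python
def iter_batches(scan, table_name, batch_size=1000):
    """Yield lists of up to batch_size items; every item appears exactly once.

    scan is assumed to behave like boto3's DynamoDB client.scan.
    """
    batch = []
    kwargs = {'TableName': table_name, 'Limit': batch_size}
    while True:
        response = scan(**kwargs)
        for item in response.get('Items', []):
            batch.append(item)
            if len(batch) == batch_size:
                yield batch
                batch = []
        # LastEvaluatedKey is absent once the whole table has been read.
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            break
        # Resume the next call right after the last item already seen.
        kwargs['ExclusiveStartKey'] = last_key
    if batch:
        yield batch  # final, possibly shorter, batch
```

With boto3 installed, this would presumably be called as iter_batches(boto3.client('dynamodb').scan, 'your_TABLE'). Each yielded batch is a different set of entries, which covers the "another different 1000" part of the question, although the order is whatever the scan returns rather than random.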



Answer 2:

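Get all the primary keys first. A single scan call returns at most 1 MB of data, so keep scanning until the response no longer contains LastEvaluatedKey.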

import boto3

def get_all_primary_keys():
    ddb_client = boto3.client('dynamodb')
    primary_keys = []
    # Fetch only the key attribute; '#pk' avoids clashes with reserved words.
    scan_kwargs = {
        'TableName': 'your_TABLE',
        'ProjectionExpression': '#pk',
        'ExpressionAttributeNames': {'#pk': 'your_primary_key'},
    }
    while True:
        r = ddb_client.scan(**scan_kwargs)
        for i in r['Items']:
            primary_keys.append(i['your_primary_key']['S'])
        # Each scan call stops after 1 MB of data; LastEvaluatedKey is
        # present only when more items remain to be read.
        if 'LastEvaluatedKey' not in r:
            break
        scan_kwargs['ExclusiveStartKey'] = r['LastEvaluatedKey']
    return primary_keys
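
Once the list of primary keys is in memory, the "sampling without replacement" part of the question could be handled along the following lines. This is a minimal sketch and not part of the original answer: sample_without_replacement and its rng parameter are hypothetical names, and it assumes the full key list fits in memory.

```python
import random

def sample_without_replacement(keys, sample_size=1000, rng=None):
    """Yield disjoint random groups of up to sample_size keys."""
    rng = rng or random.Random()
    shuffled = list(keys)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)   # one shuffle, then slice into disjoint groups
    for start in range(0, len(shuffled), sample_size):
        yield shuffled[start:start + sample_size]
```

Each group of keys can then be fetched from the table. Note that batch_get_item accepts at most 100 keys per request, so a group of 1000 would need to be split further before fetching.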