Search in a list data type in AWS DynamoDB

Posted 2019-01-28 09:44

Question:

We are using DynamoDB as the database for one of our sites, and we store each item in the JSON format shown below.

A video can belong to one or many genres, so we chose the list data type for the genre attribute and made genre a GSI (global secondary index).

I am facing several issues.

1) When I define genre as an index, AWS offers only three data types (string, binary, number) and does not allow a list type. It gives an unexpected data type error.

2) If I do not define it as an index, I am not allowed to fetch the data. DynamoDB asks for a hash key, which is not possible in my case because I am fetching a listing that should not depend on the hash key (primary key).

{
  "description": "********",
  "genre": [
    "Kids",
    "Documentary"
  ],
  "language": "******",
  "status": "0",
  "thumb_url": "******",
  "title": "******",
  "uploaded_by": "****** ",
  "url": "******",
  "video_id": 1330051052
}

Code to fetch data

$DynamoDbClient = AWS::get('DynamoDb');
$result = $DynamoDbClient->query(array(
    'TableName' => 'videos',
    'IndexName' => 'genre-index',
    'AttributesToGet' => array('video_id', 'language', 'description'),
    'KeyConditions' => array(
        // genre is the key of genre-index (a non-key attribute of the base table)
        'genre' => array(
            'ComparisonOperator' => 'EQ',
            'AttributeValueList' => array(
                array("S" => "Kids"),
            ),
        ),
    ),
));

In the code above I am looking for videos in the Kids genre, but it returns an empty result, and it gives an error if I don't declare genre as an index. The same video can belong to multiple genres.

So is there any way to search inside a list, or am I not using the API the right way? Help is always appreciated.

Answer 1:

One thing about NoSQL is that it won't fit everywhere, but I had a similar situation with a client, and here is my solution:

videoMaster (videoId (hash), desc, link, etc.)
tagDetail (tagId (hash), videoId (range))

Now you can query by passing a tagId (kids, study, etc.), and you will get all the videos with that tag.

Your data in tagDetail will look something like:

kids -> video1
kids -> video2
Education -> video1
Education -> video3
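
To make the tagDetail layout above concrete, here is a minimal in-memory sketch of the idea: each video is expanded into one row per genre, and a lookup by tagId returns the matching video ids. It does not call DynamoDB at all; buildTagDetailRows and queryByTag are hypothetical helpers written only for this illustration, and queryByTag merely stands in for a real Query on tagDetail with tagId as the hash key.

```php
<?php
// Hypothetical helper: expand one video item into tagDetail rows,
// one row per genre, keyed by tagId (hash) and videoId (range).
function buildTagDetailRows(array $video): array
{
    $rows = array();
    foreach ($video['genre'] as $genre) {
        $rows[] = array('tagId' => $genre, 'videoId' => $video['video_id']);
    }
    return $rows;
}

// Hypothetical stand-in for a Query on tagDetail where tagId EQ $tagId.
function queryByTag(array $tagDetail, string $tagId): array
{
    $videoIds = array();
    foreach ($tagDetail as $row) {
        if ($row['tagId'] === $tagId) {
            $videoIds[] = $row['videoId'];
        }
    }
    return $videoIds;
}

// Sample items mirroring the example data above (ids are illustrative).
$videos = array(
    array('video_id' => 1, 'genre' => array('kids', 'Education')),
    array('video_id' => 2, 'genre' => array('kids')),
    array('video_id' => 3, 'genre' => array('Education')),
);

$tagDetail = array();
foreach ($videos as $video) {
    $tagDetail = array_merge($tagDetail, buildTagDetailRows($video));
}

$kidsVideos = queryByTag($tagDetail, 'kids');           // video ids 1 and 2
$educationVideos = queryByTag($tagDetail, 'Education'); // video ids 1 and 3
```

The point of the design is that the genre list is never searched: the list is flattened at write time, so each genre becomes a scalar hash key value that a key-based query can use.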

Problem with the above solution: if one particular tag has billions of videos, performance will suffer, because every item for that tag shares the same hash key and the load is not distributed evenly.
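
One commonly described way to soften this kind of hot hash key, which is not part of the answer itself, is to split a busy tag across several hash key values by appending a shard suffix. The sketch below only shows the key arithmetic; the shard count of 4, the "#" separator, and both helper functions are assumptions made for illustration, and it assumes non-negative numeric video ids.

```php
<?php
// Assumption: 4 shards per tag; the original answer does not specify any sharding.
const SHARD_COUNT = 4;

// Hypothetical helper: derive the sharded hash key for a write, e.g. "kids#0".
function shardedTagId(string $tagId, int $videoId): string
{
    return $tagId . '#' . ($videoId % SHARD_COUNT);
}

// Hypothetical helper: list every hash key a reader must query for one tag.
function shardKeysForTag(string $tagId): array
{
    $keys = array();
    for ($shard = 0; $shard < SHARD_COUNT; $shard++) {
        $keys[] = $tagId . '#' . $shard;
    }
    return $keys;
}

$writeKey = shardedTagId('kids', 1330051052); // hash key used when storing this video
$readKeys = shardKeysForTag('kids');          // keys to query and merge when listing "kids"
```

The trade-off is that listing one tag now takes one query per shard plus a merge of the results, so this only pays off for tags that are large enough to cause trouble.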

Small tip: you can implement a caching mechanism for your table reads so that you don't have to query the database every time.
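
As a rough illustration of the caching tip, here is a minimal read-through cache kept in process memory. TagQueryCache, the 60-second TTL, and the $loader callback are all hypothetical; the loader stands in for the real table query, and a production setup would more likely use a shared cache than a PHP array.

```php
<?php
// Hypothetical read-through cache; $loader stands in for the real table query.
class TagQueryCache
{
    private $entries = array();
    private $ttlSeconds;

    public function __construct(int $ttlSeconds)
    {
        $this->ttlSeconds = $ttlSeconds;
    }

    public function get(string $tagId, callable $loader): array
    {
        $now = time();
        if (isset($this->entries[$tagId]) && $this->entries[$tagId]['expires'] > $now) {
            return $this->entries[$tagId]['value']; // cache hit: no database call
        }
        $value = $loader($tagId); // cache miss: fall through to the database
        $this->entries[$tagId] = array(
            'value' => $value,
            'expires' => $now + $this->ttlSeconds,
        );
        return $value;
    }
}

// Count how often the (fake) database is actually queried.
$dbCalls = 0;
$loader = function (string $tagId) use (&$dbCalls): array {
    $dbCalls++;
    return array('video1', 'video2'); // placeholder result for illustration
};

$cache = new TagQueryCache(60);
$first = $cache->get('kids', $loader);  // queries the loader
$second = $cache->get('kids', $loader); // served from the cache
```

With a TTL like this, newly tagged videos can take up to the TTL to appear in listings, so the value is a trade-off between freshness and read load.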