I am trying to figure out a way to efficiently store and retrieve the popularity of articles by category over time in Redis. For now I am thinking of a solution like this.
Create a bunch of hashes to track the popularity of an article across all categories, where the key is 'all', the year, the month, or the beginning of the week, the field is the article id, and the value is the counter. To update the popularity of an article I'll use HINCRBY to increment the counter for that article.
Hashes for overall popularity:
all: article_id <counter> // all time popular
2012: article_id <counter> // popular by year
2012-01: article_id <counter> // popular by month
2012-01-04: article_id <counter> // popular by week, where the date is the beginning of the week
And create a set of hashes for every category. For example, below are the hashes for 'category_1':
<category_1>:all: article_id <counter> // all time popular
<category_1>:2012: article_id <counter> // popular by year
<category_1>:2012-01: article_id <counter> // popular by month
<category_1>:2012-01-04: article_id <counter> // popular by week, where the date is the beginning of the week
And another set for 'category_2':
<category_2>:all: article_id <counter> // all time popular
<category_2>:2012: article_id <counter> // popular by year
<category_2>:2012-01: article_id <counter> // popular by month
<category_2>:2012-01-04: article_id <counter> // popular by week, where the date is the beginning of the week
So every time the popularity of an article goes up, I'll increment two sets of hashes: one for overall popularity and the other for the category the article belongs to. I have yet to figure out how to retrieve the most popular articles (all-time, yearly, etc.), and I'm not even sure it is possible with the hash data type.
Are hashes the correct data structure for this? Any thoughts on how to model a solution for this would be helpful.
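To make the key scheme above concrete, here is a minimal Python sketch that lists the keys touched by one popularity update. The helper name `popularity_keys` is hypothetical, and treating Monday as the beginning of the week is an assumption; the question does not say which weekday starts a week.

```python
from datetime import date, timedelta


def popularity_keys(category, day):
    """Return the overall and per-category keys to increment for `day`."""
    # Assumption: weeks start on Monday (date.weekday() == 0).
    week_start = day - timedelta(days=day.weekday())
    periods = [
        "all",                      # all time
        day.strftime("%Y"),         # year
        day.strftime("%Y-%m"),      # month
        week_start.isoformat(),     # beginning of the week
    ]
    # Overall keys first, then the same periods prefixed with the category.
    return periods + [f"{category}:{period}" for period in periods]
```

Each returned key would then receive one increment for the article id, so a single update touches eight keys.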
I think you should look into using sorted sets instead of hashes. Basically, use the article_id as the member and the popularity as the score. Keep a sorted set for each combination of time resolution and category, just like what you've described with hashes. This will allow you to fetch articles (set members) by popularity (score): ZREVRANGE returns the top N members directly, and ZRANGEBYSCORE selects members within a score range. To update the popularity, do a ZINCRBY.
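As a rough illustration of the sorted-set approach, the sketch below assumes a client object that exposes `zincrby(name, amount, value)` and `zrevrange(name, start, end, withscores=...)` with the signatures used by redis-py 3.x and later. The function names `record_view` and `top_articles` are made up for this example.

```python
def record_view(client, keys, article_id, amount=1):
    """Bump the article's score in every sorted set named in `keys`."""
    for key in keys:
        # Assumed redis-py 3.x+ argument order: zincrby(name, amount, value).
        client.zincrby(key, amount, article_id)


def top_articles(client, key, count=10):
    """Return up to `count` (article_id, score) pairs, highest score first."""
    # ZREVRANGE orders members from highest to lowest score;
    # start and end are inclusive indexes.
    return client.zrevrange(key, 0, count - 1, withscores=True)
```

With redis-py this could be called as `record_view(redis.Redis(decode_responses=True), ["all", "category_1:all"], "article_42")`, followed by `top_articles(client, "category_1:all", 10)` to read back the ten most popular articles in that category. Batching the increments in a pipeline would reduce round trips, though that is left out here to keep the sketch short.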