I have many tables in my database that hold information on items (Photos, Articles, Videos) that can be Liked, Disliked, Shared, Favourited etc by users.
Each time a user takes an action on a item, it is recorded in a simple table like such:
ItemID | UserID | Liked | Shared | Favourited
1 1 NULL 1 NULL
2 25 1 1 1
3 18 0 NULL NULL
When I'm writing a query to return a list of items from a table (e.g. Photos) I also want to return the total number of Likes, Shares etc each item has. At the moment I'm calculating this on the fly using a nested SELECT statement. If my item tables grow in size by hundreds of thousands, and I constantly need stats on Likes, Shares etc, then would it be safe to continue calculating stats on the fly or should they be stored as totals somewhere in the database?
If it were me, I'd calculate it every once in a while and store the result in another table. It's not a big deal if the count isn't up-to-the-second accurate for something like "how many people like kittens".
On the other hand, if you expect to get a million "Views", and you must
COUNT(*)
to retrieve that count, then Performance may overrule the 'evils'.For high volume situations, I recommend either of the following:
Plan A: Count on the fly.
UPDATE ViewCounts SET ct = ct + 1 WHERE page_id = ?
Note that this is deliberately in a separate table from the rest of the meta info for the 'page'. This is to cut back on interference between the two. (Optionally, you also keep details of each 'view' in another table.)Plan B: Have a table with info about each 'view', but every hour (or day?) perform
COUNT(*)
for the hour/day and put the result into a "Summary Table". ThenSUM(subtotal)
from that table to get the overall views. Such a table can also provide charting info for 'trends' in the views.Note: Both of these Plans assume that the data will not change after the fact. Dealing with deletions from the raw 'views' table gets complex.
Storing totals is evil, because:
I would suggest NOT to store totals, as this seems a transactional Database and you will be inserting rows quite frequently, So if you store totals, every time you insert a row , you will need to update your totals.
So in reality each insert in the table will be followed by an update statement just to keep totals updated. sounds a very poor design for a transactional database.
Storing totals is a good option for a data warehouse where data hardly changes , if ever.
My suggestion would be creating Views, that will calculate totals for you on the fly. Add appropriate indexes to make these queries efficient. When your data grows too big that even indexes on table arent doing enough , consider indexed views.