Store totals or calculate on the fly?

2019-07-16 01:40发布

I have many tables in my database that hold information on items (Photos, Articles, Videos) that can be Liked, Disliked, Shared, Favourited etc by users.

Each time a user takes an action on a item, it is recorded in a simple table like such:

ItemID | UserID  | Liked | Shared | Favourited
1        1         NULL    1        NULL
2        25        1       1        1
3        18        0       NULL     NULL

When I'm writing a query to return a list of items from a table (e.g. Photos) I also want to return the total number of Likes, Shares etc each item has. At the moment I'm calculating this on the fly using a nested SELECT statement. If my item tables grow in size by hundreds of thousands, and I constantly need stats on Likes, Shares etc, then would it be safe to continue calculating stats on the fly or should they be stored as totals somewhere in the database?

4条回答
贼婆χ
2楼-- · 2019-07-16 02:01

If it were me, I'd calculate it every once in a while and store the result in another table. It's not a big deal if the count isn't up-to-the-second accurate for something like "how many people like kittens".

查看更多
爷的心禁止访问
3楼-- · 2019-07-16 02:07

On the other hand, if you expect to get a million "Views", and you must COUNT(*) to retrieve that count, then Performance may overrule the 'evils'.

For high volume situations, I recommend either of the following:

Plan A: Count on the fly. UPDATE ViewCounts SET ct = ct + 1 WHERE page_id = ? Note that this is deliberately in a separate table from the rest of the meta info for the 'page'. This is to cut back on interference between the two. (Optionally, you also keep details of each 'view' in another table.)

Plan B: Have a table with info about each 'view', but every hour (or day?) perform COUNT(*) for the hour/day and put the result into a "Summary Table". Then SUM(subtotal) from that table to get the overall views. Such a table can also provide charting info for 'trends' in the views.

Note: Both of these Plans assume that the data will not change after the fact. Dealing with deletions from the raw 'views' table gets complex.

查看更多
我欲成王,谁敢阻挡
4楼-- · 2019-07-16 02:11

Storing totals is evil, because:

  • it is a form of caching, which itself is evil (potentially inconsistent, or at best non-trivial to do properly). See http://shouldiblamecaching.com for more info
  • it can be wrong! Unless you cover all data modification possibilities (say with triggers), there may be errors - that is the stored total may be incorrect. Wrong data is 1000 times worse than slow data
  • more complex: you've added another aspect to your system. Complexity is to be avoided
  • unnecessary or unfruitful - unless you have documented evidence that doing it via a query is unworkable, and you've d haunted all query optimization, you don't need to do it. Even if you think you do need it, you may not. For example, you may split your operation into using a dara warehouse for stable data (say more than 1 week old) which has stored totals for each day, and using the live system for everything more recent, which will cap the query tube to only scanning the most recent week's data (which should perform OK). If performance is still a problem, narrow the "live" window to a day. I did exactly this and it worked beautifully.
查看更多
别忘想泡老子
5楼-- · 2019-07-16 02:13

I would suggest NOT to store totals, as this seems a transactional Database and you will be inserting rows quite frequently, So if you store totals, every time you insert a row , you will need to update your totals.

So in reality each insert in the table will be followed by an update statement just to keep totals updated. sounds a very poor design for a transactional database.

Storing totals is a good option for a data warehouse where data hardly changes , if ever.

My suggestion would be creating Views, that will calculate totals for you on the fly. Add appropriate indexes to make these queries efficient. When your data grows too big that even indexes on table arent doing enough , consider indexed views.

查看更多
登录 后发表回答