MySQL - Calculating fields on the fly vs storing c

2019-03-15 01:38发布

问题:

I apologise if this has been asked before, but I can't seem to find an answer to a question that I have about calculating on the fly vs storing fields in a database.

I read a few articles that suggested it was preferable to calculate when you can, but I would just like to know if that still applies to the following 2 examples.

Example 1. Say you are storing data relating to a car. You store the fuel tank size in litres, and how many litres it uses per 100km. You also want to know how many KMs it can travel, which can be calculated from the tank size and economy. I see 2 ways of doing this:

  1. When a car is added or updated, calculate the amount of KMs and store this as a static field in the database.
  2. Every time a car is accessed, calculate the amount of KMs on the fly.

Because the cars economy/tank size doesn't change (although it could be edited), the KMs is a pretty static value. I don't see why we would calculate it every single time the car is accessed. Wouldn't this waste cpu time as opposed to simply storing it in a separate field in the database and calculating only when a car is added or updated?

My next example, which is almost an entirely different question (but on the same topic), relates to counting children.

Let's say we have a app which has categories and items. We have a view where we display all the categories, and a count of all the items inside each category. Again, I'm wondering what's better. To perform a MySQL query to count all the items in each category every single time the page is accessed? Or store the count in a field in the categories table and update when an item is added / deleted?

I know it is redundant to store anything that can be calculated, but I worry that calculating fields or counting records might be slow as opposed to storing the data in a field. If it's not then please let me know, I just want to learn about when to use either method. On a small scale I guess it wouldn't matter either way, but apps like Facebook, would they really count the amount of friends you have every time someone views your profile or would they just store it as a field?

I'd appreciate any responses to both of these scenarios, and any resource that might explain the benefits of calculating vs storing.

Thanks in advance,

Christian

回答1:

One thing to notice is the way you use your data. If several applications, or several layers of your application (maybe old code and new code in the same app) is accessing your data you'll reduce the risk of errors in computing by pre-calculating in the database. Then your calculated data will always be the same, no matter which application is requesting it.

For your first example, there is no reason that someone someday will have to change the way your KMs will need to be computed. I would store it in database (via triggers or via PHP on the insert/update -- because MySQl triggers are... well they are... not as good as some other DB triggers).

Now if we taking your second example it's really not sure someone will not want some day to add some filters on that categories computing. For example, take only children which are between 2 and 5. Then all your pre-computed results serves nothing. If you need some optimizations and caches of theses things it's maybe more an application-layer cache you would need, something like memcache, or pre-computed results stored in a cache table. But this cache is an application cache, which is related in a certain way on your application parameters (requests with different filters would use a different record in the cache).

Note that with MySQl you've got as well a nice query cache which will prevent the same query to be computed too much.



回答2:

Introducing redundancy into the database is a valid means of optimization. As with all optimizations, don't do it unless you have confirmed that this is where the bottleneck actually is.



回答3:

Others have touched on the technical aspects, so let me give you another viewpoint to consider:

For every anomaly you introduce, you are making the development process slower.

Denormalized data, aggregates, prejoined data etcetera are all examples of stuff that greatly complicates development, because you have to:

  • Keep rewriting the aggregation logic whenever you change the detailed tables
  • Test more (and often seemingly unrelated parts of your application)
  • Write more documentation
  • Complicates upgrades and patches

In many cases, it's worth it and in some cases absolutely necessary, but it would be very stupid to sacrifice development speed if you don't have to.



回答4:

In both examples, the values you're talking about are static, and calculating static values is just a nonsense. Furthermore, if we assume that the tables are more queried than updated, calculating data is also a loss of performance.



标签: php mysql field