Is it good practice to use serialize in PHP in ord

Posted 2019-01-25 07:38

Question:

I came across an interesting comment on php.net about serializing data in order to save it in the DB.

It says the following:

Please! please! please! DO NOT serialize data and place it into your database. Serialize can be used that way, but that's missing the point of a relational database and the datatypes inherent in your database engine. Doing this makes data in your database non-portable, difficult to read, and can complicate queries. If you want your application to be portable to other languages, like let's say you find that you want to use Java for some portion of your app that it makes sense to use Java in, serialization will become a pain in the buttocks. You should always be able to query and modify data in the database without using a third party intermediary tool to manipulate data to be inserted.

I've encountered this too many times in my career, it makes for difficult to maintain code, code with portability issues, and data that is it more difficult to migrate to other RDMS systems, new schema, etc. It also has the added disadvantage of making it messy to search your database based on one of the fields that you've serialized.

That's not to say serialize() is useless. It's not... A good place to use it may be a cache file that contains the result of a data intensive operation, for instance. There are tons of others... Just don't abuse serialize because the next guy who comes along will have a maintenance or migration nightmare.

I would like to know whether this is the standard view on serializing data for DB storage. That is, is it good practice to use it sometimes, or should it be avoided?

For example, I was instructed to use serialize myself recently.

In this case the data we had to save into a MySQL table was the following:

  • Car brand.
  • Car model.
  • Car version.
  • Car info.

Car info was an array representing all the properties of a version, so it held a large, variable number of properties (fewer than 100). This array was the one to be serialized.

The main reason I was given in order to use serialize was the following:

Being a large number of fields, it is better to serialize the data in order to improve performance instead of creating a field for each property or multiple tables.

Personally, I agree more with the comment on php.net than with this last assertion, but I would like to hear opinions more qualified than mine on this.

Answer 1:

Being a large number of fields, it is better to serialize the data in order to improve performance instead of creating a field for each property or multiple tables.

I would consider this highly dependent on the use case. What if there is a Customer class that wants information about all cars that run on Diesel, or about any other specific property of the car (fuel type seems the easiest example)? You would need to fetch all the cars from the database, unserialize each one, check the property, and keep a list of the cars relevant to the customer.
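To make that cost concrete, here is a minimal sketch in PHP. The `$rows` array stands in for a result set from a hypothetical `cars` table with a serialized `info` column; the table, the column names, and the `fuel` key are assumptions for illustration, not the asker's actual schema.

```php
<?php
// Hypothetical rows, as they might come back from
// `SELECT brand, model, info FROM cars`.
// The `info` column holds the output of serialize() on the property array.
$rows = [
    ['brand' => 'Audi',   'model' => 'A4',      'info' => serialize(['fuel' => 'Diesel', 'doors' => 4])],
    ['brand' => 'Toyota', 'model' => 'Yaris',   'info' => serialize(['fuel' => 'Petrol', 'doors' => 5])],
    ['brand' => 'Skoda',  'model' => 'Octavia', 'info' => serialize(['fuel' => 'Diesel', 'doors' => 5])],
];

/**
 * Return "brand model" for every car with the given fuel type.
 * Every row has to be fetched and unserialized before it can be tested.
 */
function carsByFuel(array $rows, string $fuel): array
{
    $matches = [];
    foreach ($rows as $row) {
        // allowed_classes => false: never instantiate objects from stored data.
        $info = unserialize($row['info'], ['allowed_classes' => false]);
        if (is_array($info) && ($info['fuel'] ?? null) === $fuel) {
            $matches[] = $row['brand'] . ' ' . $row['model'];
        }
    }
    return $matches;
}

$dieselCars = carsByFuel($rows, 'Diesel');
print_r($dieselCars);
```

With a dedicated `fuel` column (or a separate properties table), the same question could be a single `WHERE fuel = 'Diesel'` condition that the database can answer itself, possibly with an index, instead of a loop in application code.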

Example: We had to move some person-related data from an old customer CMS to a new one. Instead of having each attribute nicely mapped in the database, all the information was a single string in the old database. So instead of relying on a proper database structure, we had to do lots of regex-fu to turn the data back into a proper structure. Of course, this was an expensive task, in both money and workload. In this case the problem was not that big, since the amount of data was manageable. But imagine the same scenario with millions of rows and more than just a single string...

The comment you posted is only talking about data structures, IMO. And I agree: storing these serialized is neither good nor efficient. It becomes much easier to make a typo somewhere, or to add a new property that other parts of the code are not aware of. This WILL lead to problems sooner or later.
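A small sketch of how that can happen, assuming a hypothetical `fuel` property and a hypothetical helper `fuelOf()`: nothing in a serialized array enforces key names, so a misspelled key is stored without complaint and only surfaces when something reads the data back.

```php
<?php
// Two writers store the "same" property; the second one misspells the key.
// The key names are hypothetical.
$good = serialize(['fuel' => 'Diesel']);
$bad  = serialize(['feul' => 'Diesel']); // typo, but it serializes just fine

echo $bad, PHP_EOL; // a:1:{s:4:"feul";s:6:"Diesel";}

/** Read the fuel type back out of a serialized property array, or null if absent. */
function fuelOf(string $blob): ?string
{
    // allowed_classes => false: never instantiate objects from stored data.
    $info = unserialize($blob, ['allowed_classes' => false]);
    return is_array($info) ? ($info['fuel'] ?? null) : null;
}

var_dump(fuelOf($good)); // string(6) "Diesel"
var_dump(fuelOf($bad));  // NULL
```

With a real `fuel` column, an insert that names a misspelled column would normally be rejected by the database at write time, rather than silently producing a row that no reader can find.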

On the other hand, storing some configs that then become easier to port might be an OK case for serializing data. You could argue that external settings files are better suited to such a case, but this is highly dependent on the case, philosophy, customer, and so on.

TL;DR: In most cases, using a proper schema will sooner or later benefit the whole development effort, in both speed and complexity (I prefer reading many table descriptions to one huge, cryptic string). There may be some use cases where serializing data is acceptable, so giving a definitive answer on whether this is good or bad practice is not easy; it depends heavily on the situation.