I'm busy upgrading one of my sites to Laravel 5.4 from 5.3. I noticed when going through the current Github repository that the default character set and collation has been changed from utf8
to utf8mb4
in order to provide support for emojis.
My current database (MariaDB 10.0.29) is currently set to use utf8
but I would however like to upgrade it to use utf8mb4
. Unfortunately, I haven't been able to find any documentation regarding this process.
Perhaps I'm overthinking it, but I would have thought that changing the character set and collation of a database would require some work, at very least running some ALTER TABLE
commands.
- Is it necessary to go change the character set and collation on the database and tables? Or would it suffice to simply change the settings in my
config/database.php
file? - If so, would anyone be able to provide an example migration (Or some MySQL code) on how to achieve this, bearing in mind that preserving existing data is essential.
Thanks
The database charset and collation is the default for newly created tables. The table settings are the defaults for columns.
Do this for each table:
Alrighty then, I have written a migration to achieve this for my own system.
It allows you to optionally specify a connection name to reference a connection other than the default.
It gets the list of tables from the connection's database using a
SHOW TABLES
query.It then loops through each table and updates all string/character type columns to the new character set and collation.
I have made it so that a callback must be provided to determine whether or not a column should have it's length changed to the provided new length. In my implementation,
VARCHAR
andCHAR
columns with lengths greater than 191 are updated to have length 191 during the up migration andVARCHAR
andCHAR
columns with length exactly 191 are updated to have length 255 on the reverse/down migration.Once all the string/character columns have been updated, a couple of queries will be ran to change the charset and collation of the table, converting any remaining collations to the new one and then to change the default charset and collation of the table.
Finally, the database's default charset and collation will be changed.
Notes
Originally, I tried to simply convert the tables to the new encoding but ran into issues with column lengths. 191 characters is the maximum character length in
utf8mb4
when using InnoDB in my version of MySQL/MariaDB and changing the table collation resulted in an error.I did at first only want to just update the lengths to the new length but I also wanted to provide a rollback feature, so this was not an option because in the reverse method I would have been setting the lengths of columns that were
utf8mb4
to 255, which would have been too long, so I opted to change the collation too.I then tried to just change the length, charset and collation of
varchar
andchar
columns that were too long, but in my system, this resulted in errors when I had multi-column indexes that included such columns. Apparently, multi-column indexes must use the same collation.An important note on this is that the reverse/down migration is not going to be 100% perfect for everyone. I don't think it would be possible to do so without storing extra information about the original columns when migrating. So my current implementation for the reverse/down migration is to assume that columns with length 191 were originally 255.
A similarly important note on this is that this will blindly change the collations of all string/character columns to the new collation, regardless of the original collation, so if there are columns with different collations, they will all be converted to the new one and the reverse will do the same, the originals won't be preserved.
Please, please, please back up your database before running this. Use at your own risk!