Searching for emojis in MySQL

Posted 2019-03-04 17:21

I have a string that looks like this:

[image: six emojis in a row]

Now, when my app shoves this string into its utf8 mysql database column, it looks like this in the MySQL CLI:

[image: string representation]

If I run select convert(mystring using utf8mb4) from mytable; it still looks like this.

And if I turn it to hex using select hex(mystring) from mytable;, it looks like this:

C3A2CB9CE282ACC3AFC2B8C28FC3B0C5B8C592CB86C3B0C5B8C592C5A0C3B0C5B8C592C281C3B0C5B8E280A1C2ACC3B0C5B8E280A1C2A7

Now, let's say I want to find strings with that emoji wave in it. Well, the hex for the wave emoji is F09F8C8A. But F09F8C8A isn't in the hex above so something like select * from mytable where hex(mystring) like '%F09F8C8A%'; doesn't work.

Any suggestions?

1 answer
forever°为你锁心
#2 · 2019-03-04 18:17

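I call that "double encoding". Your client sent utf8 bytes, but the connection told MySQL they were latin1. MySQL then treated each byte as a separate latin1 character and converted it to utf8 for the column, so a 3-byte utf8 character became 6 or more bytes in the database, and a 4-byte emoji became 8 or more. That is why F09F8C8A never appears in your hex output.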
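To make the effect concrete, here is a minimal Python sketch that reproduces the double encoding for the wave emoji and checks the result against the hex from the question. The helper names (mysql_latin1_decode, double_encoded_hex) are made up for illustration, and the sketch assumes MySQL's latin1 behaves like cp1252 with the five undefined bytes passed through unchanged.

```python
# Hex of the stored column value, copied from the question.
STORED_HEX = (
    "C3A2CB9CE282ACC3AFC2B8C28FC3B0C5B8C592CB86C3B0C5B8C592C5A0"
    "C3B0C5B8C592C281C3B0C5B8E280A1C2ACC3B0C5B8E280A1C2A7"
)


def mysql_latin1_decode(data: bytes) -> str:
    """Hypothetical helper: read bytes the way MySQL's latin1 is assumed to
    (cp1252, with the bytes cp1252 leaves undefined mapped to U+00xx)."""
    chars = []
    for b in data:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(b))  # e.g. 0x81 -> U+0081
    return "".join(chars)


def double_encoded_hex(text: str) -> str:
    """Hex of what ends up stored when utf8 bytes are misread as latin1."""
    raw = text.encode("utf-8")            # bytes the client really sent
    misread = mysql_latin1_decode(raw)    # each byte taken as one character
    return misread.encode("utf-8").hex().upper()  # re-encoded for the column


WAVE = "\U0001F30A"  # wave emoji, F09F8C8A in utf8
pattern = double_encoded_hex(WAVE)
print(pattern)                # C3B0C5B8C592C5A0
print(pattern in STORED_HEX)  # True
```

So as a stopgap, before the data is repaired, a query like select * from mytable where hex(mystring) like '%C3B0C5B8C592C5A0%'; should match the rows containing the wave emoji.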

You need to fix both the client and the data in the table(s). This page discusses both the issues and the fixes: http://mysql.rjweb.org/doc.php/charcoll (Sorry, there is no brief summary of how to fix your problems.)
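As a rough illustration of what repairing the data means, the following Python sketch reverses the double encoding for the exact hex value from the question. It is not taken from the linked page; the helper names are made up, and it assumes MySQL's latin1 matches cp1252 except for the undefined bytes, which are passed through as U+0081, U+008F and so on.

```python
# Hex of the stored column value, copied from the question.
STORED_HEX = (
    "C3A2CB9CE282ACC3AFC2B8C28FC3B0C5B8C592CB86C3B0C5B8C592C5A0"
    "C3B0C5B8C592C281C3B0C5B8E280A1C2ACC3B0C5B8E280A1C2A7"
)


def mysql_latin1_encode(text: str) -> bytes:
    """Hypothetical helper: turn characters back into single bytes the way
    MySQL's latin1 is assumed to (cp1252, falling back to U+00xx -> 0xxx)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            out += ch.encode("latin-1")  # e.g. U+008F -> 0x8F
    return bytes(out)


def undo_double_encoding(stored: bytes) -> str:
    """Reverse the utf8 -> (misread as latin1) -> utf8 round trip."""
    mojibake = stored.decode("utf-8")           # the garbage the CLI shows
    original = mysql_latin1_encode(mojibake)    # the bytes the client sent
    return original.decode("utf-8")             # the intended text


fixed = undo_double_encoding(bytes.fromhex(STORED_HEX))
print(fixed.encode("utf-8").hex().upper())
# E29880EFB88FF09F8C88F09F8C8AF09F8C81F09F87ACF09F87A7
print("\U0001F30A" in fixed)  # True: the wave emoji is searchable again
```

Inside MySQL, the commonly cited equivalent is an update along the lines of UPDATE mytable SET mystring = CONVERT(BINARY(CONVERT(mystring USING latin1)) USING utf8mb4);. The column has to be utf8mb4 first, since plain utf8 cannot hold 4-byte emoji, and it is worth trying on a copy of the table before touching real data. The client connection also has to be switched to utf8mb4, otherwise new rows will be double-encoded again.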
