I want to insert a record into MySQL that has a non-ASCII Unicode character, but I'm on a terminal that doesn't let me easily type non-ASCII characters. How do I escape a Unicode literal in MySQL's SQL syntax?
问题:
回答1:
See: http://bugs.mysql.com/bug.php?id=10199 (Bug #10199: "Allow Unicode escape sequence for string literals.") This request has been "Open" since 2005.
From http://eng.kaching.com/2009/10/mysql-unicode-escape-sequences.html though you can see the following example, which does actually seem to work, but requires you to know the actual byte-by-byte UTF8 encoding:
"You can also use the variable-length UTF-8 representation (convenient when, for example, copying from a utf-8 URL-encoded value like %E2%80%98)."
mysql> select _utf8 x'E28098';
+---+
| ‘ |
+---+
回答2:
This stored function provides the functionality MySQL is (apparently) missing, with a way to turn a literal code point into a character without having to already know the UTF-8 encoding.
If VARCHAR(1)
seems strange, since utf8 characters in MySQL can be up to 3 bytes long, remember the size of VARCHAR
is characters, not bytes. The function returns a single UTF-8-encoded character from the input value.
For hexadecimal literals, prepend 0x
.
DELIMITER $$
DROP FUNCTION IF EXISTS `utf8_char` $$
CREATE FUNCTION `utf8_char`(v smallint unsigned) RETURNS VARCHAR(1) CHARSET utf8
NO SQL
DETERMINISTIC
BEGIN
-- http://stackoverflow.com/questions/3632410/mysql-unicode-literals/30675371#30675371
RETURN CHAR(CASE
WHEN v <= 0x7F THEN v
WHEN v <= 0x7FF THEN 0xC080 | ((v >> 6) << 8) | (v & 0x3F)
WHEN v <= 0xFFFF THEN 0xE08080 | (((v >> 12) & 0x0F ) << 16) | (((v >> 6) & 0x3F ) << 8) | (v & 0x3F)
ELSE NULL END);
END $$
DELIMITER ;
Example output:
mysql> select utf8_char(8592) AS 'leftwards_arrow';
+-----------------+
| leftwards_arrow |
+-----------------+
| ← |
+-----------------+
1 row in set (0.00 sec)
mysql> select utf8_char(0x2192) AS 'rightwards_arrow_hex';
+----------------------+
| rightwards_arrow_hex |
+----------------------+
| → |
+----------------------+
1 row in set (0.00 sec)