MySQL database with unique fields ignores ending spaces

Posted 2019-01-12 01:32

My project needs to store user input exactly as typed, including any spaces to the left or right of a word, for example 'apple'. If the user types ' apple' or 'apple ', with one space or several on either side of the word, I need to store it that way.

This field has the UNIQUE attribute. When I insert the word with spaces on the left, it works fine. But when I insert the word with spaces on the right, it trims off all the spaces from the right of the word.

So I am thinking of adding a special character to the right of the word, after the spaces, but I am hoping there is a better solution for this issue.

CREATE TABLE strings
( id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
string varchar(255) COLLATE utf8_bin NOT NULL,
created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id), UNIQUE KEY string (string) )
ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

Tags: mysql mysql5
4 answers
爷的心禁止访问
#2 · 2019-01-12 01:54

The problem is that MySQL ignores trailing whitespace when doing string comparison. See http://dev.mysql.com/doc/refman/5.7/en/char.html

All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces.

...

For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.

The section on the LIKE operator gives an example of this behavior (and shows that LIKE does respect trailing whitespace):

mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)

Unfortunately the UNIQUE index seems to use the standard string comparison to check whether such a value already exists, and thus ignores trailing whitespace. This is independent of whether you use VARCHAR or CHAR: in both cases the insert is rejected because the unique check fails. If there is a way to use LIKE semantics for the UNIQUE check, I do not know it.
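
To see the effect in isolation, here is a minimal sketch, assuming MySQL 5.7 and the same utf8_bin collation as in the question. The table name test_pad and column name val are made up for illustration.

```sql
-- Hypothetical scratch table; utf8_bin is a PAD SPACE collation in MySQL 5.7.
CREATE TABLE test_pad (
  val VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  UNIQUE KEY val (val)
) ENGINE=InnoDB;

INSERT INTO test_pad (val) VALUES ('a');   -- first value, accepted
INSERT INTO test_pad (val) VALUES (' a');  -- leading space is significant, accepted
INSERT INTO test_pad (val) VALUES ('a ');  -- compares equal to 'a', so a duplicate-key error is expected
```

The last statement is the case from the question: the value is rejected by the unique check rather than stored with its spaces removed.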

What you could do is store the value as VARBINARY:

mysql> create table test_ws ( `value` varbinary(255) UNIQUE );
Query OK, 0 rows affected (0.13 sec)

mysql> insert into test_ws (`value`) VALUES ('a');
Query OK, 1 row affected (0.08 sec)

mysql> insert into test_ws (`value`) VALUES ('a ');
Query OK, 1 row affected (0.06 sec)

mysql> SELECT CONCAT( '(', value, ')' ) FROM test_ws;
+---------------------------+
| CONCAT( '(', value, ')' ) |
+---------------------------+
| (a)                       |
| (a )                      |
+---------------------------+
2 rows in set (0.00 sec)

You had better not do anything like sorting alphabetically on this column, because the sort will be done on the byte values instead, and that will not be what most users expect.
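
If you do need a text-style sort now and then, one possible workaround is to convert the binary value back to a character string in the ORDER BY clause. This is only a sketch: it assumes the stored bytes are valid UTF-8, and the choice of utf8mb4_general_ci is mine, not something from the question.

```sql
-- Sort the VARBINARY column as text instead of by raw bytes.
-- Assumes the bytes in `value` are valid UTF-8.
SELECT CONCAT('(', `value`, ')')
FROM test_ws
ORDER BY CONVERT(`value` USING utf8mb4) COLLATE utf8mb4_general_ci;
```

Note that the conversion is done per row at query time, so the unique index on the column is not expected to help with this ordering.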

The alternative is to patch MySQL and write your own collation which is of type NOPAD. Not sure if someone wants to do that, but if you do, let me know ;)
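
For what it's worth, patching may not be needed on newer servers. As far as I know, MySQL 8.0 ships collations with a NO PAD attribute (the utf8mb4_0900_* family), and utf8mb4_0900_bin is available from 8.0.17. A sketch under that assumption, with a made-up table name strings_nopad:

```sql
-- List the collations that do not pad (MySQL 8.0 and later).
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE PAD_ATTRIBUTE = 'NO PAD';

-- Hypothetical variant of the question's table using a NO PAD binary collation
-- (assumes MySQL 8.0.17 or later for utf8mb4_0900_bin).
CREATE TABLE strings_nopad (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  string VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_bin NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY string (string)
) ENGINE=InnoDB;

INSERT INTO strings_nopad (string) VALUES ('apple');
INSERT INTO strings_nopad (string) VALUES ('apple ');  -- should be accepted: trailing space is significant
```

This does not help on MySQL 5.x, which is what the question is tagged with.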

欢心
#3 · 2019-01-12 02:08

You probably need to read about the differences between VARCHAR and CHAR types.

The CHAR and VARCHAR Types

When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.

For VARCHAR columns, trailing spaces in excess of the column length are truncated prior to insertion and a warning is generated, regardless of the SQL mode in use. For CHAR columns, truncation of excess trailing spaces from inserted values is performed silently regardless of the SQL mode.

VARCHAR values are not padded when they are stored. Trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL.

Conclusion: if you want to retain whitespace on the right side of a text string, use the VARCHAR type (and not CHAR).
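
The storage behavior described in the documentation excerpts can be checked with a small experiment. This is a sketch with a made-up table name (pad_demo), assuming the PAD_CHAR_TO_FULL_LENGTH SQL mode is not enabled.

```sql
-- Hypothetical scratch table with one column of each type.
CREATE TABLE pad_demo (v VARCHAR(4), c CHAR(4));

-- Store the same value, with two trailing spaces, in both columns.
INSERT INTO pad_demo (v, c) VALUES ('ab  ', 'ab  ');

-- Parentheses make the trailing spaces visible.
-- The VARCHAR column comes back as (ab  ), the CHAR column as (ab).
SELECT CONCAT('(', v, ')') AS varchar_value,
       CONCAT('(', c, ')') AS char_value
FROM pad_demo;
```

This only shows what is stored and retrieved. It says nothing about the unique check, which compares values and is a separate matter.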

甜甜的少女心
#4 · 2019-01-12 02:10

Thanks to @kennethc. His answer works for me. Add a string length field to the table and to the unique key.

CREATE TABLE strings
( id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
string varchar(255) COLLATE utf8_bin NOT NULL,
created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
string_length int(3),
PRIMARY KEY (id), UNIQUE KEY string (string,string_length) )
ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

In MySQL it's possible to keep the string length field up to date with a couple of triggers like this:

CREATE TRIGGER `string_length_insert` BEFORE INSERT ON `strings` FOR EACH ROW SET NEW.string_length = char_length(NEW.string);
CREATE TRIGGER `string_length_update` BEFORE UPDATE ON `strings` FOR EACH ROW SET NEW.string_length = char_length(NEW.string);
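
A quick usage check, as a sketch against the table and triggers above. CHAR_LENGTH counts trailing spaces for VARCHAR values, so the two strings get different lengths and no longer collide in the composite key.

```sql
INSERT INTO strings (string) VALUES ('apple');   -- trigger sets string_length = 5
INSERT INTO strings (string) VALUES ('apple ');  -- trigger sets string_length = 6, accepted
INSERT INTO strings (string) VALUES ('apple ');  -- same string and length, duplicate-key error expected

-- Parentheses make the trailing space visible.
SELECT CONCAT('(', string, ')') AS shown, string_length FROM strings;
```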
贼婆χ
#5 · 2019-01-12 02:13

This is not about CHAR vs VARCHAR. SQL Server does not consider trailing spaces when comparing strings, and the same comparison is applied when checking a unique key constraint. So it is not that you cannot insert a value with trailing spaces; rather, once you have inserted it, you cannot insert another value that differs only in the number of trailing spaces.

As a solution to your problem, you can add a column that keeps the length of the string, and combine the length and the string value into a composite unique key constraint.

In SQL Server 2012, you can even make the length column a computed column so that you don't have to worry about the value at all. See http://sqlfiddle.com/#!6/32e94 for an example with SQL Server 2012. (I bet something similar is possible in MySQL.)
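
For MySQL, a rough equivalent might be a generated column. This is a sketch only, assuming MySQL 5.7.6 or later (where generated columns are available); the table name strings_gen is made up, and the other columns follow the question's table.

```sql
-- Hypothetical variant of the question's table: the length is computed by
-- the server, so no triggers are needed (assumes MySQL 5.7.6+).
CREATE TABLE strings_gen (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  string VARCHAR(255) COLLATE utf8_bin NOT NULL,
  string_length INT AS (CHAR_LENGTH(string)) STORED,
  created_ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE KEY string (string, string_length)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

INSERT INTO strings_gen (string) VALUES ('apple');
INSERT INTO strings_gen (string) VALUES ('apple ');  -- different length, so accepted
```

The generated column cannot be written to directly, which is the point: the length can never drift out of sync with the string.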
