How to conduct an accent-sensitive search in MySQL

Posted 2019-01-01 11:16

Question:

I have a MySQL table with the utf8_general_ci collation. In the table, I can see two entries:

abad
abád

I am using a query that looks like this:

SELECT * FROM `words` WHERE `word` = 'abád'

The query result gives both words:

abad
abád

Is there a way to indicate that I only want MySQL to find the accented word? I want the query to only return

abád

I have also tried this query:

SELECT * FROM `words` WHERE BINARY `word` = 'abád'

It gives me no results. Thank you for the help.

Answer 1:

If your searches on that field will always be accent-sensitive, declare the collation of the field as utf8_bin (which compares the UTF-8 encoded bytes for equality), or use a language-specific collation that distinguishes between accented and unaccented characters.

col_name varchar(10) collate utf8_bin

If searches are normally accent-insensitive, but you want to make an exception for this search, try:

WHERE col_name = 'abád' collate utf8_bin
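
As a rough illustration of why utf8_bin tells the two words apart, the Python sketch below compares the UTF-8 encoded bytes of both strings. The helper name bin_equal is made up for this example, and it only models the byte comparison; it does not reproduce every detail of MySQL's comparison rules (trailing-space handling, for instance).

```python
def bin_equal(left: str, right: str) -> bool:
    """Model a utf8_bin-style equality check by comparing UTF-8 bytes.

    Hypothetical helper for illustration only; not MySQL's implementation.
    """
    return left.encode("utf-8") == right.encode("utf-8")


accented = "ab\u00e1d"  # "abád", written with an escape to avoid encoding ambiguity
plain = "abad"

print(accented.encode("utf-8"))      # b'ab\xc3\xa1d'
print(plain.encode("utf-8"))         # b'abad'
print(bin_equal(accented, plain))    # False
print(bin_equal(accented, accented)) # True
```

The accented letter occupies two bytes (C3 A1) where the plain letter occupies one (61), so a byte-wise comparison can never treat the two words as equal.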


Answer 2:

In my version (MySQL 5.0), there is no utf8 collation available for case-insensitive, accent-sensitive searches. The only accent-sensitive collation for utf8 is utf8_bin, but it is also case-sensitive.

My workaround has been to use something like this:

SELECT * FROM `words` WHERE LOWER(`word`) = LOWER('aBád') COLLATE utf8_bin
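
The idea behind this workaround, lowercasing both sides and then comparing bytes, can be modelled outside MySQL. This is only a sketch in Python: ci_as_equal is a hypothetical helper, and Python's str.lower() may not match MySQL's LOWER() for every character.

```python
def ci_as_equal(left: str, right: str) -> bool:
    """Case-insensitive, accent-sensitive equality.

    Hypothetical helper: lowercases both strings, then compares their
    UTF-8 bytes, mirroring LOWER(...) = LOWER(...) COLLATE utf8_bin.
    """
    return left.lower().encode("utf-8") == right.lower().encode("utf-8")


print(ci_as_equal("aB\u00e1d", "ab\u00e1d"))  # True: only the case differs
print(ci_as_equal("aB\u00e1d", "abad"))       # False: the accent differs
```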


Answer 3:

The MySQL bug, for future reference, is http://bugs.mysql.com/bug.php?id=19567.



Answer 4:

I was getting the same error.

I changed the collation of my table to utf8_bin (through phpMyAdmin) and the problem was solved.

Hope it helps! :)



Answer 5:

Check whether the table's collation name ends with "_ci". This stands for case-insensitive.

Change it to the same or nearest-named collation without the "_ci" suffix.

For example, change "utf8_general_ci" to "utf8_bin".



Answer 6:

SELECT * FROM `words` WHERE `word` = 'abád' collate latin1_general_cs

(or whichever collation of your character set includes "cs")



Answer 7:

You can try searching for the hex value of the character: use HEX() within MySQL and a similar function within your programming language, then match the two. This worked well for me when I was building a listing where a user could select the first letter of a person's name.
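
A minimal sketch of the application-side half, assuming the column stores UTF-8 and the application language is Python (the answer does not say which language was used). mysql_style_hex is a hypothetical helper name; it produces uppercase hex digits of the UTF-8 bytes, which is the form HEX() returns for a utf8 string.

```python
def mysql_style_hex(text: str) -> str:
    """Return the uppercase hex of the UTF-8 bytes of `text`.

    Hypothetical helper meant to be compared against HEX(column) in MySQL,
    assuming the column's character set is utf8.
    """
    return text.encode("utf-8").hex().upper()


print(mysql_style_hex("ab\u00e1d"))  # 6162C3A164
print(mysql_style_hex("abad"))       # 61626164
print(mysql_style_hex("\u00e1"))     # C3A1
```

The two words yield different hex strings, so matching on the hex value separates the accented entry from the unaccented one.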



Answer 8:

Well, you just described what the utf8_general_ci collation is all about (a, á, à, â, ä, å all compare equal to a).

There have also been changes in MySQL server 5.1 regarding utf8_general_ci and utf8_unicode_ci, so it is server-version dependent too. Better check the docs.

So, if it's MySQL server 5.0, I'd go for utf8_unicode_ci instead of utf8_general_ci, which is obviously wrong for your use case.
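
The folding described at the start of this answer can be approximated in Python by decomposing each character and dropping the combining marks. This is only an approximation for illustration: fold_accents is a hypothetical helper, and MySQL's collations use their own weight tables rather than Unicode normalization.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip accents by decomposing to NFD and dropping combining marks.

    Hypothetical helper; a rough stand-in for accent-insensitive comparison.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


variants = ["a", "\u00e1", "\u00e0", "\u00e2", "\u00e4", "\u00e5"]  # a á à â ä å
print({fold_accents(v) for v in variants})            # {'a'}
print(fold_accents("ab\u00e1d") == fold_accents("abad"))  # True
```

Once the accents are folded away, both words from the question collapse to the same string, which is why an accent-insensitive collation returns both rows.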



Answer 9:

The accepted answer is good, but beware that you may have to use COLLATE utf8mb4_bin instead!

WHERE col_name = 'abád' collate utf8mb4_bin

The above fixes errors like:

MySQL said: 1253 - COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4'



Answer 10:

This works for me for an accent-insensitive and case-insensitive search in MySQL server 5.1, in a database using utf8_general_ci, where column is a LONGBLOB:

select * from words where '%word%' LIKE column collate utf8_unicode_ci

With

select * from words where '%word%' LIKE column collate utf8_general_ci

the result is case-sensitive but not accent-sensitive.



Tags: mysql utf-8