I have an application that is mostly in Czech, which is why we use utf8_czech_ci.
Given this example:
WHERE `firstName` = 'ales' collate utf8_czech_ci
I am unable to find the result aleš (which is a common Czech name). When I try this:
WHERE `firstName` = 'ales' collate utf8_general_ci
It successfully finds aleš. Is the utf8_czech_ci definition in MySQL incorrect? I don't want to just blindly start using general_ci.
Thanks.
The reason is that š and s are two different letters in the Czech alphabet, which is why the row is not found when using the utf8_czech_ci collation.
See also http://collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html and http://collation-charts.org/mysql60/mysql604.utf8_czech_ci.html
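To make the difference concrete outside the database, here is a minimal Python sketch of the two comparison behaviours described above. It is only an approximation, not MySQL's actual collation code: the helper names (`general_key`, `czech_key`, `CZECH_DISTINCT`) are made up for illustration, the set of distinct letters is taken from the linked utf8_czech_ci chart, and details such as the Czech digraph "ch" or ß are ignored.

```python
import unicodedata

# Assumption: per the utf8_czech_ci chart, only these accented letters
# are treated as letters distinct from their base letter.
CZECH_DISTINCT = set("čřšž")


def strip_marks(ch):
    """Return ch with any combining diacritical marks removed."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def general_key(text):
    """Fold case and drop every diacritic (roughly like utf8_general_ci)."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(strip_marks(ch) for ch in text)


def czech_key(text):
    """Fold case, but keep č, ř, š, ž distinct (roughly like utf8_czech_ci)."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(ch if ch in CZECH_DISTINCT else strip_marks(ch) for ch in text)


# 'ales' matches 'aleš' only under the general-style key.
print(general_key("ales") == general_key("aleš"))
print(czech_key("ales") == czech_key("aleš"))
```

If accent-insensitive search is what you want for one particular query, the per-query `collate utf8_general_ci` override from the question does that without changing the column's collation.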
Another exhibit of the ordering for utf8 : utf8_czech_ci
A=a=ª=À=Á=Á=Â=Ã=Ä=Å=à=á=á=â=ã=ä=å=Ā=ā=Ą=ą Aa ae az Æ=æ B=b C=c=Ç=ç cz Č=č
D=d=Ď=ď dz Ð=ð E=e=È=É=É=Ê=Ë=è=é=é=ê=ë=Ē=ē=Ĕ=ĕ=Ė=ė=Ę=ę=Ě=ě F=f fz ƒ
G=g=ğ=Ģ=ģ H=h hz ch I=i=Ì=Í=Í=Î=Ï=ì=í=í=î=ï=Ī=ī=Į=į=İ ij=ij iz ı J=j K=k=Ķ=ķ
L=l=Ļ=ļ lj=LJ=Lj=lj ll lz Ł=ł M=m N=n=Ñ=ñ=Ń=ń=Ņ=ņ=Ň=ň nz
O=o=º=Ò=Ó=Ó=Ô=Õ=Ö=ò=ó=ó=ô=õ=ö oe=Œ=œ oz Ø=ø P=p Q=q R=r Ř=ř S=s=ş sh ss=ß sz
Š=Š=š=š T=t=Ť=ť TM=tm=™ tz U=u=Ù=Ú=Ú=Û=Ü=ù=ú=ú=û=ü=Ū=ū=Ů=ů=Ų=ų ue uz V=v W=w
X=x Y=y=Ý=Ý=ý=ý=ÿ=Ÿ yz Z=z zh zz Ž=Ž=ž=ž Þ=þ µ
Versus utf8 : utf8_general_ci
A=a=À=Á=Á=Â=Ã=Ä=Å=à=á=á=â=ã=ä=å=Ā=ā=Ą=ą Aa ae az B=b C=c=Ç=ç=Č=č ch cz
D=d=Ď=ď dz E=e=È=É=É=Ê=Ë=è=é=é=ê=ë=Ē=ē=Ĕ=ĕ=Ė=ė=Ę=ę=Ě=ě F=f fz G=g=ğ=Ģ=ģ H=h
hz I=i=Ì=Í=Í=Î=Ï=ì=í=í=î=ï=Ī=ī=Į=į=İ=ı ij iz J=j K=k=Ķ=ķ L=l=Ļ=ļ lj ll lz M=m
N=n=Ñ=ñ=Ń=ń=Ņ=ņ=Ň=ň nz O=o=Ò=Ó=Ó=Ô=Õ=Ö=ò=ó=ó=ô=õ=ö oe oz P=p Q=q R=r=Ř=ř
S=s=ß=ş=Š=Š=š=š sh ss sz T=t=Ť=ť TM=tm tz U=u=Ù=Ú=Ú=Û=Ü=ù=ú=ú=û=ü=Ū=ū=Ů=ů=Ų=ų
ue uz V=v W=w X=x Y=y=Ý=Ý=ý=ý=ÿ=Ÿ yz Z=z=Ž=Ž=ž=ž zh zz
Æ=æ Ð=ð × Ø=ø Þ=þ ÷ ij Ł=ł Œ=œ ƒ LJ=Lj=lj
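In these charts, characters joined by `=` compare as equal, while whitespace separates distinct sort positions. As a rough way to read them programmatically, the Python sketch below parses a chart excerpt and checks whether two characters fall in the same group. The function names are hypothetical, and the two excerpts are abbreviated from the charts above (repeated entries dropped).

```python
def equivalence_classes(chart):
    """Map each character or digraph in a chart excerpt to its group index."""
    classes = {}
    for index, group in enumerate(chart.split()):
        for member in group.split("="):
            classes[member] = index
    return classes


def same_class(chart, a, b):
    """Return True if a and b appear in the same '='-joined group."""
    classes = equivalence_classes(chart)
    return a in classes and b in classes and classes[a] == classes[b]


# Abbreviated excerpts of the R..T range from the two charts.
CZECH_EXCERPT = "R=r Ř=ř S=s=ş sh ss=ß sz Š=š T=t=Ť=ť"
GENERAL_EXCERPT = "R=r=Ř=ř S=s=ß=ş=Š=š sh ss sz T=t=Ť=ť"

print(same_class(CZECH_EXCERPT, "s", "š"))    # separate groups in utf8_czech_ci
print(same_class(GENERAL_EXCERPT, "s", "š"))  # one group in utf8_general_ci
```

The same check shows that utf8_czech_ci still treats ť as equal to t, so only some accented letters are kept distinct.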