SQL Server Standard 64 Bit with collation SQL_Latin1_General_CP1_CS_AS
Table plz
:
ort varchar(30) SQL_Latin1_General_CP1_CS_AS
select ort,
from plz
where
ort >= 'zürich'
and ort <= 'zürichz'
Selects this data:
Zürich
Zürich Mülligen
Zürich 80
Without the z
at the end of second zürich no data are selected which is ok. But why does it show data on case sensitive server?
When comparing strings, one of the first things that SQL Server does is to pad the shorter string with spaces so that the strings are of the same length. So basically you're query is trying to find any strings that match
zürich
and then the next character must be one that can appear between(from your first string) and
z
(from your second string) - which includesitself and most alphabetical characters.
This has nothing to do with case-sensitivity.
Without the
z
in your second string, the only strings that will match are those with the valuezürich
and just trailing spaces.There are two confusions happening here:
Equality vs Sorting
When using the equality operator
=
the issue of case-sensitivity is more straight-forward and obvious (e.g. "z" <> "Z"). But when using the>
and<
operators sorting is required and this is separate from, though influenced by, case-sensitive vs case-insensitive. Sorting is determined first by which type of ordering (see next point) and then possibly determined by case-sensitivity.Dictionary order vs Binary (i.e. ASCII value / Code Point) order
Dictionary ordering, the default, means that a particular languages alphabetical ordering will be used. The particular language is part of the collation name, such as
Latin1_General
orDanish
. This will put "a" (ASCII = 97) and "A" (ASCII = 65) together and "Z" (ASCII = 90) after both. A binary collation (i.e. one ending in_BIN
or_BIN2
) will use the binary character value, which will put "Z" between "A" and "a". A binary ordering is necessarily case-sensitive as it is everything-sensitive. A dictionary ordering will group "A" and "a" together, but it will only enforce that "A" comes before "a" if the collation is also case-sensitive. A case-insensitive dictionary ordering can intermix "A" and "a" values (see final SELECT in the example).Now let's tie this all back to the question. The current collation is
SQL_Latin1_General_CP1_CS_AS
which is a Dictionary ordering. And when looking in a dictionary, both Zürich and zürich are going to be found back-to-back; Zürich would not come before zany simply because it has a capital Z.Looking at the results returned by your query, the first entry, Zürich should not be there. I cannot get that to be returned. Since this is a case-sensitive collation, Zürich does come before zürich and would be filtered out by the
WHERE
clause. But the other two values, Zürich Mülligen and Zürich 80 are valid to be returned as they do match theWHERE
clause. If you want them to not match, then you need to use a binary collation instead of the case-sensitive dictionary one.To see this in action, run the following:
Dictionary (Case-InSensitive):
Results:
Dictionary (Case-Sensitive):
Results:
Binary:
Results:
To get a better sense of how case-sensitivity affects filtering and sorting in general, run the following step-by-step example.
See the following links for some SQL Server collation info: