Is it safe to assume that all collations in SQL Server will give the "expected" (i.e. numeric) sort order on strings containing integers of the same length? For example, assuming that @text
contains just non-negative integers ([0-9]+
), would the following snippet work for ensuring that the value does not overflow the int
range, or might there be some collation where @text <= '2147483647'
gives unexpected results?
IF LEN(@text) BETWEEN 1 AND 9
OR LEN(@text) = 10 AND @text <= '2147483647'
-- ...
There won't be cases of variable-length comparisons (e.g. '2' < '11'
), so please do not address that issue.
SQL Server collations do not guarantee anything about the encodings. They are mappings from binary representations of characters to the commonly understood characters.
For this purpose, I think the relevant concept is "code page":
The "ordered" piece is the part of interest for this problem. The ordering determines whether the characters are "naturally" ordered for numbers.
So, the concept of collations does not requires that numbers be naturally ordered.
That said, I am not aware of any collation anywhere where the digits
'0'
-'9'
are not ordered naturally. In the Unicode standard, numbers are ordered naturally. I cannot think of a reason why anyone would create such a collation. So, in practice, I would be very, very surprised if such a collation existed. And, if it did, it would probably not be Unicode-compliant and so would not be available in SQL Server.In practice, no such collation exists or at least no such collation is in wide-spread use. Natural ordering is not guaranteed by the definition of a collation, but is part of the Unicode character sets. You are pretty safe in assuming that all collations have natural ordering of numbers, but it is theoretically possible to create a character set with non-natural ordering of digits.
Like I said in my comment I don't think you are going to find hard documentation that all collations sort numbers the same as that is not a requirement of the collation. But it would be reasonable to assume/expect that. I cannot think of a single use case where I would not expect numbers to sort the same.
Kind of grunge but test them all
The first select just creates the selects
Then copy paste to run the output as selects
There is probably a way to do this in loop I could not get it to use collate to a variable
None failed this simple '54321' < '22222' or N'22222' < N'22221' test for me
Response to your comment
So if you tested all 4 digit combinations you are afraid two digits in position 7 and 8 of 10 might behave differently? Really if you are that concerned then even if you found documentation that all collations must sort numeric the same that does not mean they implemented properly.
What I would do is a random order table of all 5 digits and random order table of all 4 digit (so you have odd and even). Have each collation sort and compare to known good sort.
That would run in 4-8 hours on even a mid range server. This question has a 5 day bounty.
Is it safe to assume that all collations in SQL Server will give the "expected" (i.e. numeric) sort order on strings containing integers of the same length? = yes
...or might there be some collation where @text <= '2147483647' gives unexpected results? =no