Is there a way to get a case-sensitive version of a collation to use in a query?
Let's say that the query could be used on databases with different collations, some which are case-insensitive, and can have different cultures. (multiple clients for example)
However, this query should always behave in a case-sensitive manner, while, if possible, not changing the collation culture and other properties.
For example, if a DB happens to be using SQL_Latin1_General_CP1_CI_AS (CI here stands for Case Insensitive), I would like to use SQL_Latin1_General_CP1_CS_AS (CS for Case Sensitive).
Simplistic query example:
DECLARE @Title nvarchar(2) = 'qQ'
--Case insensitive (following DB collation)
SELECT REPLACE(@Title, 'q', 'o') --Result: 'oo'
--Case sensitive, but fixed to a collation
SELECT REPLACE(@Title COLLATE SQL_Latin1_General_CP1_CS_AS, 'q', 'o') --Result: 'oQ'
Fixing a collation like this in the query could cause problems when migrating the code, or changing the DB collation at a latter date.
Is there a built-in function to get the case-sensitive version of the current collation, or a workaround that could be used for this?
Collations are not necessarily determined by the Database default value: they can be set per string field as well.
No, I have never seen a way (and I have looked) to do dynamic collations outside of using Dynamic SQL to write the COLLATE
clause into a query. Or, if the number of options you need to account for are fairly minimal, you could maybe try something like the following:
SELECT ...
FROM ...
WHERE (@CaseSensitive = 1 AND [Field] LIKE N'%' + @Name + N'%' COLLATE Something_CS_AS)
OR (@CaseSensitive = 0 AND [Field] LIKE N'%' + @Name + N'%')
Also, there is no direct equivalence between Case (or even Accent, Kana, or Width) sensitive and insensitive. While most of the time there is a case-sensitive counterpart to a case-insensitive collation, there are 15 collations that are case-insensitive-only:
;WITH CaseS AS
(
SELECT [name]
FROM sys.fn_helpcollations()
WHERE [name] LIKE N'%[_]cs[_]%'
)
SELECT CaseI.*
FROM sys.fn_helpcollations() CaseI
LEFT JOIN CaseS
ON CaseI.name = REPLACE(CaseS.[name], N'_CS_', N'_CI_')
WHERE CaseI.[name] LIKE N'%[_]ci[_]%'
AND CaseS.[name] IS NULL;
Returns:
name description
SQL_1xCompat_CP850_CI_AS ...
SQL_AltDiction_CP850_CI_AI ...
SQL_AltDiction_Pref_CP850_CI_AS ...
SQL_Danish_Pref_CP1_CI_AS ...
SQL_Icelandic_Pref_CP1_CI_AS ...
SQL_Latin1_General_CP1_CI_AI ...
SQL_Latin1_General_CP1253_CI_AI ...
SQL_Latin1_General_CP437_CI_AI ...
SQL_Latin1_General_CP850_CI_AI ...
SQL_Latin1_General_Pref_CP1_CI_AS ...
SQL_Latin1_General_Pref_CP437_CI_AS ...
SQL_Latin1_General_Pref_CP850_CI_AS ...
SQL_Scandinavian_Pref_CP850_CI_AS ...
SQL_SwedishPhone_Pref_CP1_CI_AS ...
SQL_SwedishStd_Pref_CP1_CI_AS ...
Fixing a collation like this in the query could cause problems when migrating the code,
Why? Where are you planning on migrating the code to? If to another RDBMS, then you already need to contend with datatype differences, SQL dialect differences, "best practices" differences, etc. So why worry about collations? Unless you know for certain that you will be migrating to another RDBMS, you should make your system work as best as it can by using your current platform to the best of its abilities, rather than existing in a less-than-optimal state due to only using lowest-comment-denominator functionality.
or changing the DB collation at a latter date.
Why would you do this? Again, any string fields with an explicit COLLATION setting are not affected by the database default.
If you are looking for strict Case (and everything including Accent, etc) sensitivity on equivalence (we are not talking about range searches or sorting), then you can use a Binary collation (i.e. one ending in either _BIN
or _BIN2
). Just keep in mind that binary collations might not sort the way you might expect since they are not "dictionary" based sorts, at least not in terms of a single binary collation that would behave the same across all languages. They also don't make equivalences between languages (i.e. equating "a" with an "a" that has an accent).
Since the original posting of this answer I have discovered that the paragraph above is actually bad advice. Please do not use a binary collation if the goal is case-sensitivity. It is too strict and in many cases will not give accurate results.
Also, please do not use binary collations ending in just _BIN
as they have been deprecated since SQL Server 2005 was released and should only be used when needing to maintain backwards compatibility with another system also using a _BIN
collation. If you need a binary collation, use one ending in _BIN2
.
UPDATE
I was able to come up with a function to get the case sensitive version, if one exists, of the passed-in collation. This function, however, will only assist in creating the correct Dynamic SQL; it cannot be used inline in a query to set the COLLATE clause dynamically (mainly because that cannot be done). There are two parameters:
@CollationName
-- if you pass this in, you will get back the case-sensitive version of it, if one exists. The @DatabaseName
param will be ignored.
@DatabaseName
-- if you don't know the exact collation, leave @CollationName
as NULL
and pass this in and it will look up the default collation for that database.
- If both params are
NULL
then it will look up the default collation for the database that the function exists in.
- If the passed-in or looked-up collation is already case-sensitive, that name will be returned
- TO DO (when I have time): look up server default collation for databases that do not have a default (they will have
NULL
as their default collation name)
There are two versions of the function: the first is a TVF (as those are faster) and a Scalar UDF (as those are sometimes easier to interact with).
Table-Valued Function:
USE [Test];
SET ANSI_NULLS ON;
IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation') IS NOT NULL)
BEGIN
DROP FUNCTION dbo.GetCaseSensitiveCollation;
END;
GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation
(
@CollationName sysname,
@DatabaseName sysname
)
RETURNS TABLE
--WITH SCHEMABINDING
-- Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation'
-- because it references system object 'sys.fn_helpcollations'.
AS RETURN
WITH collation(name) AS
(
SELECT CONVERT(sysname, COALESCE(@CollationName,
DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
)
SELECT col.[name]
FROM sys.fn_helpcollations() col
CROSS JOIN collation
WHERE col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%'
THEN collation.[name]
ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
END;
GO
Examples:
-- Get CS Collation for the specified Collation
SELECT [name] AS [BySpecificCollation]
FROM dbo.GetCaseSensitiveCollation(N'Indic_General_100_CI_AS_KS_WS', NULL);
-- Get CS Collation based on database default for the specified database
SELECT [name] AS [ByDefaultCollationForDB]
FROM dbo.GetCaseSensitiveCollation(NULL, N'msdb');
-- Get CS Collation based on database default for database that the function exists in
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, NULL);
-- Get CS Collation based on database default for the current database
USE [ReportServer];
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, DB_NAME());
Scalar User-Defined Function:
USE [Test];
SET ANSI_NULLS ON;
IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation2') IS NOT NULL)
BEGIN
DROP FUNCTION dbo.GetCaseSensitiveCollation2;
END;
GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation2
(
@CollationName sysname,
@DatabaseName sysname
)
RETURNS sysname
--WITH SCHEMABINDING
-- Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation2'
-- because it references system object 'sys.fn_helpcollations'.
AS
BEGIN
DECLARE @NewCollationName sysname;
;WITH collation(name) AS
(
SELECT CONVERT(sysname, COALESCE(@CollationName,
DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
)
SELECT @NewCollationName = col.[name]
FROM sys.fn_helpcollations() col
CROSS JOIN collation
WHERE col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%'
THEN collation.[name]
ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
END;
RETURN @NewCollationName;
END;
GO
Examples:
/* Get CS Collation for the specified Collation */
SELECT dbo.GetCaseSensitiveCollation2(N'Indic_General_100_CI_AS_KS_WS', NULL)
AS [BySpecificCollation];
-- Indic_General_100_CS_AS_KS_WS
/* Get CS Collation based on database default for the specified database */
SELECT dbo.GetCaseSensitiveCollation2(NULL, N'msdb') AS [ByDefaultCollationForDB];
-- SQL_Latin1_General_CP1_CS_AS
/* Get CS Collation based on database default for the current database */
USE [ReportServer];
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, DB_NAME()) AS [CurrentDB];
-- Latin1_General_CS_AS_KS_WS
/* Get CS Collation based on database default for database where the function exists */
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, NULL) AS [DBthatFunctionExistsIn];
-- SQL_Latin1_General_CP1_CS_AS