replace only matches the beginning of the string

2019-07-17 05:06发布

问题:

I'm trying to write a function to replace the Romanian diacritic letters (ĂÂÎȘȚ) to their Latin letter equivalents (AAIST, respectively).

SQL Server's replace function deals with Ă, Â, and Î just fine.

It seems to have a weird problem with Ș and Ț, though: they are only replaced if they are found at the beginning of the string.

For example:

select replace(N'Ș', N'Ș', N'-')
-- '-'   # OK

select replace(N'ȘA', N'Ș', N'-')
-- '-A'  # OK

select replace(N'AȘ', N'Ș', N'-')
-- 'AȘ'  # WHAT??

select replace(N'ȘAȘ', N'Ș', N'-')
-- '-AȘ' # WHAT??

I managed to reproduce this behavior on both SQL Sever 2008 R2 and SQL Server 2012.

Is there an explanation for these seemingly weird results? Or could it be just a bug?

My default database collation is SQL_Latin1_General_CP1_CI_AS.

回答1:

It is a collation problem.
At first look, I had to reproduce, because it was unbelivable, but your query had the same problem for me.

If you try with a proper collation it works:

select replace(N'AȘ' COLLATE Latin1_General_BIN, N'Ș', N'-')