How to compare Unicode characters in SQL server?

2019-08-13 03:13发布

问题:

Hi I am trying to find all rows in my database (SQL Server) which have character é in their text by executing the following queries.

SELECT COUNT(*) FROM t_question WHERE patindex(N'%[\xE9]%',question) > 0;

SELECT COUNT(*) FROM t_question WHERE patindex(N'%[\u00E9]%',question) > 0;

But I found two problems: (a) Both of them are returning different number of rows and (b) They are returning rows which do not have the specified character.

Is the way I am constructing the regular expression and comparing the Unicode correct?

EDIT:

The question column is stored using datatype nvarchar. The following query gives the correct result though.

SELECT COUNT(*) FROM t_question WHERE question LIKE N'%é%';

回答1:

Why not use SELECT COUNT(*) FROM t_question WHERE question LIKE N'%é%'?

NB: Likeand patindex do not accept regular expressions.

In the SQL Server pattern syntax [\xE9] means match any single character within the specified set. i.e. match \, x, E or 9. So any of the following strings would match that pattern.

  • "Elephant"
  • "axis"
  • "99.9"


回答2:

This might help, about collations:

http://msdn.microsoft.com/en-us/library/ms143508(SQL.90).aspx

http://ask.sqlservercentral.com/questions/1598/understanding-patindex-query