Querying a SQL Server 2008 table to find values in

Posted 2020-02-09 19:33

I've run into a problem in a project I'm working on: some of the string values in a specific SQL Server 2008 table column contain non-ASCII Unicode characters. For example, instead of a dash, some strings contain an EM DASH (http://www.fileformat.info/info/unicode/char/2014/index.htm).

The column values that contain these characters are causing problems when I send HTTP requests to a third-party server. Is there a way to query which rows contain one or more Unicode characters, so I can at least begin to identify how many rows need to be fixed?

3 Answers
Lonely孤独者°
#2 · 2020-02-09 20:23
SELECT *
FROM your_table
WHERE your_column LIKE N'%[^ -~]%' collate Latin1_General_BIN

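Because of the ^ inside the brackets, this pattern matches strings that contain one or more characters outside the ASCII range 32-126 (space through tilde), which is what the question asks for.

NOT is also possible with LIKE. Negating the predicate returns the complement, that is, the rows made up entirely of characters 32-126:

SELECT *
FROM your_table
WHERE your_column NOT LIKE N'%[^ -~]%' collate Latin1_General_BIN

Keep the binary collation in both forms. Without it, the range from space to tilde is evaluated in the sort order of the column's collation rather than by code point, so the match can include or exclude characters you did not intend.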

唯我独甜
#3 · 2020-02-09 20:27

You want to find all strings that contain one or more characters outside the ASCII range 32-126 (space through tilde).

I think this should do the job.

SELECT *
FROM your_table
WHERE your_column LIKE N'%[^ -~]%' collate Latin1_General_BIN
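
To make it easier to see what the bracket expression selects, here is a minimal Python sketch of the same character class. It is not part of the answer above: the function names and sample rows are hypothetical, and it only mirrors the pattern on the client side, for instance as a check before sending a value over HTTP.

```python
import re

# Python counterpart of the T-SQL class [^ -~]: any single character
# outside the range U+0020 (space) to U+007E (tilde).
OUTSIDE_PRINTABLE_ASCII = re.compile(r"[^ -~]")


def has_non_ascii(value):
    """Return True when value holds at least one character outside 32-126."""
    return OUTSIDE_PRINTABLE_ASCII.search(value) is not None


def offending_chars(value):
    """Return (position, code point) pairs for every character outside 32-126."""
    return [
        (match.start(), "U+%04X" % ord(match.group()))
        for match in OUTSIDE_PRINTABLE_ASCII.finditer(value)
    ]


# Hypothetical sample values; the second one contains an EM DASH.
rows = ["plain - dash", "em \u2014 dash", "tab\there"]
flagged = [row for row in rows if has_non_ascii(row)]
```

Note that control characters such as tab and line feed also fall outside 32-126, so they are flagged along with characters like EM DASH.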
别忘想泡老子
#4 · 2020-02-09 20:30

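One way to do it is to see which rows no longer equal themselves after being converted to a data type that doesn't support Unicode. Note that this only catches characters that are missing from the code page of the collation in use: with a code page 1252 collation, for example, EM DASH has a varchar representation and survives the round trip, so rows containing it are not flagged.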

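-- Sample table with a mix of plain and non-ASCII strings
CREATE TABLE myStrings (
    string nvarchar(max) NOT NULL
)

INSERT INTO myStrings (string)
SELECT N'This is not unicode' UNION ALL
SELECT N'This has ' + nchar(500) + N' unicode' UNION ALL
SELECT N'This also does not have unicode' UNION ALL
SELECT N'This has lots of unicode ' + nchar(600) + nchar(700) + nchar(800) + N'!'

-- Show what each string looks like after the lossy conversion.
-- varchar(max) is explicit because a bare varchar in CAST defaults to 30 characters.
SELECT cast(string AS varchar(max))
FROM myStrings

-- Rows that change in the varchar round trip contain characters
-- outside the code page of the collation in use
SELECT *
FROM myStrings
WHERE cast(cast(string AS varchar(max)) AS nvarchar(max)) <> string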
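
The same round-trip idea can be sketched outside the database. The following Python snippet is only an illustration under stated assumptions: cp1252 stands in for the code page of the column's collation, the function name is hypothetical, and Python's "replace" error handler is not the same conversion SQL Server applies when casting to varchar.

```python
def survives_code_page(value, code_page="cp1252"):
    """Return True when value is unchanged after a round trip through code_page."""
    # errors="replace" substitutes "?" for characters the code page cannot
    # represent, so any such character makes the round trip lossy.
    return value.encode(code_page, errors="replace").decode(code_page) == value


# Hypothetical samples modelled on the rows inserted above, plus an EM DASH case.
samples = [
    "This is not unicode",
    "This has " + chr(500) + " unicode",
    "This has an em dash \u2014 here",
]
flagged = [s for s in samples if not survives_code_page(s)]
```

Only the string containing chr(500) is flagged here. The EM DASH string passes because code page 1252 has a byte for that character, which is why a round-trip check alone may miss the rows described in the question.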