Searching Persian characters and words in SQL serv

2019-07-11 05:27发布

I have a text file that contains Persian words and the file is saved using ANSI encoding. when I try to read the Persian words from the text file, I get some characters like '?'. In order to solve the problem, I wrote a method that change the file encoding to UTF8 and re-write the text file. the method:

    public void Convert2UTF8(string filePath)
    {
        //first, read the text file with "ANSI" endocing
        StreamReader fileStream = new StreamReader(filePath, Encoding.Default);
        string fileContent = fileStream.ReadToEnd();
        fileStream.Close();
        //Now change the file encoding and replace it with the UTF8
        StreamWriter utf8Writer = new StreamWriter(filePath.Replace(".txt", ".txt"), false, Encoding.UTF8);
        utf8Writer.Write(fileContent);
        utf8Writer.Close();
    }

Now the first problem is solved; However, the main issue is in here: I want to insert the words into a table in SQL server database. I do this, but every time that I want to search a Persian word from the database table, the result is null while the record does exist in database table. What`s the solution to find my Persian words that exist in table. The code that I currently use is simply like:

SELECT * FROM [dbo].[WordDirectory] 
WHERE Word = N'کلمه'

'Word' is the field that Persian words are saved in. The type of the field is NVARCHAR. My SQL server version is 2012. Should I change the collation or something?

2条回答
Anthone
2楼-- · 2019-07-11 06:19

Probably you have problem with Persian and Arabic versions of the 'ي' and 'ك' during search. These characters even look the same, have different Unicode numbers:

select NCHAR(1740),  -- Persian ى
       NCHAR(1610),  -- Arabic ي
       NCHAR(1705), -- Persian ك
       NCHAR(1603) -- Arabic ك

more info: http://www.dotnettips.info/post/90

查看更多
男人必须洒脱
3楼-- · 2019-07-11 06:20
DECLARE @Table TABLE(Field NVARCHAR(4000) COLLATE Frisian_100_CI_AI)

INSERT INTO @Table (Field) VALUES
(N'همهٔ افراد بش'),
(N'می‌آیند و حیثیت '),
(N'ميشه آهسته تر صحبت کنيد؟'),
(N'روح'),
(N' رفتار')   

SELECT * FROM @Table
WHERE Field LIKE N'%آهسته%'

The both Queries return the same result

RESULT Set:  ميشه آهسته تر صحبت کنيد؟

You have to make sure that when you are inserting the values you prefix then witn N thats to tell sql server there can be unicode character in the passed string. Same is true when you are searching for them strings in Select statement.

查看更多
登录 后发表回答