I have a text file that contains Persian words and the file is saved using ANSI encoding. when I try to read the Persian words from the text file, I get some characters like '?'. In order to solve the problem, I wrote a method that change the file encoding to UTF8 and re-write the text file. the method:
public void Convert2UTF8(string filePath)
{
//first, read the text file with "ANSI" endocing
StreamReader fileStream = new StreamReader(filePath, Encoding.Default);
string fileContent = fileStream.ReadToEnd();
fileStream.Close();
//Now change the file encoding and replace it with the UTF8
StreamWriter utf8Writer = new StreamWriter(filePath.Replace(".txt", ".txt"), false, Encoding.UTF8);
utf8Writer.Write(fileContent);
utf8Writer.Close();
}
Now the first problem is solved; However, the main issue is in here: I want to insert the words into a table in SQL server database. I do this, but every time that I want to search a Persian word from the database table, the result is null while the record does exist in database table. What`s the solution to find my Persian words that exist in table. The code that I currently use is simply like:
SELECT * FROM [dbo].[WordDirectory]
WHERE Word = N'کلمه'
'Word' is the field that Persian words are saved in. The type of the field is NVARCHAR. My SQL server version is 2012. Should I change the collation or something?
Probably you have problem with Persian and Arabic versions of the 'ي' and 'ك' during search. These characters even look the same, have different Unicode numbers:
more info: http://www.dotnettips.info/post/90
The both Queries return the same result
You have to make sure that when you are inserting the values you prefix then witn
N
thats to tell sql server there can be unicode character in the passed string. Same is true when you are searching for them strings in Select statement.