// Requires: using System.Globalization;
var isChineseTextPresent = false;
foreach (var character in text)
{
    // OtherLetter (Lo) includes Han ideographs, but also kana, Hangul,
    // Arabic, Hebrew and other scripts, so this is only a coarse test.
    if (char.GetUnicodeCategory(character) != UnicodeCategory.OtherLetter)
    {
        continue;
    }
    isChineseTextPresent = true;
    break;
}
According to the information on the Unicode website, you can look up the block for Chinese (or any other script) and then check whether each character falls within that range, like this:
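// Requires: using System.Linq;
public bool IsChinese(string text)
{
    // A char is a 16-bit UTF-16 code unit, so only ranges up to 0xFFFF can be tested this way.
    // 0x4E00-0x9FFF is the CJK Unified Ideographs block.
    return text.Any(c => c >= 0x4E00 && c <= 0x9FFF);
}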
Note that
As a handy reference, the Unicode Consortium provides a search interface to the Unicode Hàn (漢) Database (Unihan).
The database link I'd provided above is showing you the characters
According to Wikipedia (https://en.wikipedia.org/wiki/CJK_Compatibility), there are several character code ranges.
Here is my approach to detecting Chinese characters, based on the link above (the code is in F#, but it can easily be converted):
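A minimal C# sketch of a multi-range check along these lines. The class name, method name and the selection of blocks are assumptions made for illustration (the ranges are those of the named Unicode blocks, not a complete list of every CJK block), and this is not a conversion of the F# code mentioned above:

```csharp
using System;
using System.Linq;

public static class CjkBlocks
{
    // Inclusive code point ranges of a few CJK ideograph blocks
    // (an illustrative selection, not a complete list).
    private static readonly (int Start, int End)[] Ranges =
    {
        (0x3400, 0x4DBF),   // CJK Unified Ideographs Extension A
        (0x4E00, 0x9FFF),   // CJK Unified Ideographs
        (0xF900, 0xFAFF),   // CJK Compatibility Ideographs
        (0x20000, 0x2A6DF), // CJK Unified Ideographs Extension B
        (0x2F800, 0x2FA1F), // CJK Compatibility Ideographs Supplement
    };

    // True when the code point lies inside any of the ranges above.
    public static bool IsCjkCodePoint(int codePoint)
    {
        return Ranges.Any(r => codePoint >= r.Start && codePoint <= r.End);
    }
}
```

For a character taken from a string, the code point can be obtained with char.ConvertToUtf32(text, index) before calling CjkBlocks.IsCjkCodePoint.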
This worked for me:
In Unicode, Chinese, Japanese, and Korean characters are encoded together.
See this FAQ: http://www.unicode.org/faq/han_cjk.html
Chinese characters are distributed across several blocks.
See this wiki page: https://en.wikipedia.org/wiki/CJK_Unified_Ideographs
You will find several CJK character charts on the Unicode website.
For simplicity, you can just check against the minimum and maximum of the Chinese character range:
0x4e00 and 0x2fa1f.
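Because 0x2fa1f lies outside the Basic Multilingual Plane, a check against this range has to compare whole code points rather than individual char values. A minimal sketch, assuming the 0x4e00 to 0x2fa1f bounds given above; the class and method names are hypothetical. Note that the range also spans non-Han blocks that sit between those bounds (Hangul syllables, for example), so it is a coarse filter:

```csharp
using System;

public static class CjkRangeCheck
{
    // Bounds taken from the text above; everything in between is accepted,
    // including non-Han blocks such as Hangul syllables.
    private const int Min = 0x4E00;
    private const int Max = 0x2FA1F;

    public static bool ContainsCjk(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            bool isPair = char.IsSurrogatePair(text, i);

            // Skip unpaired surrogates: they do not encode a code point.
            if (char.IsSurrogate(text[i]) && !isPair)
            {
                continue;
            }

            // Combines a surrogate pair into one code point when present.
            int codePoint = char.ConvertToUtf32(text, i);
            if (isPair)
            {
                i++; // the low surrogate has been consumed
            }

            if (codePoint >= Min && codePoint <= Max)
            {
                return true;
            }
        }
        return false;
    }
}
```

CjkRangeCheck.ContainsCjk("abc 漢字") returns true, while CjkRangeCheck.ContainsCjk("hello") returns false.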
You can use a regular expression to match against the supported named blocks:
Then, you can use:
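A minimal sketch of both steps, assuming the IsCJKUnifiedIdeographs named block (U+4E00 to U+9FFF) is the one of interest. The class and method names are hypothetical, and other CJK blocks from the list of supported named blocks could be added to the pattern if needed:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ChineseCharExtensions
{
    // \p{IsCJKUnifiedIdeographs} matches any char in that named block.
    private static readonly Regex CjkCharRegex =
        new Regex(@"\p{IsCJKUnifiedIdeographs}");

    // Hypothetical helper: true when the single char falls in the block.
    public static bool IsChinese(this char c)
    {
        return CjkCharRegex.IsMatch(c.ToString());
    }

    // True when at least one char of the text falls in the block.
    public static bool ContainsChinese(this string text)
    {
        return text.Any(c => c.IsChinese());
    }
}
```

With these in place, "abc 漢字".ContainsChinese() returns true and "hello".ContainsChinese() returns false. Since the block lies entirely within the Basic Multilingual Plane, testing one char at a time is sufficient here.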
Just check whether the characters' code points fall within the desired range(s). For example, see this question:
What's the complete range for Chinese characters in Unicode?