Hebrew chars in C++ (cout << char << char)

Posted 2019-08-12 08:53

Question:

I'm trying to work with Hebrew characters in C++, using CLion on a Mac.

char notification[140] = {"א"}; // this is ALEF, the first letter of the Hebrew alphabet

for (int i = 0; i < strlen(notification); i++) {
    cout << (int)notification[i] << endl;
} // here I want to see the ASCII code of this letter

The output of this loop is:

-41
-112

Yet only one character was entered.

cout << char(-41) << char(-112) << endl; // prints the letter ALEF

cout << char(-41) << char(-111) << endl; // prints the second letter of the Hebrew alphabet

I can't understand how this works. Why are two chars needed to represent one Hebrew character?

Answer 1:

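You see the UTF-8 bytes of "א": (-41, -112) = (0xD7, 0x90). Because char is signed on your platform, bytes above 0x7F print as negative numbers, and UTF-8 encodes each Hebrew letter as two bytes. Apparently, though, your terminal does not support this charset or UTF-8.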
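To make the byte arithmetic concrete, here is a minimal sketch (not part of the original answer). The helper names byte_value and decode_two_byte are hypothetical, and the decoder deliberately handles only the two-byte form 110xxxxx 10xxxxxx that Hebrew letters use; it is not a general UTF-8 decoder.

```cpp
#include <cassert>

// Reinterpret a (possibly signed) char as its byte value in 0..255.
// Where char is signed, bytes >= 0x80 would otherwise print as negative ints.
unsigned int byte_value(char c) {
    return static_cast<unsigned char>(c);
}

// Decode one two-byte UTF-8 sequence into a Unicode code point.
// Hypothetical helper for illustration: only the two-byte case is handled.
char32_t decode_two_byte(char lead, char cont) {
    const unsigned int b0 = byte_value(lead);
    const unsigned int b1 = byte_value(cont);
    assert((b0 & 0xE0u) == 0xC0u);  // lead byte must look like 110xxxxx
    assert((b1 & 0xC0u) == 0x80u);  // continuation byte must look like 10xxxxxx
    // Five payload bits from the lead byte, six from the continuation byte.
    return static_cast<char32_t>(((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu));
}
```

With the values from the question, byte_value(char(-41)) is 0xD7, byte_value(char(-112)) is 0x90, and decode_two_byte(char(-41), char(-112)) is 0x05D0, the code point of ALEF.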

Look here for the UTF-8 Hebrew characters.

You need to find out how to configure the terminal to support the Hebrew charset and UTF-8.

Maybe this can help.



Answer 2:

There are several sub-problems here.

a)
Your data needs to be in some Unicode encoding rather than in an ASCII-based one-byte-per-character encoding. You already have that (UTF-8); if you did not, no programming-language feature would do the conversion for you automatically.

b)
As you have UTF-8, std::string etc. can handle the data well, depending on what you're doing.
E.g.:

  • input and output from/to files is OK
  • getting the length in bytes is OK
  • (input/output to the terminal depends on the terminal used)
    ...

What is a problem is, e.g.:

  • counting how many characters (not bytes) there are
  • accessing single characters with varname[number]
  • things like Unicode normalization

... for such things, you'll need some more coding and/or external libs like ICU.
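As a rough idea of what "some more coding" can look like for the first item, here is a minimal sketch that counts code points instead of bytes. The function name count_code_points is hypothetical, and the sketch assumes the input is valid UTF-8. It counts code points, not user-perceived characters, so combining marks such as Hebrew vowel points are counted separately; for that level of correctness a library like ICU is the safer route.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in a UTF-8 string.
// Every code point starts with exactly one byte that is NOT a continuation
// byte (10xxxxxx), so counting those bytes counts the code points.
// Assumption: the input is valid UTF-8; no validation is done here.
std::size_t count_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0u) != 0x80u) {
            ++count;
        }
    }
    return count;
}
```

For the string from the question, s.size() reports 2 (bytes) while count_code_points(s) reports 1.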

c)
Your terminal needs to support UTF-8 if you want to print such strings directly to it (or read input from the user). This depends entirely on the OS used and its configuration; the C++ part can't help here. See e.g. OS X Terminal UTF-8 issues



Tags: c++ char hebrew