Could someone tell me how I can write an Arabic string in C++? I am using Code::Blocks.
I searched all over the internet for a way to put an Arabic string in C++ code, but I didn't find anything.
There are multiple types of string literals available for use in C++. While the most common type, the narrow multibyte string literal, looks like this:
auto myliteral = "";
there are others.
Specifically, the one you might be looking for is the UTF-8 string literal, written like this:
auto myutf8literal = u8"عربي";
All variations on the basic string literal are prefixed with some combination of characters:
// example          description                       storage
L"";             // wide string literal               wchar_t[]
u8"";            // UTF-8 encoded string literal      char[] (char8_t[] since C++20)
u"";             // UTF-16 encoded string literal     char16_t[]
U"";             // UTF-32 encoded string literal     char32_t[]
R"delim()delim"; // raw string literal: a modifier indicating that the string has no escapes;
                 // can be combined with any of the above string literal types
and most of them can hold a far greater variety of characters than the basic string literal. u8, u and U hold the Unicode character set, which is a rather massive set of characters.
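As a rough illustration of how the prefixes differ, the sketch below counts the code units that the same four-letter word takes up in each encoding. The helper name codeUnits is made up for this example, and the letters are written as universal character names (\u0639 and so on) so that the result does not depend on how the source file happens to be saved.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t codeUnits(const CharT (&)[N]) {
    return N - 1;
}

// The word "عربي" (ain, reh, beh, yeh) spelled with universal character names.
constexpr std::size_t utf8Units  = codeUnits(u8"\u0639\u0631\u0628\u064A");
constexpr std::size_t utf16Units = codeUnits(u"\u0639\u0631\u0628\u064A");
constexpr std::size_t utf32Units = codeUnits(U"\u0639\u0631\u0628\u064A");

// Each of these Arabic letters needs two bytes in UTF-8,
// but only one code unit in UTF-16 and UTF-32.
static_assert(utf8Units == 8, "four letters, two UTF-8 bytes each");
static_assert(utf16Units == 4, "one UTF-16 code unit per letter");
static_assert(utf32Units == 4, "one UTF-32 code unit per letter");
```

The practical consequence is that the length of a u8 string in bytes is not the number of letters in it.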
Something you should be aware of is that displaying this string in environments with minimal display capabilities, rather than simply writing it to a file, is going to introduce additional complications. As an example: the Windows console is heavily limited in which characters it can display, and uses something called a code page to decide how to display certain characters. It does have a UTF-8 code page (65001), though there appear to be some issues with it.
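A minimal sketch of switching the Windows console to the UTF-8 code page before writing UTF-8 bytes might look like the following. The helper names useUtf8Console and printArabicSample are invented for this example. It assumes the console font actually contains Arabic glyphs, and it does nothing about letter joining or text direction, so the result may still look wrong.

```cpp
#include <cstddef>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: ask the console to interpret output as UTF-8.
// On non-Windows systems this does nothing and reports success, on the
// assumption that the terminal already expects UTF-8.
bool useUtf8Console() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;  // CP_UTF8 is code page 65001
#else
    return true;
#endif
}

// Writes the UTF-8 bytes of the word "عربي" followed by a newline
// and returns the number of bytes written.
std::size_t printArabicSample() {
    static const char sample[] = "\xD8\xB9\xD8\xB1\xD8\xA8\xD9\x8A\n";
    return std::fwrite(sample, 1, sizeof(sample) - 1, stdout);
}
```

Calling useUtf8Console() once at the start of main and then printArabicSample() is the intended usage; the bytes are spelled out as hex escapes so the example does not depend on the source file's encoding.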
For details on dealing with Arabic console output on Linux, look at @cup's answer.
The first thing you have to consider is where you are printing the text. If you are printing it on the console, in Linux, use acon. On Windows, change to codepage 1256 and change the font to Lucida Console.
Arabic is a cursive script and is written right to left. When you input the text as an assignment, the text you write is from left to right. So if you have
const wchar_t* monday = L"الإثنين";
It is made up of the individual letters ال إ ث ن ي ن (the equivalent of writing "yadnom" in English). If you try to display it, you may get ال إ ث ن ي ن or ن ي ن ث إ ل ا, depending on how the system does the printing. Some systems understand right to left, some don't. This can be quite confusing, as you don't really want to input "yadnom si yadot" instead of "today is monday". If you put in "today is monday" and get back "yadnom si yadot", then you will need to reverse the string internally before printing it out.
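The reversal step can be sketched as below. This is only a naive illustration: the function name reverseCodePoints is made up, the input is assumed to be UTF-32 so that every element is a whole code point, and it ignores combining marks, digits and mixed-direction text. Proper reordering follows the Unicode Bidirectional Algorithm, which libraries such as ICU or FriBidi implement.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: reverse a string one code point at a time.
// Taking the argument by value gives us a copy we can reverse in place.
std::u32string reverseCodePoints(std::u32string text) {
    std::reverse(text.begin(), text.end());
    return text;
}
```

For example, assert(reverseCodePoints(U"today is monday") == U"yadnom si yadot") should hold. Reversing a UTF-8 std::string byte by byte would not work, because it would scramble the bytes inside each multi-byte letter.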
The next problem is that if you put in ال إ ث ن ي ن, you may not end up with الإثنين because the system does not know about joined-up writing. If you look further down in the Unicode character set (the Arabic Presentation Forms blocks), you will find that each letter has up to four forms: isolated, initial, medial and final.
You need to pick the right form of each letter, depending on what it has on either side. Once you do that, you will get الإثنين. Check that the alef is on the right, not the left.
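Picking the form from the neighbours can be sketched roughly as follows. The table, the struct and the function name shapeNaively are all invented for this example, and the table covers only five letters. It also ignores ligatures (such as lam followed by alef) and diacritics, so treat it as a demonstration of the idea, not as a usable shaper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// One row per letter: the base code point plus its forms in the
// Arabic Presentation Forms-B block. Letters that never connect to the
// following letter (such as alef) have no initial or medial form (0 here).
struct LetterForms {
    char32_t base, isolated, final_, initial, medial;
};

// Deliberately tiny table, for illustration only.
const LetterForms kForms[] = {
    {U'\u0627', U'\uFE8D', U'\uFE8E', 0, 0},                  // alef
    {U'\u0628', U'\uFE8F', U'\uFE90', U'\uFE91', U'\uFE92'},  // beh
    {U'\u062A', U'\uFE95', U'\uFE96', U'\uFE97', U'\uFE98'},  // teh
    {U'\u0646', U'\uFEE5', U'\uFEE6', U'\uFEE7', U'\uFEE8'},  // noon
    {U'\u064A', U'\uFEF1', U'\uFEF2', U'\uFEF3', U'\uFEF4'},  // yeh
};

const LetterForms* findForms(char32_t c) {
    for (const LetterForms& f : kForms)
        if (f.base == c) return &f;
    return nullptr;
}

// Replaces each known letter with the form its neighbours call for.
// Characters that are not in the table are copied through unchanged.
std::u32string shapeNaively(const std::u32string& text) {
    std::u32string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const LetterForms* cur = findForms(text[i]);
        if (!cur) { out += text[i]; continue; }
        const LetterForms* prev = i > 0 ? findForms(text[i - 1]) : nullptr;
        const LetterForms* next = i + 1 < text.size() ? findForms(text[i + 1]) : nullptr;
        // Joined to the previous letter only if that letter connects forward.
        const bool joinPrev = prev && prev->initial != 0;
        // Joined to the next letter only if this letter connects forward.
        const bool joinNext = next && cur->initial != 0;
        char32_t shaped = cur->isolated;
        if (joinPrev && joinNext) shaped = cur->medial;
        else if (joinPrev)        shaped = cur->final_;
        else if (joinNext)        shaped = cur->initial;
        out += shaped;
    }
    return out;
}
```

For instance, assert(shapeNaively(U"\u0628\u0627\u0628") == U"\uFE91\uFE8E\uFE8F") should hold: the first beh takes its initial form, the alef its final form, and the last beh stays isolated because alef does not connect to the letter after it.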
The next problem is where you print it. If you're using a left to right system that does not know anything about right to left scripts, the text has to be measured and positioned correctly before printing.
If you plan to do any justification, remember that Arabic does not increase the space between the words: it increases the length of the words with a special character called the kashida.
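As a small sketch of that idea, the function below lengthens a word by inserting kashida characters (U+0640, ARABIC TATWEEL) instead of widening the spaces. The name stretchWord is hypothetical, and the caller is assumed to have already chosen a position between two letters that are actually joined; choosing good positions is a typographic question this example does not try to answer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr char32_t kTatweel = U'\u0640';  // ARABIC TATWEEL (kashida)

// Hypothetical helper: returns a copy of `word` with `count` kashidas
// inserted before the letter at index `pos`.
std::u32string stretchWord(const std::u32string& word, std::size_t pos, std::size_t count) {
    assert(pos <= word.size());  // inserting past the end would throw
    std::u32string out = word;
    out.insert(pos, count, kTatweel);
    return out;
}
```

For example, stretchWord(U"\u0628\u0646\u062A", 1, 3) puts three kashidas between the first and second letters, making the word six code points long.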
It is related to character encoding. Most implementations use UTF-8. See http://utf8everywhere.org/
Some software libraries (e.g. both GTK and Qt) are able to display UTF-8 strings that mix Arabic and Latin sentences (i.e. text that changes direction).
Your console has to be in Unicode. I may have some of this wrong because I don't have Code::Blocks in front of me.
You can use wprintf: http://www.cplusplus.com/reference/cwchar/wprintf/
wprintf(L"Teh Isolated Form: %lc ", L'ﺕ');
Alternatively, you may have to use the character's numeric code point (65173, i.e. U+FE95):
wprintf(L"Teh Isolated Form: %lc ", 65173);
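One detail that often matters with wprintf is the locale: in the default "C" locale, wide characters outside ASCII usually cannot be converted for output. The sketch below, with the made-up helper name printTehIsolated, selects the user's locale first. Whether the letter then shows up correctly still depends on the console and its font.

```cpp
#include <clocale>
#include <cwchar>

// TEH ISOLATED FORM from the Arabic Presentation Forms-B block.
constexpr wchar_t kTehIsolated = L'\uFE95';
static_assert(kTehIsolated == 65173, "U+FE95 is 65173 in decimal");

// Hypothetical helper: switch to the user's locale, then print the letter.
// Returns what wprintf returns: a negative value if the output failed,
// for instance because the active locale cannot encode the character.
int printTehIsolated() {
    std::setlocale(LC_ALL, "");
    return std::wprintf(L"Teh Isolated Form: %lc\n",
                        static_cast<std::wint_t>(kTehIsolated));
}
```

Checking the return value is a cheap way to notice that the locale could not represent the character.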