Getting a Dev-C++ built program to output UNICODE

Posted 2019-06-24 13:30

If you can answer any of my questions, that would be awesome.

Here's the scoop: I'm teaching an intro to programming class in Thailand to 11th graders. It's been going great so far; their level of English is high enough that I can teach in English and have them write programs in English, and everything is fine and dandy.

However, as speakers of a language with non-Latin characters, I feel that they should at least learn what UNICODE is. I won't test them on it or bog them down with implementation details, but I want to show them an example of a UNICODE program that can do I/O with Thai characters.

I'm operating under the following constraints, none of which can be changed (at least for this semester):

  • The program must run on Windows 7
  • The program must be in C (not C++)
  • We must use Dev-C++ (v. 4.9.9.3) as our IDE (I'm going to try and convince the admins to change for next semester, but they may not want to)
  • The program should output to the Command Line (I'd like it to "look like" the programs we've been writing so far)
  • I want it to be easy to set up and run, though I'm not opposed to including a Batch file to do some setup work for the kids.

Here's how far I've gotten, and the questions I have:

  • In Control Panel > Regions > Administrative > Language for non-UNICODE programs is set to Thai.
  • I used "chcp 874" to set the Thai codepage in the Command Line, but characters typed from the keyboard appear as garbage characters. Is this maybe because the keyboard mappings are wrong, or do I have to change something else?
  • I wrote a program with the line: printf("\u0E01\n"); which prints ก, the first letter in the Thai alphabet. Is that the right syntax?
  • I received a compiler warning that "Universal Characters are only supported in C++ and C99." Does Dev-C++ not compile to C99? Is there a way I could get a C99 compiler for it?
  • I ran the code and got garbage characters. I imagine this could be because of the compiler, the command line, or any number of other things.

I'd love to end this course with a program that outputs สวัสดีโลก, the Thai equivalent of "Hello World!" I've done tons of googling, but every answer I've found either doesn't work in this specific case or involved a different IDE.

3 Answers
Bombasti
Post #2 · 2019-06-24 14:11

If you need to change the code page in a console C program, you can add the header <stdlib.h> and the line system("CHCP 874"); at the beginning of the program.
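
To make the code page idea concrete, here is a minimal sketch, not a tested Dev-C++ recipe. It assumes code page 874 keeps the TIS-620 layout for Thai letters (byte = code point - 0x0E00 + 0xA0) and that the console font can draw Thai glyphs. The names thai_to_cp874 and print_thai_cp874 are made up for this illustration, and the \u escapes need a compiler running in C99 mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Convert a wide string to code page 874 bytes (TIS-620 layout).
   ASCII passes through unchanged; Thai U+0E01..U+0E3A and
   U+0E3F..U+0E5B become bytes 0xA1..0xDA and 0xDF..0xFB;
   anything else is replaced with '?'.
   Returns the number of bytes written, not counting the final NUL. */
size_t thai_to_cp874(const wchar_t *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;

    while (*src != L'\0' && n + 1 < dst_size) {
        unsigned long c = (unsigned long)*src++;

        if (c < 0x80)
            dst[n++] = (char)c;
        else if ((c >= 0x0E01 && c <= 0x0E3A) || (c >= 0x0E3F && c <= 0x0E5B))
            dst[n++] = (char)(unsigned char)(c - 0x0E00 + 0xA0);
        else
            dst[n++] = '?';
    }
    dst[n] = '\0';
    return n;
}

/* Windows-only part: switch the console to code page 874, then print
   the converted bytes with plain printf. "> nul" hides chcp's message. */
void print_thai_cp874(const wchar_t *text)
{
    char buf[128];

    system("chcp 874 > nul");
    thai_to_cp874(text, buf, sizeof buf);
    printf("%s\n", buf);
}
```

Calling print_thai_cp874(L"\u0E01") from main should then send the single byte 0xA1 to the console, which code page 874 displays as ก if the font supports it.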

If you need a free C99-conforming compiler for Windows, you can try Pelles C:

http://www.christian-heffner.de/index.php?page=download&lang=en

It fully conforms to C99.

You have to use wide-string constants, that have the following syntax:

L"Wide string\n"

Instead of printf(), you need to use wprintf() and the like.

http://pubs.opengroup.org/onlinepubs/7908799/xsh/wchar.h.html

家丑人穷心不美
Post #3 · 2019-06-24 14:23

I have never used the Dev-C++ IDE :-) However, after reading up on it a bit, I see that Dev-C++ version 4.9.9.3 uses a MinGW port of GCC from the 3.4 series, which lists universal character support with a status of "Done"; see http://gcc.gnu.org/gcc-3.4/c99status.html for details. You have to change the IDE configuration so that the compiler uses -std=c99 as part of the compiler flags.
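
One way to check whether the flag actually took effect is to look at the standard __STDC_VERSION__ macro, which a C99 compiler defines as 199901L or higher. A small sketch (the function names c_standard_version and compiled_as_c99 are only illustrative):

```c
#include <stdio.h>

/* Return the value of __STDC_VERSION__, or 0 when the compiler does not
   define it (as is typical in C89/C90 mode). */
long c_standard_version(void)
{
#if defined(__STDC_VERSION__)
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Non-zero when the file was compiled in C99 mode or newer. */
int compiled_as_c99(void)
{
    return c_standard_version() >= 199901L;
}

/* Print a one-line report that students can read in the console. */
void report_c_standard(void)
{
    printf("__STDC_VERSION__ = %ld, C99 or newer: %s\n",
           c_standard_version(), compiled_as_c99() ? "yes" : "no");
}
```

If report_c_standard() says "no" after the flag was added, the IDE is probably not passing -std=c99 to the compiler for that project.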

Hopefully that will do the trick.

I will try to fiddle with it on my own system and see how far we can get. Will update the answer if I find more clues :-)

Explosion°爆炸
Post #4 · 2019-06-24 14:35

Ok, here's my bit of help. I don't use Dev-C++ as my IDE, so I can't help you with IDE specific things, but the following is standard to most C/C++ compilers:

wprintf is the printf implementation for wide characters (unicode).

When using wide characters you will use wchar_t in place of char for defining strings.

So you might do something like this:

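#include <stdlib.h>  /* system() */
#include <wchar.h>   /* wchar_t, wprintf() */

int main(int argc, char** argv) {
    /* The L prefix makes this a wide (wchar_t) string literal. */
    const wchar_t* str = L"สวัสดีโลก";
    /* %ls is the standard conversion for a wide string in wprintf. */
    wprintf(L"%ls", str);
    system("pause");  /* Windows-only: keeps the console window open. */
    return 0;
}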

wprintf is most likely what you're looking for.

Other functions for printing and manipulating wide strings can be found by researching the wchar.h header file.

Reference: wprintf - C++ Reference

Using L before the quotations means you intend to define a wide string. (unicode)
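
If the source file's encoding is a worry, the same greeting can be spelled with \u escapes, so the .c file stays plain ASCII. This is only a sketch under a few assumptions: the compiler is in C99 mode, and the C runtime can convert wide characters for the console once setlocale has been called, which I have not verified for Dev-C++. The names thai_hello and print_thai_hello are made up for the example.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* "สวัสดีโลก" written as nine universal character names, one per
   Thai code point, so the literal does not depend on file encoding. */
const wchar_t *thai_hello(void)
{
    return L"\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35\u0E42\u0E25\u0E01";
}

/* Pick up the user's locale, then print the wide string with %ls.
   Whether the glyphs display correctly still depends on the C runtime,
   the console code page, and the console font. */
void print_thai_hello(void)
{
    setlocale(LC_ALL, "");
    wprintf(L"%ls\n", thai_hello());
}
```

Avoid mixing printf and wprintf on stdout in the same program; some C libraries fix a stream as narrow or wide on first use.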

Hope that helps,

-Dave
