How can I print non English characters taken from

Posted 2019-07-31 05:09

Question:

I am trying to write a simple C++ program.

Goal: open an existing text file, read each person's name and surname into the strings name and surname, print them, and move on to the next line. Repeat until the end of the file.

I have two problems.

I am using Windows 8.1 and Visual Studio 2017 with the latest update.

The main code is below:

#include <stdio.h>
#include <stdlib.h>
#include <string>
#include "stdafx.h"
#include <iostream>
using namespace std;

int main() {
    FILE *fPtr;

    if ((fPtr = fopen("newStudentsList.txt", "r")) == NULL) {
        cout << "File could not be opened.\n";
        system("pause");
    }
    else {
        char *name = new char[100];
        char *surname = new char[100];

        rewind(fPtr);

        while (!feof(fPtr)) {
            fscanf(fPtr, "%s\t%s\n", name, surname);
            cout << name << " " << surname << endl;
        }

        system("pause");
    }
    return 0;
}

In the output, I cannot see Turkish characters properly. This is my first problem.

My second problem is that I cannot read names and surnames properly, because in the text file they are not separated by a consistent number of tabs or blanks, and some people have one given name while others have two.


All the files are here


How can I print non-English characters?

How can I read names and surnames properly?

Answer 1:

First of all, don't use C functions in C++ programs. C++ has different features, different abstractions and different libraries. Using C constructs prevents you from using them.

C++ uses streams to read/write to files, memory and string buffers, over the network etc. It has a large number of algorithms that expect a stream and/or iterator as input.

It also has standard string types for single-byte (std::string), wide-character (std::wstring), UTF-16 (std::u16string) and UTF-32 (std::u32string) text. You can specify string literals of each type in your code. It even has a form of type inference with the auto keyword.
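As a rough illustration of these string types, the sketch below stores the same character, U+011F (the Turkish letter g with breve), in three of them and reports how many code units each one needs. The names CodeUnitCounts and count_code_units are made up for this example, and the UTF-8 bytes are written as explicit escapes so the result does not depend on how the source file is encoded.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this illustration only.
struct CodeUnitCounts {
    std::size_t narrow;  // bytes in a std::string holding UTF-8
    std::size_t utf16;   // char16_t units in a std::u16string
    std::size_t utf32;   // char32_t units in a std::u32string
};

// Hypothetical helper: stores U+011F in each string type and counts code units.
CodeUnitCounts count_code_units() {
    std::string narrow = "\xC4\x9F";               // the two UTF-8 bytes of U+011F
    auto utf16 = std::u16string(u"\u011F");        // u"" literal, type deduced by auto
    auto utf32 = std::u32string(U"\u011F");        // U"" literal, type deduced by auto
    return {narrow.size(), utf16.size(), utf32.size()};
}
```

Under these assumptions the narrow string holds two bytes, while the UTF-16 and UTF-32 strings each hold a single code unit, which is why size() on a std::string counts bytes rather than characters.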

C++17 and earlier have no dedicated type for UTF-8 (C++20 adds char8_t and std::u8string). Programmers should treat UTF-8 strings and files as single-byte data and use char and std::string to store them. These values should be converted to other code pages or Unicode types as needed.

This means that you shouldn't need anything more than this to display the contents of a UTF-8 file on the console. The code is taken from the Input/Output with files tutorial:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("newStudentsList.txt");
  if (myfile.is_open())
  {
    while ( getline (myfile,line) )
    {
      cout << line << '\n';
    }
    myfile.close();
  }

  else cout << "Unable to open file"; 

  return 0;
}
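For the second problem in the question (names separated by an uneven mix of tabs and blanks, with one or two given names), each line read by getline could be split into words with a string stream. This is only a sketch: split_name is a hypothetical helper, and it assumes that the last word on a line is the surname and everything before it is the given name or names.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits one line into (given names, surname).
// Assumption: the surname is the last whitespace-separated word on the line.
std::pair<std::string, std::string> split_name(const std::string& line) {
    std::istringstream words(line);
    std::vector<std::string> tokens;
    std::string word;
    while (words >> word) {          // operator>> skips tabs and repeated blanks
        tokens.push_back(word);
    }

    if (tokens.empty()) {
        return {std::string(), std::string()};   // blank line: nothing to report
    }

    std::string surname = tokens.back();
    tokens.pop_back();

    // Join the remaining words (one or more given names) with single spaces.
    std::string names;
    for (const std::string& t : tokens) {
        if (!names.empty()) names += ' ';
        names += t;
    }
    return {names, surname};
}
```

Inside the getline loop, calling split_name(line) and printing the first and second members of the result would replace the fixed "%s\t%s\n" format. Because the bytes are never reinterpreted, UTF-8 text passes through unchanged.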

By default, the console uses the code page of your system locale. You can switch it to the UTF-8 code page by typing the following before running your application:

chcp 65001

UTF-8 strings should then display correctly, assuming the console font includes the required characters.
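The console code page can also be switched from inside the program with the Win32 function SetConsoleOutputCP, instead of typing chcp beforehand. The wrapper below is a minimal sketch: use_utf8_console_output is a hypothetical name, and the non-Windows branch exists only so that the example compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: asks the Windows console to interpret program output
// as UTF-8. On other platforms it does nothing and reports success.
bool use_utf8_console_output() {
#ifdef _WIN32
    // CP_UTF8 is code page 65001, the same value passed to chcp.
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}
```

Calling use_utf8_console_output() once at the start of main, before any output, would be the natural place for it. The console font still has to contain the characters being printed.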

UPDATE

One can specify UTF-8 literals, but the storage is still char (until C++20, where u8 literals have type const char8_t[]), e.g.:

const char* str1 = u8"Hello World";  
const char* str2 = u8"\U0001F607 is O:-)";  
const char*     s3 = u8"