I need to convert an escaped Unicode (UTF-8) string to the text it represents. I am reading a text file line by line, and a line may contain an escaped sequence like this:
\xE6\xAC\xA2\xE8\xBF\x8E
This is Chinese text, equivalent to:
欢迎
Now I need to remove this line (\xE6\xAC\xA2\xE8\xBF\x8E) from the text file, convert the escaped sequence to Chinese text, and append that Chinese text to the text file.
Below is the content of my data.txt file:
testing
programming
\xE6\xAC\xA2\xE8\xBF\x8E
development
I would like to get the file content as:
testing
programming
development
欢迎
Below is what I have done so far:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 256
int main()
{
    int ctr = 0;
    char ch;
    FILE *fptr1, *fptr2;
    char fname[MAX] = "data.txt";
    char str[MAX], temp[] = "temp.txt";
    char str2[256];
    fptr1 = fopen(fname, "r");
    if (!fptr1)
    {
        printf(" File not found or unable to open the input file!!\n");
        return 0;
    }
    fptr2 = fopen(temp, "w"); // open the temporary file in write mode
    if (!fptr2)
    {
        printf("Unable to open a temporary file to write!!\n");
        fclose(fptr1);
        return 0;
    }
    // copy all contents to the temporary file except the specific line with unicode characters
    while (!feof(fptr1))
    {
        strcpy(str, "\0");
        fgets(str, MAX, fptr1);
        if (!feof(fptr1))
        {
            ctr++;
            if(strstr(str,"\\")!=NULL)
            {
                memset(str2,'\0',sizeof(str2));
                printf("Input String Contains Unicode Character\n");
                str[strlen(str)-1]='\0';
                sprintf(str2,"echo %s >> data.txt",str);
                printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
                system(str2);
            }
            else
            {
                fprintf(fptr2, "%s", str);
            }
        }
    }
    fclose(fptr1);
    fclose(fptr2);
    remove(fname); // remove the original file
    rename(temp, fname); // rename the temporary file to original name
    /*------ Read the file ----------------*/
    fptr1=fopen(fname,"r");
    ch=fgetc(fptr1);
    printf(" Now the content of the file %s is : \n",fname);
    while(ch!=EOF)
    {
        printf("%c",ch);
        ch=fgetc(fptr1);
    }
    fclose(fptr1);
    /*------- End of reading ---------------*/
    return 0;
}
When I compile and run this code, I see the following output:
Input String Contains Unicode Character
Final String: echo \xE6\xAC\xA2\xE8\xBF\x8E >> data.txt
Unicode String Size: 24
Now the content of the file data.txt is :
testing
programming
development
xE6xACxA2xE8xBFx8E
When I change the first line below to the second, the same code works as expected:
sprintf(str2,"echo %s >> data.txt",str);
sprintf(str2,"echo %s >> data.txt","\xE6\xAC\xA2\xE8\xBF\x8E");
But when the value is read from the file, it does not work.
Also, with this line, the string is identified as a Unicode string with the correct size:
printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
The String Size: 6
Can someone please let me know how to convert the value to Chinese text when it is read from a text file?
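
For reference, a likely explanation of the difference: in a string literal the compiler turns each \xHH escape into a single byte at compile time, whereas the line read from the file still holds the four separate characters '\', 'x', 'H', 'H' (hence the length of 24). Passed unquoted to echo, the shell then drops the backslashes, which matches the xE6xACxA2xE8xBFx8E output. Below is a minimal sketch of decoding the escapes at run time instead. The names hex_value and decode_hex_escapes are hypothetical helpers, not part of the program above, and the sketch assumes every escape has exactly two hex digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: numeric value (0..15) of one hex digit.
 * The caller is expected to have checked isxdigit() first. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

/* Hypothetical helper: copy `in` to `out`, replacing every textual
 * "\xHH" sequence with the single byte it names. Anything else is
 * copied unchanged. Output is truncated to fit `out_size` and is
 * always NUL-terminated. Returns the number of bytes written, not
 * counting the terminator. */
static size_t decode_hex_escapes(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;

    while (*in != '\0' && n + 1 < out_size) {
        if (in[0] == '\\' && in[1] == 'x' &&
            isxdigit((unsigned char)in[2]) &&
            isxdigit((unsigned char)in[3])) {
            /* Combine the two hex digits into one byte. */
            out[n++] = (char)(hex_value(in[2]) * 16 + hex_value(in[3]));
            in += 4;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

With something like this, the decoded buffer could be written with fputs to the file after the rename (opened in "a" mode) rather than going through system("echo ..."), which would also avoid the shell interpreting the line.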