I am trying to convert a Shift_JIS encoded file to UTF-8. My approach is:
- Read the Shift_JIS file
- Call getBytes on each line and convert it to UTF-8
- Create a new file and write the UTF-8 converted value to it
The problem is that the conversion in step 2 does not happen. I am using the code below to convert Shift_JIS to UTF-8:
InputStream inputStream = getContentResolver().openInputStream(uri);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
byte[] b = line.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
Please let me know if any other information is required.
I have two questions:
1. Is there a better way (set of steps) to do this conversion?
2. Why does the code snippet above not perform the conversion?
Thanks in advance!
The answer @VicJordan posted is not correct. When you call getBytes(), you are getting the raw bytes of the string encoded under your system's native character encoding (which may or may not be UTF-8). Then, you are treating those bytes as if they were encoded in UTF-8, which they might not be.
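To make the point concrete, here is a minimal sketch (not taken from either answer) that uses explicit charsets instead of the platform default so the behaviour is the same on every machine. The class and method names are made up for illustration, and it assumes the runtime ships the Shift_JIS charset, which is not one of the charsets Java guarantees on every platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReencodeDemo {
    // Assumption: the running JVM provides Shift_JIS (it is an optional charset).
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Mistaken pattern: encode the String with one charset,
    // then decode those same bytes as if they were UTF-8.
    static String wrongRoundTrip(String text) {
        byte[] sjisBytes = text.getBytes(SJIS);
        return new String(sjisBytes, StandardCharsets.UTF_8);
    }

    // Correct pattern: decode the raw bytes with the charset they were written in,
    // then encode the resulting String as UTF-8.
    static byte[] sjisToUtf8(byte[] sjisBytes) {
        String text = new String(sjisBytes, SJIS);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // the three characters of "Japanese (language)"

        // The Shift_JIS bytes are not valid UTF-8, so the text comes back damaged.
        String damaged = wrongRoundTrip(original);
        System.out.println("wrong pattern preserved text: " + original.equals(damaged));

        // Decoding with the right charset first keeps the text intact.
        byte[] utf8 = sjisToUtf8(original.getBytes(SJIS));
        String restored = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("correct pattern preserved text: " + original.equals(restored));
    }
}
```

The takeaway is that a Java String has no "encoding" to convert; a charset only matters at the moment bytes are turned into a String or a String is turned into bytes.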
A more reliable approach would be to read the Shift_JIS file into a Java String. Then, write out the Java String using UTF-8 encoding.
InputStream in = ...
Reader reader = new InputStreamReader(in, "Shift_JIS");
StringBuilder sb = new StringBuilder();
int read;
while ((read = reader.read()) != -1) {
    sb.append((char) read);
}
reader.close();
String string = sb.toString();
OutputStream out = ...
Writer writer = new OutputStreamWriter(out, "UTF-8");
writer.write(string);
writer.close();
I finally found the solution; I was making a very basic mistake. The code below works perfectly fine:
InputStream inputStream = getContentResolver().openInputStream(uri);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "Shift_JIS"));
byte[] b = line.getBytes();
String value = new String(b, "UTF-8");
If you want to copy inFile (Shift_JIS) to outFile (UTF-8):
try (Reader reader = new InputStreamReader(new FileInputStream(inFile), "Shift_JIS");
     Writer writer = new OutputStreamWriter(new FileOutputStream(outFile), "UTF-8")) {
    char[] buffer = new char[4096];
    int size;
    while ((size = reader.read(buffer)) >= 0) {
        writer.write(buffer, 0, size);
    }
}
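Since the question reads from a content Uri rather than a File, the same copy loop can be sketched against plain streams. This is only an illustration under assumptions: the class name CharsetCopy and the method transcode are hypothetical, and it again assumes the runtime provides the Shift_JIS charset.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

class CharsetCopy {
    // Hypothetical helper: decodes everything from 'in' using 'from' and
    // writes it to 'out' encoded as 'to'. Both streams are closed when done.
    static void transcode(InputStream in, OutputStream out, Charset from, Charset to)
            throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, from));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, to))) {
            char[] buffer = new char[4096];
            int size;
            // read() returns -1 at end of stream, which ends the loop
            while ((size = reader.read(buffer)) >= 0) {
                writer.write(buffer, 0, size);
            }
        }
    }
}
```

With the code from the question, the input could presumably be getContentResolver().openInputStream(uri), with Charset.forName("Shift_JIS") as the source charset and StandardCharsets.UTF_8 as the target. Copying in chunks like this avoids holding the whole file in memory.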