How to convert a Binary String to UTF-8 String in

Posted 2019-06-10 09:53

Question:

I am working on a project that requires me to convert a UTF-8 string stored in a Windows text file into a continuous binary string and store it in another Windows text file. I then need to read this binary string, convert it back to the original UTF-8 string, and store that in a text file. I have converted the UTF-8 string to a binary string, but I have no idea how to reverse the process.

Here's my program to convert the UTF-8 string to binary strings:

package filetobits;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class FileToBits {

    public static void main(String[] args) throws IOException, FileNotFoundException {

        FileReader inputStream = new FileReader("C:\\FileTesting\\UTF8.txt");
        FileWriter outputStream = new FileWriter("C:\\FileTesting\\BinaryStrings.txt");

        int c;

        while ((c = inputStream.read()) != -1) {

            outputStream.write(Integer.toBinaryString(c));
            outputStream.write(System.lineSeparator());
        }
        inputStream.close();
        outputStream.close();
    }
}

Here's my input (16 characters):

¼¹¨'I.p

Here's my output:

1111111111111101
10000
1111111111111101
11111
1111111111111101
100111
1001001
101110
1110000
111100
1111111111111101
1100001
101100
101001
1111111111111101
1111111111111101

I need help converting these binary strings back into a single UTF-8 string and storing it in a text file.

I achieved what I want with the following code:

    String str = "";
    FileReader inputStream = new FileReader("C:\\FileTesting\\Encrypted.txt");
    FileWriter outputStream = new FileWriter("C:\\FileTesting\\EncryptedBin.txt");
    int c;
    while ((c = inputStream.read()) != -1) {
        String s = String.format("%16s", Integer.toBinaryString(c)).replace(' ', '0');
        for (int i = 0; i < s.length() / 16; i++) {
            int a = Integer.parseInt(s.substring(16 * i, (i + 1) * 16), 2);
            str += (char) (a);
        }
    }

But the problem is that I can't add extra 0's to pad every binary string to a length of 16, because I need to store this data in an image (for my image steganography project), so the shorter the binary string, the better.

I need to get the same output as the code above, but without padding every binary string to a length of 16.

PS: I am kind of lost when it comes to character encodings. Does storing UTF-8 characters in a Windows .txt file convert them to ANSI or something?

Answer 1:

A byte has 8 bits. As a first step, ignore the UTF-8 issue and just fill a byte[] with the data from your binary string, 8 bits per byte.

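Once you have a byte[] data, you can use new String(data, StandardCharsets.UTF_8) to decode it into a String. Note that Java Strings are UTF-16 internally, and new String(data) without an explicit charset uses the platform default charset, which is not guaranteed to be UTF-8.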
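The approach described in this answer can be sketched roughly as follows. This is only a minimal illustration, not the asker's final program: the class name BitStringCodec and the methods toBitString and fromBitString are hypothetical names chosen for the example, and file I/O is left out. It assumes the bit string is written with exactly 8 bits per UTF-8 byte and no separators, so it can be split back into bytes unambiguously.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; the names are not from the original post.
public class BitStringCodec {

    // Encode the text as UTF-8 and emit exactly 8 bits per byte, with no separators.
    static String toBitString(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder bits = new StringBuilder(bytes.length * 8);
        for (byte b : bytes) {
            // OR-ing with 0x100 forces a 9-digit result; dropping the leading '1'
            // leaves the byte zero-padded to 8 digits.
            bits.append(Integer.toBinaryString((b & 0xFF) | 0x100).substring(1));
        }
        return bits.toString();
    }

    // Split the continuous bit string into 8-bit groups, fill a byte[], decode as UTF-8.
    static String fromBitString(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("bit string length must be a multiple of 8");
        }
        byte[] bytes = new byte[bits.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(bits.substring(8 * i, 8 * i + 8), 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00bc\u00b9 I.p"; // sample text with two non-ASCII characters
        String bits = toBitString(original);
        String restored = fromBitString(bits);
        System.out.println(bits);
        System.out.println("round trip ok: " + original.equals(restored));
    }
}
```

With this layout, ASCII characters cost 8 bits each instead of 16, while non-ASCII characters take 16 bits or more depending on how many UTF-8 bytes they need. When reading or writing the text files themselves, it is probably safer to pass StandardCharsets.UTF_8 explicitly rather than rely on the platform default charset.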