I am using the RSA algorithm to encrypt and decrypt a file that is larger than the RSA key size.
In the encryption code below, I read the file content block by block and convert each block into ciphertext. The block size is 32 bytes.
FileInputStream fin1 = new FileInputStream(genfile);
FileOutputStream fout = new FileOutputStream(seedcipher);
byte[] block = new byte[32];
int i;
while ((i = fin1.read(block)) != -1)
{
byte[] inputfile= cipher.doFinal(block);
fout.write(inputfile);
}
fin1.close();
In the decryption part, the same block-wise approach is used, but with a block size of 128 bytes:
FileInputStream fin1 = new FileInputStream(encryptedfile);
FileOutputStream fout = new FileOutputStream(seedcipher);
DataInputStream dos =new DataInputStream(fin1);
DataOutputStream dosnew =new DataOutputStream(fout);
byte[] block = new byte[128];
int i;
while ((i = fin1.read(block)) != -1)
{
byte[] inputfile= cipher.doFinal(block);
fout.write(inputfile);
}
The input file size is 81.3 kB and the file contains:
0
1
2
3
4.....29000
After the file is decrypted, the output contains some extra values that are not relevant. Why is that extra data in the result?
Your IO code for reading block by block is incorrect:
while ((i = fin1.read(block)) != -1) {
byte[] inputfile= cipher.doFinal(block);
fout.write(inputfile);
}
- It assumes that every time you ask to read a block, a whole block is read. That is not necessarily the case; only a few bytes might be read. The number of bytes actually read is returned by the read() method (and stored in i). You should not ignore it.
- The last block has a pretty good chance of being incomplete, unless your file size is a multiple of 32. So at the last iteration, you're encrypting the last N remaining bytes of the file + the 32 - N bytes that were stored in the byte array at the previous iteration.
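As a minimal sketch of a loop that respects both points, the version below passes only the bytes that were actually read to doFinal. The class name BlockwiseRsa, the helper names, the "RSA/ECB/PKCS1Padding" transformation and the 1024-bit key are assumptions made for illustration (the question does not show how cipher is initialised), and readNBytes requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

class BlockwiseRsa {
    // Feeds the stream to the cipher in chunks of at most blockSize bytes,
    // passing only the n bytes that were actually read to doFinal.
    static void transform(Cipher cipher, InputStream in, OutputStream out, int blockSize)
            throws IOException, GeneralSecurityException {
        byte[] block = new byte[blockSize];
        int n;
        // readNBytes keeps reading until the block is full or the stream ends,
        // so only the last chunk can be shorter than blockSize.
        while ((n = in.readNBytes(block, 0, block.length)) > 0) {
            out.write(cipher.doFinal(block, 0, n));
        }
    }

    // Encrypts with 32-byte blocks and decrypts with 128-byte blocks, as in the question.
    static byte[] roundTrip(byte[] plain) throws IOException, GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(1024); // assumed from the 128-byte ciphertext blocks; too small for real use
        KeyPair pair = gen.generateKeyPair();
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");

        cipher.init(Cipher.ENCRYPT_MODE, pair.getPublic());
        ByteArrayOutputStream encrypted = new ByteArrayOutputStream();
        transform(cipher, new ByteArrayInputStream(plain), encrypted, 32);

        cipher.init(Cipher.DECRYPT_MODE, pair.getPrivate());
        ByteArrayOutputStream decrypted = new ByteArrayOutputStream();
        transform(cipher, new ByteArrayInputStream(encrypted.toByteArray()), decrypted, 128);
        return decrypted.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = new byte[1000]; // deliberately not a multiple of 32
        for (int k = 0; k < plain.length; k++) plain[k] = (byte) k;
        System.out.println(java.util.Arrays.equals(plain, roundTrip(plain)));
    }
}
```

Because the final chunk is passed with its real length, the decrypted output has the same length as the input and no stale bytes from the previous iteration are carried over.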
Using RSA to encrypt a large file is not a good idea. You could for example generate a random AES key, encrypt it using RSA and store it in the output file, and then encrypt the file itself with AES, which is much faster and doesn't have any problem with large inputs. The decryption would read the encrypted AES key, decrypt it, and then decrypt the rest of the file with AES.
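The scheme described above could look roughly like the following sketch. The class name HybridFileCrypto, the header layout (key length, encrypted key, IV), and the choice of OAEP padding and AES in GCM mode are all assumptions for illustration, not something the question prescribes; transferTo requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class HybridFileCrypto {
    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String AES = "AES/GCM/NoPadding";
    private static final int IV_BYTES = 12;

    // Assumed layout: [int key length][RSA-encrypted AES key][IV][AES-encrypted data]
    static void encrypt(PublicKey rsaPublic, InputStream in, OutputStream out)
            throws IOException, GeneralSecurityException {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey aesKey = keyGen.generateKey(); // fresh random key per file

        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.ENCRYPT_MODE, rsaPublic);
        byte[] encryptedKey = rsa.doFinal(aesKey.getEncoded()); // RSA only sees 16 bytes

        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.ENCRYPT_MODE, aesKey, new GCMParameterSpec(128, iv));

        DataOutputStream header = new DataOutputStream(out);
        header.writeInt(encryptedKey.length);
        header.write(encryptedKey);
        header.write(iv);
        header.flush();
        // Closing the CipherOutputStream finishes the AES operation and closes out.
        try (CipherOutputStream data = new CipherOutputStream(out, aes)) {
            in.transferTo(data);
        }
    }

    static void decrypt(PrivateKey rsaPrivate, InputStream in, OutputStream out)
            throws IOException, GeneralSecurityException {
        DataInputStream header = new DataInputStream(in);
        byte[] encryptedKey = new byte[header.readInt()];
        header.readFully(encryptedKey);
        byte[] iv = new byte[IV_BYTES];
        header.readFully(iv);

        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.DECRYPT_MODE, rsaPrivate);
        SecretKey aesKey = new SecretKeySpec(rsa.doFinal(encryptedKey), "AES");

        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.DECRYPT_MODE, aesKey, new GCMParameterSpec(128, iv));
        // The rest of the stream is the AES-encrypted file content.
        try (CipherInputStream data = new CipherInputStream(in, aes)) {
            data.transferTo(out);
        }
    }

    // Encrypts and decrypts in memory with a throwaway 2048-bit key pair.
    static byte[] roundTrip(byte[] plain) throws IOException, GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        ByteArrayOutputStream encrypted = new ByteArrayOutputStream();
        encrypt(pair.getPublic(), new ByteArrayInputStream(plain), encrypted);
        ByteArrayOutputStream decrypted = new ByteArrayOutputStream();
        decrypt(pair.getPrivate(), new ByteArrayInputStream(encrypted.toByteArray()), decrypted);
        return decrypted.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = new byte[81_300]; // roughly the file size from the question
        new SecureRandom().nextBytes(plain);
        System.out.println(java.util.Arrays.equals(plain, roundTrip(plain)));
    }
}
```

With real files, the two streams would be a FileInputStream and a FileOutputStream instead of the in-memory streams used here.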
Hybrid Cryptosystem
Example: With a 1024-bit key, a single RSA operation can encrypt at most about 1024 / 8 = 128 bytes.
Note: With PKCS#1 v1.5 padding the exact limit is 128 - 11 = 117 bytes, because the padding takes at least 11 bytes.
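The arithmetic can be checked with a small sketch. It assumes PKCS#1 v1.5 padding ("RSA/ECB/PKCS1Padding") and the JDK's default provider, which rejects oversized input with an IllegalBlockSizeException; the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;

class RsaBlockLimit {
    // Modulus size in bytes minus the 11 bytes PKCS#1 v1.5 padding needs at minimum.
    static int maxPkcs1Plaintext(int keyBits) {
        return keyBits / 8 - 11;
    }

    // Returns true if a plaintext of the given length can be encrypted in one RSA operation.
    static boolean fits(PublicKey key, int length) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        try {
            cipher.doFinal(new byte[length]);
            return true;
        } catch (IllegalBlockSizeException tooLong) {
            return false; // input longer than the key allows
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(1024); // matches the example; too small for real use
        PublicKey key = gen.generateKeyPair().getPublic();
        int limit = maxPkcs1Plaintext(1024);
        System.out.println(limit);
        System.out.println(fits(key, limit));
        System.out.println(fits(key, limit + 1));
    }
}
```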
You can use a symmetric key to encrypt and decrypt the data (> 128 bytes) to be transferred. RSA can only encrypt data up to a certain extent (e.g. 128 bytes) which depends on the RSA key length.
This means that if you want to transfer anything bigger than 128 bytes, you have to transfer a symmetric key < 128 bytes first so you can have the following:
- Generate a symmetric key (< 128 bytes)
- Encrypt symmetric key with RSA
- Transfer encrypted symmetric key
- Decrypt symmetric key with RSA
- Encrypt data (> 128 bytes) with symmetric key
- Transfer encrypted data
- Decrypt encrypted data with symmetric key
or (transfer encrypted symmetric key and encrypted data at the same time)
- Generate a symmetric key (< 128 bytes)
- Encrypt symmetric key with RSA
- Encrypt data (> 128 bytes) with symmetric key
- Transfer encrypted symmetric key & encrypted data
- Decrypt symmetric key with RSA
- Decrypt encrypted data with symmetric key
For more information, see: Hybrid cryptosystem
You could solve this with a length prefix. Use a DataOutputStream to write an integer at the start that represents the number of bytes that are written afterwards. When decrypting, read that integer and only use the number of bytes it specifies.
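A minimal sketch of that framing idea, with the encryption left out so that only the length handling is visible. The class name LengthPrefix, the method names, and the zero-filled tail that stands in for the stale buffer bytes are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class LengthPrefix {
    // Writes the real length first, then the data, then filler up to a whole block.
    static byte[] writePadded(byte[] data, int blockSize) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(data.length); // 4-byte length prefix
        out.write(data);
        int remainder = data.length % blockSize;
        if (remainder != 0) {
            out.write(new byte[blockSize - remainder]); // filler the reader must ignore
        }
        out.flush();
        return buffer.toByteArray();
    }

    // Reads the length prefix and returns exactly that many bytes, dropping the filler.
    static byte[] readPadded(byte[] stored) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored));
        byte[] data = new byte[in.readInt()];
        in.readFully(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0\n1\n2\n3\n4\n".getBytes("US-ASCII");
        byte[] stored = writePadded(data, 32);
        System.out.println(stored.length);             // prefix + one padded block
        System.out.println(readPadded(stored).length); // original length again
    }
}
```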
I noticed you are using the Cipher class in the wrong way. Use the update method to add bytes to the cipher and only call doFinal once. It is important that you use the overloaded version update(byte[] input, int inputOffset, int inputLen). Set the inputOffset parameter to zero and the inputLen parameter to i. This makes sure that the Cipher only uses the bytes it should.
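The update/doFinal pattern can be sketched as follows. Note the assumption: the example uses AES ("AES/CBC/PKCS5Padding") rather than RSA, because the JDK's RSA cipher accepts only a single block of input in total, so streaming many chunks through update only works with a symmetric cipher. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class UpdateThenDoFinal {
    // Streams the input through an initialised cipher: update per chunk, doFinal once.
    static byte[] process(Cipher cipher, InputStream in)
            throws IOException, GeneralSecurityException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] block = new byte[32];
        int i;
        while ((i = in.read(block)) != -1) {
            // Offset 0 and length i: only the bytes that were actually read are used.
            byte[] chunk = cipher.update(block, 0, i);
            if (chunk != null) { // update may return null when no full block is ready
                out.write(chunk);
            }
        }
        out.write(cipher.doFinal()); // flushes the last block and applies the padding
        return out.toByteArray();
    }

    // Encrypts and decrypts in memory with a throwaway AES key and random IV.
    static byte[] roundTrip(byte[] plain) throws IOException, GeneralSecurityException {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] encrypted = process(cipher, new ByteArrayInputStream(plain));

        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return process(cipher, new ByteArrayInputStream(encrypted));
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = new byte[1000];
        new SecureRandom().nextBytes(plain);
        System.out.println(java.util.Arrays.equals(plain, roundTrip(plain)));
    }
}
```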
See JB Nizet's answer.