Last year I made an encryption program with AES-256 GCM, using C++ and the Crypto++ library. This year I wanted to upgrade it to Qt and change the way I read in the file. The old way read the entire file into a char*, encrypted it, and wrote it out. I noticed that big files did not work, so I needed to switch to a buffer.
I switched to a "read 8 KB, encrypt, write, repeat" loop, but now every pass through the loop adds an additional 33 bytes to the output, and I am not sure why. If the file is smaller than 8 KB it works; if it is between 8 KB and 16 KB the output gains an extra 33 bytes; if it is between 16 KB and 24 KB the output gains an extra 66 bytes; and so on.
What I have figured out so far is that it is not the encryption code, since it works on files smaller than 8 KB, and it is not the file loop code, since I replaced the encryption code with simple file-copy code and it copied the file correctly.
I think the problem is that I am not resetting a variable, and it is somehow corrupting the data fed to the encryption code on every loop.
Here is my code:
void encryptfile(double progressbarfilecount, bool& threadstatus) {
    // variables for file data
    int buffersize = 8192;
    string fullfilename;
    string filepath;
    string filename;
    char memblock[8192];
    streampos size;
    double filesize;
    double encryptedfilesize;
    string datastring;
    CryptoPP::SecByteBlock initializationvector(32);
    string initializationvectorstring;
    string cipher;
    string encoded;
    QMessageBox msgBox;
    // encrypt the file
    // get the filepath and filename
    fullfilename = listbox1->item(progressbarfilecount)->text().toUtf8().constData();
    size_t found = fullfilename.find_last_of("/\\");
    filepath = fullfilename.substr(0,found);
    filename = fullfilename.substr(found + 1);
    // get the file size
    //QFile myFile(QString::fromStdString(fullfilename));
    //filesize = myFile.size();
    //myFile.close();
    filesize = getfilesize(fullfilename);
    qDebug() << "filesize:" << QString::number(filesize);
    // setup the file data
    ifstream originalfile(fullfilename, ios::in | ios::binary | ios::ate);
    ofstream encryptedfile(fullfilename + ".txt", ios::app);
    // get random initializationvector
    randomnumber.GenerateBlock(initializationvector, initializationvector.size());
    // convert it to a string for the text file
    initializationvectorstring = string((char *)initializationvector.begin(),32);
    // check if we should get the checksum of the original file
    if (testencryptiontogglebuttonguisetting == "On") {
        originalfilechecksum << checksum(fullfilename);
    }
    // here is the loop where the problem may be
    // encrypt the file 8KB at a time
    for (encryptedfilesize = 0; encryptedfilesize < filesize; encryptedfilesize+= buffersize) {
        // check if the data left to write is less than the buffer size
        if (filesize - encryptedfilesize < buffersize) {
            buffersize = filesize - encryptedfilesize;
            qDebug() << "new buffersize:" << QString::number(buffersize);
        }
        // read the file into a memory block
        originalfile.seekg(encryptedfilesize);
        originalfile.read(memblock, buffersize);
        // convert the memoryblock to readable hexadecimal
        datastring = stringtohexadecimal(string(memblock, buffersize), true);
        // encrypt
        try
        {
            GCM< AES >::Encryption e;
            e.SetKeyWithIV(key, sizeof(key), initializationvector,initializationvector.size());
            // Not required for GCM mode (but required for CCM mode)
            // e.SpecifyDataLengths( adata.size(), pdata.size(), 0 );
            AuthenticatedEncryptionFilter ef(e,new StringSink(cipher), false, TAG_SIZE); // AuthenticatedEncryptionFilter
            // AuthenticatedEncryptionFilter::ChannelPut
            // defines two channels: "" (empty) and "AAD"
            // channel "" is encrypted and authenticated
            // channel "AAD" is authenticated
            ef.ChannelPut("AAD", (const byte*)adata.data(), adata.size());
            ef.ChannelMessageEnd("AAD");
            // Authenticated data *must* be pushed before
            // Confidential/Authenticated data. Otherwise
            // we must catch the BadState exception
            ef.ChannelPut("", (const byte*)datastring.data(), datastring.size());
            ef.ChannelMessageEnd("");
            // Pretty print
            StringSource(cipher, true,new HexEncoder(new StringSink(encoded), true, 16, " "));
        }
        catch (CryptoPP::BufferedTransformation::NoChannelSupport&)
        {
            // The tag must go in to the default channel:
            // "unknown: this object doesn't support multiple channels"
            if (operatingsystem() == "Linux") {
                system("error_message_encrypt_file_error.sh");
            }
            if (operatingsystem() == "Windows") {
                ShellExecute(0, L"open", L"error_message_encrypt_file_error.vbs", 0, 0, SW_NORMAL);
            }
            //msgBox.setText("No Channel Support");
            //msgBox.exec();
            return;
        }
        catch (CryptoPP::AuthenticatedSymmetricCipher::BadState&)
        {
            // Pushing PDATA before ADATA results in:
            // "GMC/AES: Update was called before State_IVSet"
            if (operatingsystem() == "Linux") {
                system("error_message_encrypt_file_error.sh");
            }
            if (operatingsystem() == "Windows") {
                ShellExecute(0, L"open", L"error_message_encrypt_file_error.vbs", 0, 0, SW_NORMAL);
            }
            //msgBox.setText("Data was read before adata");
            //msgBox.exec();
            return;
        }
        catch (CryptoPP::InvalidArgument&)
        {
            if (operatingsystem() == "Linux") {
                system("error_message_encrypt_file_invalid.sh");
            }
            if (operatingsystem() == "Windows") {
                ShellExecute(0, L"open", L"error_message_encrypt_file_invalid.vbs", 0, 0, SW_NORMAL);
            }
            //msgBox.setText("Invalid Argument");
            //msgBox.exec();
            return;
        }
        // convert the cipher to hexadecimal string
        cipher = stringtohexadecimal(cipher, true);
        // write the encrypted file to a text file with the original file extension
        // check to see if we need to write the initialization vector
        if (encryptedfilesize == 0) {
            initializationvectorstring = stringtohexadecimal(initializationvectorstring, true);
            encryptedfile << initializationvectorstring;
            qDebug() << "wrote the initilization vector";
        }
        encryptedfile << encoded;
        qDebug() << "encrypted filesize:" << QString::number(encryptedfilesize);
        // clear the variables
        encoded = "";
        cipher = "";
        initializationvectorstring = "";
        keys = "";
    }
    // close the file data
    originalfile.close();
    encryptedfile.close();
}
If anyone could help me figure out what is wrong with the code, I would appreciate it.
At the highest level, you appear to have two design requirements. First, you need to chunk your data while avoiding cipher text expansion. Second, you need to integrate an authenticated encryption scheme.
The extra 16 bytes or so on each loop are due to the authentication tag being added to each encrypted chunk. Believe it or not, this is sometimes a desirable property. For example, imagine downloading a 4.7 GB Gentoo image and finding out the entire image is corrupt and eventually rejected. It's due to:
To achieve your goals, I think you are going to need to do two things. First, to answer the question of how to block or chunk the data, you are going to need to Pump your data (as Crypto++ calls it in Pipeline parlance). This has actually been covered previously, but it's not readily apparent:

The above handles the blocking or chunking of data in Crypto++. The second issue, how to avoid an authentication tag on each block, has not been asked here (if memory serves me correctly).
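As a rough illustration of the chunking half only, here is a minimal sketch that uses nothing but the C++ standard library. It is not Crypto++'s Pump mechanism and makes no claims about Crypto++ signatures. The function name `for_each_chunk` and the `consume` callback are hypothetical names invented for this example; in a real program the callback would feed one long-lived encryption filter. The sketch uses `gcount()` to learn how many bytes each read produced, so the file size does not have to be tracked in a `double`.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Reads `path` in chunks of at most `chunk_size` bytes and hands each chunk
// to `consume`. Returns the total number of bytes read (0 if the file cannot
// be opened). `consume` is a hypothetical callback for this sketch.
std::size_t for_each_chunk(const std::string& path,
                           std::size_t chunk_size,
                           const std::function<void(const char*, std::size_t)>& consume)
{
    assert(chunk_size > 0);
    std::ifstream in(path, std::ios::binary);  // opened at offset 0, binary mode
    std::vector<char> buffer(chunk_size);
    std::size_t total = 0;

    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // gcount() reports how many bytes the last read actually delivered,
        // which is smaller than chunk_size for the final partial chunk.
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) {
            break;  // nothing left to read
        }
        consume(buffer.data(), got);
        total += got;
    }
    return total;
}
```

With an 8192-byte chunk size, a 20000-byte file would be delivered as two full chunks followed by one 3616-byte chunk.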
The answer to the second question can be found at Init-Update-Final on the Crypto++ wiki. The short of it is, don't create a new AuthenticatedEncryptionFilter on each loop iteration. Rather, use a single filter and call MaxRetrievable() to determine if there's any cipher text ready. If there is, then retrieve it as it becomes available. Otherwise, the filter will buffer it indefinitely.

The Init-Update-Final page has an example. Here's how the update function looks. I believe it mostly works as you expect from, say, Java (that's why we called it JavaCipher):

When you call final, that's when the authentication tag is generated. While it's not readily apparent, the tag is generated in the call to MessageEnd():

I have not tested this with an authenticated encryption mode like EAX, CCM or GCM. We can work through any issues you experience while updating the wiki page for the benefit of others.
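To make the size effect of the two designs concrete, here is a small sketch of the arithmetic, assuming a 16-byte tag (the value of TAG_SIZE is not shown in the question, so that number is an assumption). The function names are hypothetical and exist only for this example. The counts are raw cipher text bytes; the question's program also hex-encodes its output, which changes the byte counts on disk.

```cpp
#include <cassert>
#include <cstddef>

// Number of chunks needed to cover `file_size` bytes (ceiling division).
std::size_t chunk_count(std::size_t file_size, std::size_t chunk_size)
{
    assert(chunk_size > 0);
    return (file_size + chunk_size - 1) / chunk_size;
}

// Tag bytes emitted when a fresh filter is created, and its message ended,
// once per chunk: one tag for every chunk.
std::size_t tag_bytes_filter_per_chunk(std::size_t file_size,
                                       std::size_t chunk_size,
                                       std::size_t tag_size)
{
    return chunk_count(file_size, chunk_size) * tag_size;
}

// Tag bytes emitted when one filter spans the whole file and the message
// is ended once: a single tag, regardless of how many chunks were fed in.
std::size_t tag_bytes_single_filter(std::size_t tag_size)
{
    return tag_size;
}
```

For a 20000-byte file read in 8192-byte chunks, the per-chunk design emits 3 × 16 = 48 tag bytes, while the single-filter design emits 16. That matches the pattern in the question: one chunk shows no extra growth, and each additional chunk adds one more tag.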
I already know you are going to need to swap out the JavaCipher member StreamTransformationFilter for an AuthenticatedEncryptionFilter for encryption, and an AuthenticatedDecryptionFilter for decryption. Artjom also details some potential issues in his comments.

My apologies for not providing a lot of code. In my mind's eye, your design needs some minor work, so you are not ready for code (yet).
I'm guessing you will be ready for code in your next set of questions (if you ask them here).