Encrypt file using file buffer loop

Posted 2019-09-12 06:13

Last year I made an encryption program using AES 256 GCM using C++ and the Crypto++ lib. This year I wanted to upgrade it to Qt and change the way I was reading in the file. The old way was reading the entire file into a char* and then encrypting it and writing it out. I noticed that big files did not work, so I needed to switch this to a buffer.

I switched to a loop that reads 8 KB, encrypts it, writes it out, and repeats, but now every iteration adds an additional 33 bytes to the output, and I am not sure why. If the file is smaller than 8 KB it works; if it is between 8 KB and 16 KB the output has an extra 33 bytes; if it is between 16 KB and 24 KB the output has an extra 66 bytes; and so on.

What I have been able to figure out so far is it is not the encryption code since it works on files less than 8KB, and it is not the file loop code, since I replaced the encryption code with a simple copy file code, and it copied the file correctly.

I think the problem is I am not resetting a variable and it is somehow messing up the data feed to the encryption code every loop.

Here is my code:

void encryptfile(double progressbarfilecount, bool& threadstatus) {    

// variables for file data
int buffersize = 8192;
string fullfilename;
string filepath;
string filename;
char memblock[8192];
streampos size;
double filesize;
double encryptedfilesize;
string datastring;
CryptoPP::SecByteBlock initializationvector(32);
string initializationvectorstring;
string cipher;
string encoded;
QMessageBox msgBox;

// encrypt the file
// get the filepath and filename
fullfilename = listbox1->item(progressbarfilecount)->text().toUtf8().constData();
size_t found = fullfilename.find_last_of("/\\");
filepath = fullfilename.substr(0,found);
filename = fullfilename.substr(found + 1);

// get the file size
//QFile myFile(QString::fromStdString(fullfilename));
//filesize = myFile.size();
//myFile.close();
filesize = getfilesize(fullfilename);
 qDebug() << "filesize:" << QString::number(filesize);

// setup the file data
ifstream originalfile(fullfilename, ios::in | ios::binary | ios::ate);
ofstream encryptedfile(fullfilename + ".txt", ios::app);

// get random initializationvector
randomnumber.GenerateBlock(initializationvector, initializationvector.size());

// convert it to a string for the text file
initializationvectorstring = string((char *)initializationvector.begin(),32);

// check if we should get the checksum of the original file
if (testencryptiontogglebuttonguisetting == "On") {
    originalfilechecksum << checksum(fullfilename);
}



// here is the loop where the problem may be



// encrypt the file 8KB at a time
for (encryptedfilesize = 0; encryptedfilesize < filesize; encryptedfilesize+= buffersize) {
    // check if the data left to write is less than the buffer size
    if (filesize - encryptedfilesize < buffersize) {
        buffersize = filesize - encryptedfilesize;
        qDebug() << "new buffersize:" << QString::number(buffersize);
    }

    // read the file into a memory block
    originalfile.seekg(encryptedfilesize);
    originalfile.read(memblock, buffersize);

    // convert the memoryblock to readable hexadecimal
    datastring = stringtohexadecimal(string(memblock, buffersize), true);

    // encrypt
    try
    {
    GCM< AES >::Encryption e;
    e.SetKeyWithIV(key, sizeof(key), initializationvector,initializationvector.size());
    // Not required for GCM mode (but required for CCM mode)
    // e.SpecifyDataLengths( adata.size(), pdata.size(), 0 );

    AuthenticatedEncryptionFilter ef(e,new StringSink(cipher), false, TAG_SIZE); // AuthenticatedEncryptionFilter

    // AuthenticatedEncryptionFilter::ChannelPut
    //  defines two channels: "" (empty) and "AAD"
    //   channel "" is encrypted and authenticated
    //   channel "AAD" is authenticated
    ef.ChannelPut("AAD", (const byte*)adata.data(), adata.size());
    ef.ChannelMessageEnd("AAD");

    // Authenticated data *must* be pushed before
    //  Confidential/Authenticated data. Otherwise
    //  we must catch the BadState exception
    ef.ChannelPut("", (const byte*)datastring.data(), datastring.size());
    ef.ChannelMessageEnd("");

    // Pretty print
    StringSource(cipher, true,new HexEncoder(new StringSink(encoded), true, 16, " "));
    }
    catch (CryptoPP::BufferedTransformation::NoChannelSupport&)
    {
    // The tag must go in to the default channel:
    //  "unknown: this object doesn't support multiple channels"
        if (operatingsystem() == "Linux") {
            system("error_message_encrypt_file_error.sh");
        }
        if (operatingsystem() == "Windows") {
            ShellExecute(0, L"open", L"error_message_encrypt_file_error.vbs", 0, 0, SW_NORMAL);
        }
    //msgBox.setText("No Channel Support");
    //msgBox.exec();
    return;
    }
    catch (CryptoPP::AuthenticatedSymmetricCipher::BadState&)
    {
    // Pushing PDATA before ADATA results in:
    //  "GCM/AES: Update was called before State_IVSet"
        if (operatingsystem() == "Linux") {
            system("error_message_encrypt_file_error.sh");
        }
        if (operatingsystem() == "Windows") {
            ShellExecute(0, L"open", L"error_message_encrypt_file_error.vbs", 0, 0, SW_NORMAL);
        }
    //msgBox.setText("Data was read before adata");
    //msgBox.exec();
    return;
    }
    catch (CryptoPP::InvalidArgument&)
    {
        if (operatingsystem() == "Linux") {
            system("error_message_encrypt_file_invalid.sh");
        }
        if (operatingsystem() == "Windows") {
            ShellExecute(0, L"open", L"error_message_encrypt_file_invalid.vbs", 0, 0, SW_NORMAL);
        }
    //msgBox.setText("Invalid Argument");
    //msgBox.exec();
    return;
    }

    // convert the cipher to hexadecimal string
    cipher = stringtohexadecimal(cipher, true);

    // write the encrypted file to a text file with the original file extension
    // check to see if we need to write the initialization vector
    if (encryptedfilesize == 0) {
        initializationvectorstring = stringtohexadecimal(initializationvectorstring, true);
        encryptedfile << initializationvectorstring;
        qDebug() << "wrote the initialization vector";
    }
    encryptedfile << encoded;        
    qDebug() << "encrypted filesize:" << QString::number(encryptedfilesize);

    // clear the variables
    encoded = "";
    cipher = "";
    initializationvectorstring = "";
    keys = "";

}

// close the file data
originalfile.close();
encryptedfile.close();
}

If anyone could help me figure out what is wrong with the code, I would appreciate it.

1 answer
聊天终结者
Reply #2 · 2019-09-12 07:00

Last year I made an encryption program using AES 256 GCM using C++ and the Crypto++ lib. This year I wanted to upgrade it to Qt and change the way I was reading in the file. The old way was reading the entire file into a char* and then encrypting it and writing it out. I noticed that big files did not work, so I needed to switch this to a buffer...

At the highest levels, you appear to have two design requirements. First, you need to chunk your data while avoiding cipher text expansion. Second, you need to integrate an authenticated encryption scheme.

The extra 16 bytes or so on each loop are due to the authentication tag being added to each encrypted chunk. Believe it or not, this is sometimes a desirable property. For example, imagine downloading a 4.7 GB Gentoo image only to find out at the very end that the entire image is corrupt and has to be rejected; a tag on each chunk would catch the corruption early. In your code, the per-chunk tag comes from:

for (encryptedfilesize = 0; encryptedfilesize < filesize; encryptedfilesize+= buffersize)
{
    ...
    AuthenticatedEncryptionFilter ef(e,new StringSink(cipher), false, TAG_SIZE); // AuthenticatedEncryptionFilter    
    ...
}
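
To make the growth pattern concrete, here is a small arithmetic sketch. It assumes the per-iteration overhead is the 33 bytes reported in the question (presumably the tag plus encoding overhead); the helper names chunkCount and extraBytes are made up for illustration and are not part of Crypto++.

```cpp
#include <cassert>
#include <cstddef>

// Number of loop iterations needed to cover fileSize bytes in chunkSize pieces.
std::size_t chunkCount(std::size_t fileSize, std::size_t chunkSize) {
    assert(chunkSize > 0);
    return (fileSize + chunkSize - 1) / chunkSize;  // ceiling division
}

// Extra output bytes when every chunk gets its own trailer, compared with
// writing a single trailer for the whole file.
// perChunkOverhead is an assumed value (33 in the question).
std::size_t extraBytes(std::size_t fileSize, std::size_t chunkSize,
                       std::size_t perChunkOverhead) {
    const std::size_t chunks = chunkCount(fileSize, chunkSize);
    return chunks == 0 ? 0 : (chunks - 1) * perChunkOverhead;
}
```

With an 8192-byte buffer, extraBytes(4096, 8192, 33) is 0, extraBytes(12000, 8192, 33) is 33, and extraBytes(20000, 8192, 33) is 66, which matches the sizes described in the question.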

To achieve your goals, I think you are going to need to do two things. First, to answer the question of how to block or chunk the data, you are going to need to Pump your data (as Crypto++ calls it in Pipeline parlance). This has actually been covered previously, but it's not readily apparent:

The above handles the blocking or chunking of data in Crypto++. The second issue, how to avoid an authentication tag on each block, has not been asked here (if memory serves me correctly).

The answer to the second question can be found at Init-Update-Final on the Crypto++ wiki. The short of it is, don't create a new AuthenticatedEncryptionFilter on each loop iteration. Rather, use a single filter and call MaxRetrievable() to determine if there's any cipher text ready. If there is, then retrieve it as it becomes available. Otherwise, the filter will buffer it indefinitely.
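
The difference between "a new filter per iteration" and "one filter for the whole file" can be illustrated without Crypto++. The sketch below is a toy: ToyAuthFilter, kToyTagSize, encryptPerChunk and encryptStreaming are invented names, the XOR "cipher" and 4-byte checksum "tag" are not secure, and none of this is the Crypto++ API. It only shows where the tag ends up in each design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::size_t kToyTagSize = 4;  // assumed toy tag length, not GCM's

// Toy stand-in for an authenticated encryption filter (NOT secure, NOT Crypto++).
class ToyAuthFilter {
public:
    explicit ToyAuthFilter(std::uint8_t keyByte) : key_(keyByte) {}

    // "update": transform a chunk and buffer the result; no tag is produced here.
    void put(const std::string& chunk) {
        assert(!finished_);
        for (unsigned char c : chunk) {
            checksum_ = checksum_ * 31u + c;
            out_.push_back(static_cast<char>(c ^ key_));
        }
    }

    // "final": append the tag exactly once, covering everything seen so far.
    void messageEnd() {
        assert(!finished_);
        for (int shift = 24; shift >= 0; shift -= 8)
            out_.push_back(static_cast<char>((checksum_ >> shift) & 0xFFu));
        finished_ = true;
    }

    // Hands back whatever output is ready and clears the internal buffer.
    std::string retrieve() {
        std::string ready;
        ready.swap(out_);
        return ready;
    }

private:
    std::uint8_t key_;
    std::uint32_t checksum_ = 0;
    bool finished_ = false;
    std::string out_;
};

// Mirrors the loop in the question: a fresh filter, and so a fresh tag, per chunk.
std::string encryptPerChunk(const std::string& data, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::string result;
    for (std::size_t pos = 0; pos < data.size(); pos += chunkSize) {
        ToyAuthFilter f(0x5A);
        f.put(data.substr(pos, chunkSize));
        f.messageEnd();
        result += f.retrieve();
    }
    return result;
}

// One filter for the whole input: chunks are fed in turn, the tag is written once.
std::string encryptStreaming(const std::string& data, std::size_t chunkSize) {
    assert(chunkSize > 0);
    ToyAuthFilter f(0x5A);
    std::string result;
    for (std::size_t pos = 0; pos < data.size(); pos += chunkSize) {
        f.put(data.substr(pos, chunkSize));
        result += f.retrieve();  // drain output as it becomes available
    }
    f.messageEnd();
    result += f.retrieve();      // the single tag comes out here
    return result;
}
```

For a 20000-byte input and an 8192-byte chunk size, the per-chunk version produces three tags and the streaming version produces one, so the outputs differ by two tag lengths.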

The Init-Update-Final page has an example. Here's how the update function looks. I believe it mostly works as you expect from, say, Java (that's why we called it JavaCipher):

size_t JavaCipher::update(const byte* in, size_t isize, byte* out, size_t osize)
{
    if(in && isize)
        m_filter.get()->Put(in, isize);

    if(!out || !osize || !m_filter.get()->AnyRetrievable())
        return 0;

    size_t t = STDMIN(m_filter.get()->MaxRetrievable(), (word64)osize);
    return m_filter.get()->Get(out, t);
}

When you call final, that's when the authentication tag is generated. While it's not readily apparent, the tag is generated in the call to MessageEnd():

size_t JavaCipher::final(byte* out, size_t osize)
{
    m_filter.get()->MessageEnd();

    if(!out || !osize || !m_filter.get()->AnyRetrievable())
        return 0;

    size_t t = STDMIN(m_filter.get()->MaxRetrievable(), (word64)osize);
    return m_filter.get()->Get(out, t);
}

I have not tested this with an authenticated encryption mode like EAX, CCM or GCM. We can work through any issues you experience while updating the wiki page for the benefit of others.

I already know you are going to need to swap out the JavaCipher member StreamTransformationFilter for an AuthenticatedEncryptionFilter for encryption, and an AuthenticatedDecryptionFilter for decryption. Artjom also details some potential issues in his comments.


My apologies for not providing a lot of code. In my mind's eye, your design needs some minor work, so you are not ready for code (yet).

I'm guessing you will be ready for code in your next set of questions (if you ask them here).
