Calculate the size of a Base64-encoded message

Posted 2020-02-09 00:21

I have a binary string that I am encoding in Base64. I need to know beforehand what the size of the final Base64-encoded string will be.

Is there any way to calculate that?

Something like:

BinaryStringSize is 64Kb; EncodedBinaryStringSize will be 127Kb after encoding.

Oh, the code is in C.

Thanks.

Tags: c base64
9 answers
贪生不怕死
Reply #2 · 2020-02-09 00:56

If you do Base64 exactly right, which includes padding the end with = characters, and you break the output with a CR LF every 72 characters, the size can be computed with:

code_size    = ((input_size * 4) / 3);
padding_size = (input_size % 3) ? (3 - (input_size % 3)) : 0;
crlfs_size   = 2 + (2 * (code_size + padding_size) / 72);
total_size   = code_size + padding_size + crlfs_size;

In C, you may also terminate the output with a \0 byte, which costs one extra byte, and you may want to check the remaining length after every code you write. So if you are just looking for a value to pass to malloc(), you might actually prefer a version that wastes a few bytes in order to keep the code simpler:

/* input_size / 24 over-covers the 2 CRLF bytes added per 54 input bytes (72 output characters) */
output_size = ((input_size * 4) / 3) + (input_size / 24) + 6;
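If the goal is a guaranteed upper bound for malloc(), one option is to compute it from the padded length directly. The following is only a sketch under stated assumptions: a CRLF after every 72 characters, one trailing CRLF, and a terminating \0. The name base64_alloc_size is hypothetical and not taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on the buffer needed for padded Base64
 * with a CRLF after every 72 characters, a trailing CRLF, and a NUL byte.
 * Returns 0 if the result would not fit in a size_t. */
size_t base64_alloc_size(size_t input_size)
{
    /* Rough overflow guard: the result stays below 1.5 * input_size + 8. */
    if (input_size > SIZE_MAX / 2 - 8)
        return 0;

    size_t padded = ((input_size + 2) / 3) * 4; /* 4 chars per 3-byte group, rounded up */
    size_t lines  = padded / 72 + 1;            /* full lines plus one possibly partial line */
    return padded + 2 * lines + 1;              /* text + CRLFs + '\0' */
}
```

A caller would treat a return value of 0 as "input too large" and otherwise pass the result straight to malloc().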
家丑人穷心不美
Reply #3 · 2020-02-09 00:59

geocar's answer was close, but could sometimes be off slightly.

There are 4 bytes of output for every 3 bytes of input. If the input size is not a multiple of three, we must round it up to the next multiple of three. Otherwise we leave it alone.

input_size + ( (input_size % 3) ? (3 - (input_size % 3)) : 0) 

Divide this by 3, then multiply by 4. That is our total output size, including padding.

code_padded_size = ((input_size + ( (input_size % 3) ? (3 - (input_size % 3)) : 0) ) / 3) * 4
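As a side note, the conditional adjustment can be replaced by adding 2 before the integer division, which rounds up in the same way. The sketch below shows both forms side by side; the function names are made up for illustration.

```c
#include <stddef.h>

/* Long form, as written above. */
size_t padded_size_long(size_t input_size)
{
    size_t adjustment = (input_size % 3) ? (3 - (input_size % 3)) : 0;
    return ((input_size + adjustment) / 3) * 4;
}

/* Shorter equivalent: adding 2 before the integer division rounds up. */
size_t padded_size_short(size_t input_size)
{
    return ((input_size + 2) / 3) * 4;
}
```

Both return the same value for every input size that does not overflow size_t, so which one to use is mostly a matter of readability.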

As I said in my comment, the total size must be divided by the line width before doubling to properly account for the last line. Otherwise the number of CRLF characters will be overestimated. I am also assuming that a CRLF pair is written only after a full 72-character line. That includes the last line if it is exactly 72 characters long, but not if it is shorter.

newline_size = ((code_padded_size) / 72) * 2

So put it all together:

unsigned int code_padded_size = ((input_size + ( (input_size % 3) ? (3 - (input_size % 3)) : 0) ) / 3) * 4;
unsigned int newline_size = ((code_padded_size) / 72) * 2;

unsigned int total_size = code_padded_size + newline_size;

Or to make it a bit more readable:

unsigned int adjustment = ( (input_size % 3) ? (3 - (input_size % 3)) : 0);
unsigned int code_padded_size = ( (input_size + adjustment) / 3) * 4;
unsigned int newline_size = ((code_padded_size) / 72) * 2;

unsigned int total_size = code_padded_size + newline_size;
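One way to gain some confidence in the closed form is to compare it with a loop that counts what an encoder would emit. This is a sketch, not a real encoder: counted_size is a hypothetical reference that assumes a CRLF is written immediately after every 72nd character, and closed_form_size just wraps the formula above using size_t.

```c
#include <stddef.h>

/* Closed form from the answer above. */
size_t closed_form_size(size_t input_size)
{
    size_t adjustment = (input_size % 3) ? (3 - (input_size % 3)) : 0;
    size_t code_padded_size = ((input_size + adjustment) / 3) * 4;
    size_t newline_size = (code_padded_size / 72) * 2;
    return code_padded_size + newline_size;
}

/* Hypothetical reference: count the characters an encoder would write,
 * assuming CRLF is emitted right after every 72nd character. */
size_t counted_size(size_t input_size)
{
    size_t total = 0, column = 0;
    for (size_t consumed = 0; consumed < input_size; consumed += 3) {
        for (int i = 0; i < 4; i++) {   /* each 3-byte group yields 4 characters */
            total++;
            if (++column == 72) {       /* line is full: account for CR LF */
                total += 2;
                column = 0;
            }
        }
    }
    return total;
}
```

For the 64 KB input from the question (65536 bytes), the padded size is 87384 and there are 1213 full lines, so the total comes to 89810 bytes, not 127 KB.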
【Aperson】
Reply #4 · 2020-02-09 01:00

I ran into a similar situation in Python, and using codecs.iterencode(text, "base64") the correct calculation was:

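def encoded_size(input_size):  # function name is illustrative
    adjustment = 3 - (input_size % 3) if (input_size % 3) else 0
    code_padded_size = ((input_size + adjustment) // 3) * 4  # // keeps the result an integer in Python 3
    newline_size = (code_padded_size // 76) * 1  # one "\n" per full 76-character line
    return code_padded_size + newline_size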
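Since the question is about C, the same arithmetic can be written with the line width and the line-ending length as parameters, which covers both the 72-column CRLF case and this 76-column LF case. This is a sketch that keeps the assumption used in the answers here, namely that a line ending is counted only after a full line. The function name and parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: size of padded Base64 output when a line ending of
 * eol_len bytes is written after every full line of line_width characters.
 * A line_width of 0 means the output is not wrapped at all. */
size_t base64_wrapped_size(size_t input_size, size_t line_width, size_t eol_len)
{
    size_t padded = ((input_size + 2) / 3) * 4; /* 4 chars per 3-byte group, rounded up */
    if (line_width == 0)
        return padded;
    return padded + (padded / line_width) * eol_len;
}
```

With this, base64_wrapped_size(n, 72, 2) corresponds to the CRLF variant and base64_wrapped_size(n, 76, 1) to the calculation above. Add 1 to the result if the buffer also has to hold a terminating \0.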