Simple ASCII compression- Help minimize system cal

Posted 2019-06-10 18:58

In my last question, nos gave a method of removing the most significant bit from an ASCII character byte, which matches exactly what my professor said when describing the project.

My problem is how to strip the most significant bit and pack the result into a buffer using the read and write system calls. Since write takes its length as a number of bytes, how do I work at the bit level within the buffer array?

3 answers
▲ chillily
#2 · 2019-06-10 19:41

You need to pack the data into a buffer in memory first. For example, to keep it simple:

unsigned char unpacked[128];  // read file input into this buffer
unsigned char packed[128];    // copy from unpacked to here while compressing
                              // then write() this to output file...

To do the compression itself, loop over the bytes read into unpacked and use bitwise operators such as & (bitwise AND), | (bitwise OR), and << and >> (bitwise left and right shift).
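As a rough illustration of such a loop, here is a minimal sketch that keeps a small bit accumulator and emits a byte whenever eight bits are available. The function name pack7 and its signature are made up for this example; it assumes every input byte is plain 7-bit ASCII and that packed has room for at least n bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: packs n 7-bit characters from unpacked[] into
   packed[], most significant bit first. Returns the bytes written. */
size_t pack7(const unsigned char *unpacked, size_t n, unsigned char *packed)
{
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of valid bits currently in acc */
    size_t out = 0;

    assert(unpacked != NULL && packed != NULL);

    for (size_t i = 0; i < n; i++) {
        /* & drops the top bit, << makes room, | appends the 7 data bits */
        acc = (acc << 7) | (unpacked[i] & 0x7fu);
        bits += 7;
        if (bits >= 8) {                    /* a whole output byte is ready */
            bits -= 8;
            packed[out++] = (unsigned char)((acc >> bits) & 0xffu);
            acc &= (1u << bits) - 1u;       /* keep only the leftover bits */
        }
    }
    if (bits > 0)                           /* flush the tail, zero-padded */
        packed[out++] = (unsigned char)((acc << (8 - bits)) & 0xffu);
    return out;
}
```

With the buffers above, the return value would be the length to pass to write() for the packed buffer.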

If there are specific parts of this process you don't know how to do, show us your attempt (in code) and we'll give you more details, but you can't expect (or benefit from) people doing all your homework.

Root(大扎)
#3 · 2019-06-10 19:48

Probably the simplest way to do it is in chunks of eight bytes. Read in a chunk, then compress it to seven bytes using bitwise operators.

Let's call the input data input[0..7] and the output data output[0..6].

So, the first byte of the output data, output[0], consists of the lower 7 bits of input[0] plus the second-most upper bit of input[1]. That works the same for all others:

    Index:    [0]      [1]      [2]      [3]      [4]      [5]      [6]      [7]
    Input:  0aaaaaaa 0bbbbbbb 0ccccccc 0ddddddd 0eeeeeee 0fffffff 0ggggggg 0hhhhhhh
            ///////  //////   and     --->
            ||||||| /|||||     so on  --->
    Output: aaaaaaab bbbbbbcc cccccddd ddddeeee eeefffff ffgggggg ghhhhhhh
    Index:    [0]      [1]      [2]      [3]      [4]      [5]      [6]

You can use operations like:

output[0] = ((input[0] & 0x7f) << 1) | ((input[1] & 0x40) >> 6);
output[1] = ((input[1] & 0x3f) << 2) | ((input[2] & 0x60) >> 5);
:
output[5] = ((input[5] & 0x03) << 6) | ((input[6] & 0x7e) >> 1);
output[6] = ((input[6] & 0x01) << 7) |  (input[7] & 0x7f);

The others should be calculable from those above. If you want to know more about bitwise operators, see here.
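For reference, here is one way the whole set might look when written out, with the middle three lines derived from the bit layout in the diagram. The function name pack8 is just a placeholder for this sketch, and it assumes a full eight-byte chunk of 7-bit ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: packs exactly eight 7-bit bytes into seven bytes. */
void pack8(const unsigned char input[8], unsigned char output[7])
{
    assert(input != NULL && output != NULL);

    /* Each output byte takes the remaining low bits of one input byte
       and the leading data bits of the next one. */
    output[0] = (unsigned char)(((input[0] & 0x7f) << 1) | ((input[1] & 0x40) >> 6));
    output[1] = (unsigned char)(((input[1] & 0x3f) << 2) | ((input[2] & 0x60) >> 5));
    output[2] = (unsigned char)(((input[2] & 0x1f) << 3) | ((input[3] & 0x70) >> 4));
    output[3] = (unsigned char)(((input[3] & 0x0f) << 4) | ((input[4] & 0x78) >> 3));
    output[4] = (unsigned char)(((input[4] & 0x07) << 5) | ((input[5] & 0x7c) >> 2));
    output[5] = (unsigned char)(((input[5] & 0x03) << 6) | ((input[6] & 0x7e) >> 1));
    output[6] = (unsigned char)(((input[6] & 0x01) << 7) |  (input[7] & 0x7f));
}
```

As a sanity check, the eight characters "ABCDEFGH" (0x41 to 0x48) should pack to the seven bytes 83 0A 1C 48 B1 A3 C8 in hex.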

Once you've compressed an eight-byte chunk, write out the seven-byte compressed chunk and keep going.

The only slightly tricky bit is at the end where you may not have a full eight bytes. In that case, you will output as many bytes as you input but the final one will be padded with zero bits.
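A possible way to handle that short final chunk is to copy it into a zero-filled eight-byte scratch buffer and then emit only as many output bytes as there were input bytes. The sketch below does that with a loop form of the same shift-and-OR expressions; pack_tail and padded are names invented for this example, and n is assumed to be less than eight.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: packs a final chunk of n < 8 bytes.
   Writes n bytes to output, the last one padded with zero bits. */
size_t pack_tail(const unsigned char *input, size_t n, unsigned char *output)
{
    unsigned char padded[8] = {0};          /* missing bytes stay zero */

    assert(n < 8);
    for (size_t i = 0; i < n; i++)
        padded[i] = input[i] & 0x7f;        /* drop the unused top bit */

    for (size_t k = 0; k < n; k++) {
        /* low bits of byte k shifted up, leading bits of byte k+1 below;
           the cast discards anything shifted past bit 7 */
        output[k] = (unsigned char)((padded[k] << (k + 1)) |
                                    (padded[k + 1] >> (6 - k)));
    }
    return n;
}
```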

And, on decompression, you do the opposite. Read in chunks of seven bytes, expand using bitwise operators and write out eight bytes. You can also tell which bits are padding at the end based solely on the size of the last section read in.
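The reverse step for a full chunk could be sketched like this, assuming the same bit layout as in the diagram above. The name unpack8 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expands seven packed bytes back into eight
   7-bit bytes, undoing the layout aaaaaaab bbbbbbcc ... ghhhhhhh. */
void unpack8(const unsigned char input[7], unsigned char output[8])
{
    assert(input != NULL && output != NULL);

    output[0] = (unsigned char)(input[0] >> 1);
    output[1] = (unsigned char)(((input[0] & 0x01) << 6) | (input[1] >> 2));
    output[2] = (unsigned char)(((input[1] & 0x03) << 5) | (input[2] >> 3));
    output[3] = (unsigned char)(((input[2] & 0x07) << 4) | (input[3] >> 4));
    output[4] = (unsigned char)(((input[3] & 0x0f) << 3) | (input[4] >> 5));
    output[5] = (unsigned char)(((input[4] & 0x1f) << 2) | (input[5] >> 6));
    output[6] = (unsigned char)(((input[5] & 0x3f) << 1) | (input[6] >> 7));
    output[7] = (unsigned char)(input[6] & 0x7f);
}
```

One thing to watch: a final chunk of seven input bytes and one of eight both compress to seven bytes. With zero padding, the seven-byte case decodes to an eighth byte of 0x00, so this only stays unambiguous if the original text never contains NUL characters.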

Deceive 欺骗
#4 · 2019-06-10 19:54

As paxdiablo says, the simplest way to do it is in chunks of eight bytes. But why shift all eight bytes? You can pack the seven data bits of the last byte into the unused top bits of the first seven bytes. Simple and fast...

Output[0] = ((Input[0] & 0x7f) | ((Input[7] << 1) & 0x80));  // bit 6 of Input[7] -> top bit of Output[0]
Output[1] = ((Input[1] & 0x7f) | ((Input[7] << 2) & 0x80));  // bit 5 of Input[7] -> top bit of Output[1]
Output[2] = ((Input[2] & 0x7f) | ((Input[7] << 3) & 0x80));  // bit 4 of Input[7] -> top bit of Output[2]
...

To restore, collect the top bit (bit 7) of each of the seven bytes back into the eighth byte, then clear the top bit in each of the seven bytes.
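Written as loops, this idea might look like the sketch below. It assumes the bits being distributed are the seven data bits of the eighth byte (bits 6 down to 0, since bit 7 of an ASCII byte is always zero); the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores bit (6 - i) of input[7] in the
   otherwise unused top bit of output[i], for i = 0..6. */
void pack_top_bits(const unsigned char input[8], unsigned char output[7])
{
    assert(input != NULL && output != NULL);
    for (int i = 0; i < 7; i++)
        output[i] = (unsigned char)((input[i] & 0x7f) |
                                    ((input[7] << (i + 1)) & 0x80));
}

/* Hypothetical helper: rebuilds the eighth byte from the seven top
   bits and clears the top bit of the first seven bytes. */
void unpack_top_bits(const unsigned char input[7], unsigned char output[8])
{
    unsigned char last = 0;

    assert(input != NULL && output != NULL);
    for (int i = 0; i < 7; i++) {
        output[i] = input[i] & 0x7f;
        last |= (unsigned char)((input[i] & 0x80) >> (i + 1));
    }
    output[7] = last;
}
```

The first seven characters stay readable in the packed data apart from their top bit, which is the main practical difference from the shifting approach.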
