What do “chunk”, “block”, “offset”, “buffer”, and

Posted 2020-05-28 08:52

Question:

I have seen scripts that deal with archives or binary data, or that copy files without using Python's built-in functions, use the terms chunk, block, offset, buffer, and sector.

I have created a Python application, and a few of its requirements are met by external libraries (archiving / extracting data) or binaries. I would now like to dive deeper and bring those third-party features into my application by writing a module of my own. I would like to know what those terms mean and where I can get started. Is there any documentation on the subject?

Any documentation on these terms in the context of the Python programming language would also be appreciated.

Answer 1:

Chunk is used for any (typically rather large) amount of data that is only a part of a larger whole, with no fixed size implied, e.g. the first 1000 bytes of a file. The next 3000 bytes could be the next chunk.
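Reading a file chunk by chunk might look like the following minimal sketch. The helper name `iter_chunks` and the chunk size of 4096 are arbitrary choices for illustration, and `io.BytesIO` stands in for a real file opened in binary mode.

```python
import io

def iter_chunks(stream, chunk_size=4096):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals the end of the stream
            break
        yield chunk

# io.BytesIO stands in for a real file opened with open(path, "rb")
data = io.BytesIO(b"x" * 10000)
sizes = [len(chunk) for chunk in iter_chunks(data, chunk_size=4096)]
print(sizes)  # [4096, 4096, 1808]
```

The last chunk is simply whatever is left over, which is why chunk sizes are not guaranteed to be equal.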

Block is used for a fixed amount of data (the size is typically determined by technical constraints) that is usually only part of a whole, e.g. the first 1024 bytes of a file. The next block would then also be 1024 bytes long. Sometimes not all of a block is used: in a file of 1034 bytes, the second and last block is still 1024 bytes large, but only 10 bytes of it are in use.
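The arithmetic behind the 1034-byte example can be sketched as follows. The function name `block_layout` is made up for this illustration, and the block size of 1024 is taken from the example above.

```python
def block_layout(file_size, block_size=1024):
    """Return (number of blocks, bytes in use in the last block)."""
    if file_size == 0:
        return 0, 0
    num_blocks = -(-file_size // block_size)  # ceiling division
    used_in_last = file_size - (num_blocks - 1) * block_size
    return num_blocks, used_in_last

print(block_layout(1034))  # (2, 10)
print(block_layout(2048))  # (2, 1024)
```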

Offset is a positional distance, typically between the beginning of something and the position of interest; e.g. if the temperature in a file of weather data is stored 23 bytes after the start of the file, then the temperature's offset is 23 bytes. Offsets are usually counted from zero, so the first byte has offset 0 and offset 23 refers to the 24th byte. It can also be a shift of a data position: if a file is corrupted because all bytes were shifted 32 bytes towards the end (for example after 32 zeros were inserted at the beginning), then the whole file has an offset of 32 bytes.
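In Python, jumping to an offset is done with `seek()`. The sketch below assumes a made-up record layout (23 bytes of other fields, then a one-byte temperature) purely for illustration; `io.BytesIO` again stands in for a real binary file.

```python
import io

# Hypothetical layout: 23 bytes of other fields, a one-byte temperature, 8 more bytes
record = bytes(23) + bytes([21]) + bytes(8)
stream = io.BytesIO(record)  # stands in for open(path, "rb")

TEMPERATURE_OFFSET = 23
stream.seek(TEMPERATURE_OFFSET)   # move to offset 23, counted from the start (offset 0)
temperature = stream.read(1)[0]   # read one byte and take its integer value
print(temperature)    # 21
print(stream.tell())  # 24, the position just after the byte that was read
```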

Buffer is a piece of memory in which data is collected so that it can be processed as a whole once the buffer is full (or nearly full). A typical example is buffered output: single characters are buffered until a line is complete, and then the whole line is printed to the terminal in one write operation. Sometimes buffers have a fixed size, sometimes they just have an upper limit.
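The line-buffering behaviour described above can be imitated with a small toy class. `LineBuffer` is a hypothetical name for this sketch and not how Python's own I/O buffering is implemented; it only shows the idea of collecting characters and handing them on one complete line at a time.

```python
class LineBuffer:
    """Collect text and pass it on one complete line at a time."""

    def __init__(self, sink):
        self._sink = sink    # callable that receives each complete line
        self._pending = ""   # characters collected but not yet passed on

    def write(self, text):
        self._pending += text
        while "\n" in self._pending:
            line, self._pending = self._pending.split("\n", 1)
            self._sink(line + "\n")  # one write per complete line

    def flush(self):
        """Pass on whatever is left, even if the line is incomplete."""
        if self._pending:
            self._sink(self._pending)
            self._pending = ""

writes = []
buf = LineBuffer(writes.append)
for ch in "hi\nyo":
    buf.write(ch)   # characters arrive one at a time
print(writes)       # ['hi\n']
buf.flush()
print(writes)       # ['hi\n', 'yo']
```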

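Sector is like a block, a fixed-size part of a whole, but tied even more closely to a technical origin. The whole in this case is often a piece of hardware (like a hard drive or a CD), where the sector is the smallest unit that can be read or written. The two terms are often used together: a file system block typically consists of one or more sectors.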
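Because sectors have a fixed size, a sector number translates directly into a byte offset. The sketch below assumes a sector size of 512 bytes (other sizes exist, for example 4096 bytes on many newer drives and 2048 bytes for CD-ROM data sectors) and uses an in-memory `io.BytesIO` object in place of a real disk image; `read_sector` is a made-up helper name.

```python
import io

SECTOR_SIZE = 512  # assumed for this example; real devices may differ

def read_sector(device, sector_number, sector_size=SECTOR_SIZE):
    """Read one whole sector from a seekable binary stream."""
    device.seek(sector_number * sector_size)  # sector number -> byte offset
    return device.read(sector_size)

# Three sectors filled with A, B and C stand in for a disk image
image = io.BytesIO(b"A" * 512 + b"B" * 512 + b"C" * 512)
sector = read_sector(image, 1)
print(sector[:4])   # b'BBBB'
print(len(sector))  # 512
```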