Jonathan Leffler's comment in the question "How can I find the Size of some specified files?" is thought-provoking. I will break it into parts for analysis.
-- files are stored on pages;
you normally end up with more space being used than that calculation gives because a 1 byte file (often) occupies one page (of maybe 512 bytes).
- The exact values vary - it was easier in the days of the 7th Edition Unix file system (though not trivial even then
4-5. if you wanted to take account of indirect blocks referenced by the inode as well as the raw data blocks).
Questions about the parts
- What is the definition of "page"?
- Why is the word "maybe" in the after-thought "one page (of maybe 512 bytes)"?
- Why was it easier to measure exact sizes in the "7th Edition Unix file system"?
- What is the definition of "indirect block"?
- How can you have references by two things: "the inode" and "the raw data blocks"?
Historical Questions Emerged
I. What is the historical context Leffler is speaking about?
II. Have the definitions changed over time?
As usual for Wikipedia pages, Block (data storage) is informative despite being far too exuberant about linking all keywords.
There's also a reasonable overview of the classical Unix File System.
Traditionally, hard disk geometry (the layout of blocks on the disk itself) has been CHS.
CHS isn't used much these days, as
but for historical reasons, this has affected block sizes: because sector sizes were almost always 512B, filesystem block sizes have always been multiples of 512B. (There is a movement afoot to introduce drives with 1kB and 4kB sector sizes, but compatibility looks rather painful.)
Generally speaking, smaller filesystem block sizes result in less wasted space when storing many small files (unless advanced techniques like tail merging are in use), while larger block sizes reduce external fragmentation and have lower overhead on large disks. The filesystem block size is usually a power of 2, is limited below by the block device's sector size, and is often limited above by the OS's page size.
The page size varies by OS and platform (and, in the case of Linux, can vary by configuration as well). Like block size, smaller block sizes reduce internal fragmentation but require more administrative overhead. 4kB page sizes on 32-bit platforms is common.
Now, on to describe indirect blocks. In the UFS design,
Thus the amount of storage required for a file may be greater than just the blocks containing its data, when indirect pointers are in use.
Not all filesystems use this method for keeping track of the data blocks belong to a file. FAT simply uses a single file allocation table which is effectively a gigantic series of linked lists, and many modern filesystems use extents.
I think he means block instead of page, a block being the minimum addressable unit on the filesystem.
block sizes can vary
Not sure why but perhaps it the filesystem interface exposed api's allowing a more exact measurement.
An indirect block is a block referenced by a pointer
The inode occupies space (blocks) just as the raw data does. This is what the author meant.