Where are ioctl parameters (such as 0x1268 / BLKSS

2019-04-26 10:44发布

问题:

I am looking for a definitive specification describing the expected arguments and behavior of ioctl 0x1268 (BLKSSZGET).

This number is declared in many places (none of which contain a definitive reference source), such as linux/fs.h, but I can find no specification for it.

Surely, somebody at some point in the past decided that 0x1268 would get the physical sector size of a device and documented that somewhere. Where does this information come from and where can I find it?

Edit: I am not asking what BLKSSZGET does in general, nor am I asking what header it is defined in. I am looking for a definitive, standardized source that states what argument types it should take and what its behavior should be for any driver that implements it.

Specifically, I am asking because there appears to be a bug in blkdiscard in util-linux 2.23 (and 2.24) where the sector size is queried in to a uint64_t, but the high 32-bits are untouched since BLKSSZGET appears to expect a 32-bit integer, and this leads to an incorrect sector size, incorrect alignment calculations, and failures in blkdiscard when it should succeed. So before I submit a patch, I need to determine, with absolute certainty, if the problem is that blkdiscard should be using a 32-bit integer, or if the driver implementation in my kernel should be using a 64-bit integer.

Edit 2: Since we're on the topic, the proposed patch presuming blkdiscard is incorrect is:

--- sys-utils/blkdiscard.c-2.23 2013-11-01 18:28:19.270004947 -0400
+++ sys-utils/blkdiscard.c  2013-11-01 18:29:07.334002382 -0400
@@ -71,7 +71,8 @@
 {
    char *path;
    int c, fd, verbose = 0, secure = 0;
-   uint64_t end, blksize, secsize, range[2];
+   uint64_t end, blksize, range[2];
+   uint32_t secsize;
    struct stat sb;

    static const struct option longopts[] = {
@@ -146,8 +147,8 @@
        err(EXIT_FAILURE, _("%s: BLKSSZGET ioctl failed"), path);

    /* align range to the sector size */
-   range[0] = (range[0] + secsize - 1) & ~(secsize - 1);
-   range[1] &= ~(secsize - 1);
+   range[0] = (range[0] + (uint64_t)secsize - 1) & ~((uint64_t)secsize - 1);
+   range[1] &= ~((uint64_t)secsize - 1);

    /* is the range end behind the end of the device ?*/
    end = range[0] + range[1];

Applied to e.g. https://www.kernel.org/pub/linux/utils/util-linux/v2.23/.

回答1:

The answer to "where is this specified?" does seem to be the kernel source.

I asked the question on the kernel mailing list here: https://lkml.org/lkml/2013/11/1/620

In response, Theodore Ts'o wrote (note: he mistakenly identified sys-utils/blkdiscard.c in his list but it's inconsequential):

BLKSSZGET returns an int.  If you look at the sources of util-linux
v2.23, you'll see it passes an int to BLKSSZGET in 

    sys-utils/blkdiscard.c
    lib/blkdev.c

E2fsprogs also expects BLKSSZGET to return an int, and if you look at
the kernel sources, it very clearly returns an int.

The one place it doesn't is in sys-utils/blkdiscard.c, where as you
have noted, it is passing in a uint64 to BLKSSZGET.  This looks like
it's a bug in sys-util/blkdiscard.c.

He then went on to submit a patch¹ to blkdiscard at util-linux:

--- a/sys-utils/blkdiscard.c
+++ b/sys-utils/blkdiscard.c
@@ -70,8 +70,8 @@ static void __attribute__((__noreturn__)) usage(FILE *out)
 int main(int argc, char **argv)
 {
        char *path;
-       int c, fd, verbose = 0, secure = 0;
-       uint64_t end, blksize, secsize, range[2];
+       int c, fd, verbose = 0, secure = 0, secsize;
+       uint64_t end, blksize, range[2];
        struct stat sb;

        static const struct option longopts[] = {

I had been hesitant to mention the blkdiscard tool in both my mailing list post and the original version of this SO question specifically for this reason: I know what's in my kernel's source, it's already easy enough to modify blkdiscard to agree with the source, and this ended up distracting from the real question of "where is this documented?".

So, as for the specifics, somebody more official than me has also stated that the BLKSSZGET ioctl takes an int, but the general question regarding documentation remained. I then followed up with https://lkml.org/lkml/2013/11/3/125 and received another reply from Theodore Ts'o (wiki for credibility) answering the question. He wrote:

> There was a bigger question hidden behind the context there that I'm
> still wondering about: Are these ioctl interfaces specified and
> documented somewhere? From what I've seen, and from your response, the
> implication is that the kernel source *is* the specification, and not
> document exists that the kernel is expected to comply with; is this
> the case?

The kernel source is the specification.  Some of these ioctl are
documented as part of the linux man pages, for which the project home
page is here:

     https://www.kernel.org/doc/man-pages/

However, these document existing practice; if there is a discrepancy
between what is in the kernel has implemented and the Linux man pages,
it is the Linux man pages which are buggy and which will be changed.
That is man pages are descriptive, not perscriptive.

I also asked about the use of "int" in general for public kernel APIs, his response is there although that is off-topic here.

Answer: So, there you have it, the final answer is: The ioctl interfaces are specified by the kernel source itself; there is no document that the kernel adheres to. There is documentation to describe the kernel's implementations of various ioctls, but if there is a mismatch, it is an error in the documentation, not in the kernel.

¹ With all the above in mind, I want to point out that an important difference in the patch Theodore Ts'o submitted, compared to mine, is the use of "int" rather than "uint32_t" -- BLKSSZGET, as per kernel source, does indeed expect an argument that is whatever size "int" is on the platform, not a forced 32-bit value.