Linux kernel AIO, open system call

Posted 2020-07-03 04:09

Question:

Why doesn't Linux kernel AIO support an asynchronous 'open' system call? After all, 'open' can block on the filesystem for a long time, can't it?

Answer 1:

First off, this is a perfectly fine and legitimate question; the downvote was unfortunate and probably pushed away people more knowledgeable than I am.

AFAICT, there is no good reason. The discussion you managed to dig up is relevant, but not satisfactory at all (which is probably your conclusion as well). Though Torvalds's points are technically correct, they clearly dismiss the elephant in the room -- GUI programming -- as well as many other use-cases, I'm sure.

  • Yes, network servers will be bound by network latency. It's a bit dubious that it should be a reason not to care about all other IO, but I can accept that.

  • Yes, many server workloads will be able to make use of the dentry/inode cache, but not all, and there will always be misses.

  • Yes, the argument "buy more RAM" works. I've never found it to be a good argument.

  • And then there are all the other use-cases. For many, including GUI programming, it doesn't matter whether we block sometimes or a lot; we should never block, ever. If the access patterns are very random and distant in time, buying more RAM won't help either -- short of having as much RAM as the secondary storage has capacity.

    The idea that "it should be fast anyway" is also wrong; always consider remote filesystems.

The only compelling point is this:

Short and sweet: "aio_open()" is basically never supposed to be an issue. If it is, you've misdesigned something, or you're trying too damn hard to single-thread everything (and "hiding" the threading that does happen by just calling it "AIO" instead - lying to yourself, in short).

The point here is precisely to avoid threading, so this remark surprised me. The mere fact that the other arguments were even enumerated suggests to me that this one is too fragile to stand on its own.

Digging around in the same discussion, you can find this post by Mikulas Patocka:

You can emulate asynchronous IO with kernel threads like FreeBSD and some commercial Unices do, but you still need as many (possibly kernel) threads as many requests you are servicing.

(...)

Making real async IO would require to rewrite all filesystems and whole VFS _from_scratch_. It won't happen.

http://lkml.iu.edu//hypermail/linux/kernel/0102.1/0074.html

This sounds like a proper explanation, though clearly not a good one.

Keep in mind this is an old thread and a lot has changed since then, so this answer has very little value. However, it provides insight as to why a hypothetical aio_open wasn't available historically. Also, understand that many kernel discussions (or any internal discussion for any project for that matter) usually expect that all participants start with a large set of assumptions. It's thus entirely possible that I'm not looking at this the right way.

That being said, this bit is interesting (Stephen C. Tweedie):

Ahh, but even VMS SYS$QIO is synchronous at doing opens, allocation of the IO request packets, and mapping file location to disk blocks. Only the data IO is ever async (and Ben's async IO stuff for Linux provides that too).

http://lkml.iu.edu//hypermail/linux/kernel/0102.1/0139.html

Why is it interesting? Because it reinforces the notion that a lot of different systems don't implement open (and other calls) asynchronously. Furthermore, aio_open is not specified by POSIX, and I cannot find discussions explaining why. Windows also seems to ignore the issue.

It's as if there were something inherent to the concept that is wrong or difficult, except that nobody seems to make a good case for why that is.

My guess is that this is simply low-priority, and always has been. Workarounds such as threading or opening files beforehand are presumably sufficient for enough use-cases that the work needed to provide the functionality could never be justified.
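
For illustration, here is a minimal sketch of the threading workaround, written in Python rather than C for brevity. It is not an async open: the call still blocks, only on a pool thread instead of the caller. The helper names `open_async`, `read_first_bytes` and `demo` are hypothetical, made up for this example.

```python
import asyncio
import os
import tempfile


async def open_async(path, flags=os.O_RDONLY):
    """Hypothetical helper: run the blocking os.open() on a worker thread."""
    loop = asyncio.get_running_loop()
    # The open still blocks, but it blocks a pool thread, not the event loop.
    return await loop.run_in_executor(None, os.open, path, flags)


async def read_first_bytes(path, count):
    """Open a file without stalling the event loop, then read a few bytes."""
    fd = await open_async(path)
    try:
        # The read is left synchronous to keep the sketch short.
        return os.read(fd, count)
    finally:
        os.close(fd)


def demo():
    """Create a scratch file, read it back through open_async, clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello aio")
        path = tmp.name
    try:
        return asyncio.run(read_first_bytes(path, 5))
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

This is exactly the kind of "hidden" threading the quoted remark objects to: one thread is tied up per outstanding open, so it scales only as far as the pool does.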

It would be interesting to learn why POSIX doesn't define such a call. I expect an "out of scope" rationale though.

If you want to get to the bottom of this, I suspect you'll have to bring the discussion to more appropriate channels, such as LKML.



Answer 2:

I wrote a fairly simple yet powerful C utility, cpaio, that copies a bunch of files using Linux native aio + O_DIRECT (io_submit, io_getevents). I just made it open files as early as possible, queue the initial aio reads, and only bother to look for the results of the reads once it had opened enough files (or all of them if there were few enough). It would have been nice to have a way to async open a file, but in the end it wasn't a big deal.
I've copied tens of TB with this tool.
In the end I think not having async open makes sense. It's a complex operation; the kernel would essentially have to fire off a thread to handle it anyway.
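
The ordering described above (open early, queue the first read, reap results only once enough files are open) can be sketched roughly as follows. This is not the cpaio code: Python's standard library does not expose io_submit/io_getevents, so a thread pool stands in for the kernel AIO queue and O_DIRECT is left out. `read_heads`, `demo`, `CHUNK` and `MAX_IN_FLIGHT` are hypothetical names and values chosen for the example.

```python
import os
import tempfile
from collections import deque
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096       # assumed size of the initial read per file
MAX_IN_FLIGHT = 4  # assumed cap on files with a read still pending


def read_heads(paths, max_in_flight=MAX_IN_FLIGHT):
    """Open each file synchronously, queue its first read, reap lazily."""
    results = {}
    pending = deque()  # (path, fd, future), in submission order
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:

        def reap_one():
            # Stand-in for io_getevents: wait for the oldest queued read.
            path, fd, future = pending.popleft()
            try:
                results[path] = future.result()
            finally:
                os.close(fd)

        for path in paths:
            # The open itself stays synchronous, as in the answer.
            fd = os.open(path, os.O_RDONLY)
            # Stand-in for io_submit: queue the read and move on.
            pending.append((path, fd, pool.submit(os.pread, fd, CHUNK, 0)))
            if len(pending) >= max_in_flight:
                reap_one()
        while pending:
            reap_one()
    return results


def demo():
    """Write six small scratch files and read their heads back in order."""
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = []
        for i in range(6):
            path = os.path.join(tmpdir, f"file{i}")
            with open(path, "wb") as f:
                f.write(f"data{i}".encode())
            paths.append(path)
        heads = read_heads(paths, max_in_flight=2)
        return [heads[p] for p in paths]


if __name__ == "__main__":
    print(demo())
```

The point of the pattern is that each synchronous open overlaps with reads already in flight for earlier files, which is why the lack of an async open ends up mattering little for a bulk copy.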



Tags: linux kernel aio