How can I profile file I/O?

Posted 2019-03-14 02:28

Question:

Our build is annoyingly slow. It's a Java system built with Ant, and I'm running mine on Windows XP. Depending on the hardware, it can take between 5 and 15 minutes to complete.

Watching overall performance metrics on the machine, as well as correlating hardware differences with build times, indicates that the process is I/O bound. It also shows that the process does a lot more reading than writing.

However, I haven't found a good way to determine which files are being read or written, and how many times. My suspicion is that with our many subprojects and subsequent invocations of the compiler, the build is re-reading the same commonly used libraries many times.

What are some profiling tools that will tell me what a given process is doing with which files? Free is nice, but not essential.


Using Process Monitor, as suggested by Jon Skeet, I was able to confirm my suspicion: almost all of the disk activity was reading and re-reading of libraries, with the JDK's copies of "rt.jar" and other libraries at the top of the list. I can't make a RAM disk large enough to hold all the libraries I used, but mounting the "hottest" libraries on a RAM disk cut build times by about 40%; clearly, Windows file system caching isn't doing a good enough job, even though I've told Windows to optimize for that.

One interesting thing I noticed is that the typical 'read' operation on a JAR file is just a few dozen bytes; usually there are two or three of these, followed by a skip several kilobytes further on in the file. This access pattern appears to be ill-suited to bulk reads.

I'm going to do more testing with all of my third-party libraries on a flash drive, and see what effect that has.

Answer 1:

If you only need it for Windows, SysInternals Process Monitor should show you everything you need to know. You can select the process, then see each operation as it happens and get a summary of file operations as well.
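
To answer the "which files, and how many times" part, a Process Monitor capture can be saved as CSV and tallied per file. The following is a minimal sketch, not an official Process Monitor tool. It assumes the default export columns ("Process Name", "Operation", "Path", "Detail") and assumes that the Detail column of a ReadFile event contains text like "Length: 4,096"; the function name `summarize_reads` and the default process name `java.exe` are made up for illustration.

```python
import csv
import re
from collections import defaultdict

# Assumption: ReadFile events carry a Detail string such as
# "Offset: 0, Length: 4,096, Priority: Normal".
LENGTH_RE = re.compile(r"Length:\s*([\d,]+)")


def summarize_reads(csv_file, process_name="java.exe"):
    """Return {path: (read_count, total_bytes)} for ReadFile events.

    csv_file is any iterable of CSV lines (for example an open file).
    process_name is a hypothetical default; adjust it to your build.
    """
    stats = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(csv_file):
        if row.get("Operation") != "ReadFile":
            continue
        if (row.get("Process Name") or "").lower() != process_name.lower():
            continue
        match = LENGTH_RE.search(row.get("Detail") or "")
        # Strip thousands separators before converting to an integer.
        nbytes = int(match.group(1).replace(",", "")) if match else 0
        entry = stats[row.get("Path") or ""]
        entry[0] += 1        # number of read calls on this file
        entry[1] += nbytes   # bytes requested from this file
    return {path: (count, total) for path, (count, total) in stats.items()}
```

A possible way to use it: open the exported file with `open("build.csv", newline="", encoding="utf-8-sig")` (the file name is a placeholder), pass the handle to `summarize_reads`, and sort the result by read count to see which libraries are read most often.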



Answer 2:

An oldie but a goodie: create a RAM disk and compile your files from there.



Answer 3:

Back when I still used Windows, I used to get good results speeding up my build by having all build output written to a separate partition of maybe 3 GB in size, and reformatting that partition at night once a week via a scheduled task. It's just build output, so it doesn't matter if it gets unilaterally flattened occasionally.

But honestly, since moving to Linux, disk fragmentation is something I never worry about any more.

Another reason to try your build on Linux, at least once, is so that you can run strace (grepped for calls to open) to see what files your build is touching.
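
As a rough sketch of that strace approach: a trace could be captured with something like `strace -f -e trace=open,openat -o build.strace ant` and then tallied per file. The script below is only an illustration. The file name `build.strace`, the function name `count_opens`, and the exact line shapes it matches are assumptions; strace output varies between versions and options.

```python
import re
from collections import Counter

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...).
# A leading PID column (as written by `strace -f -o file`) is ignored.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]+)"')

# Failed calls end in "= -1 ERRNO (...)", for example ENOENT.
FAILED_RE = re.compile(r"=\s*-1\b")


def count_opens(lines):
    """Count successful open/openat calls per path in strace output lines."""
    counts = Counter()
    for line in lines:
        match = OPEN_RE.search(line)
        if match and not FAILED_RE.search(line):
            counts[match.group(1)] += 1
    return counts
```

Calling `count_opens(open("build.strace"))` and then `.most_common(20)` on the result would list the most frequently opened files, which is one way to check whether the same libraries are opened over and over.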



Answer 4:

I used to build a massive Java webapp (JSP frontend) using Ant on Windows and it would take upwards of 3 minutes. I wiped my computer and installed Linux, and suddenly the builds took 18 seconds. Those are real numbers, albeit about 3 years old. I can only assume that Java prefers the Linux memory management and threading models to the Windows equivalents, as all Java programs appear to run better under Linux in my experience (especially Eclipse). Linux seems a lot better about preventing extra reads from the disk when you're doing a lot of reading of files that haven't changed (e.g., executables and libraries). This may be a property of the disk cache or the filesystem; I'm not sure which.

One of the great things about Java is that it's cross-platform, so setting up a Linux-based build server is actually an option for you. Being something of a Linux evangelist, I'd of course prefer to see you switch your dev environment to Linux, but I know that a lot of people don't want to do that (or can't for practical reasons).

If you're not willing to even set up a Linux build server to see if it runs faster, you could at least try defragmenting your Windows machine's hard drive. That makes a huge difference for C++ builds on my work computer. Try JkDefrag, which seems a lot better than the defragmenter that comes with Windows.

EDIT: I'd assume I got a downvote because my answer doesn't address the exact question asked. It is, however, in the tradition of StackOverflow to help people fix their real problem, not just treat the symptoms. I'm not one of those people for whom the answer to every question is "use linux". In this instance, however, I have very real, measured performance gains in exactly the situation the OP is asking about, so I thought it worth sharing my experiences.



Answer 5:

Actually FileMon is a more direct tool than ProcMon. In general, when running performance analysis for disk I/O, consider the following two:

  • Throughput (speed of read/write of bytes per second)
  • Latency (how long read/write requests spend waiting in the queue)

Once you evaluate the performance of your system in terms of the above, it is easy to identify the bottleneck and take corrective action: get faster disks or change your code (whichever works out cheaper).
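
To make the two metrics concrete, here is a small sketch that reads one file sequentially and reports both. It measures latency per read call as seen by the application, which is only a proxy for time spent in the disk queue, and a second run on the same file mostly measures the OS file cache rather than the disk. The function name `measure_reads` and the 64 KB chunk size are arbitrary choices for illustration.

```python
import time


def measure_reads(path, chunk_size=64 * 1024):
    """Read a file sequentially and return throughput and latency figures.

    Returns a dict with:
      bytes         total bytes read
      reads         number of read calls that returned data
      throughput    bytes per second over the time spent inside read()
      mean_latency  average seconds per read call
    """
    total_bytes = 0
    latencies = []
    # buffering=0 bypasses Python's own buffer, so each read() is one OS call.
    with open(path, "rb", buffering=0) as f:
        while True:
            start = time.perf_counter()
            chunk = f.read(chunk_size)
            elapsed = time.perf_counter() - start
            if not chunk:  # end of file
                break
            latencies.append(elapsed)
            total_bytes += len(chunk)
    read_time = sum(latencies)
    return {
        "bytes": total_bytes,
        "reads": len(latencies),
        "throughput": total_bytes / read_time if read_time > 0 else float("inf"),
        "mean_latency": read_time / len(latencies) if latencies else 0.0,
    }
```

Running it with a tiny `chunk_size` (a few dozen bytes) versus a large one on the same JAR gives a feel for how much per-call overhead many small reads add compared with a few bulk reads.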