RAM drive for compiling - is there such a thing?

Posted 2019-01-10 06:41

An answer (see below) to one of the questions right here on Stack Overflow gave me an idea for a great little piece of software that could be invaluable to coders everywhere.

I'm imagining RAM drive software, but with one crucial difference - it would mirror a real folder on my hard drive. More specifically - the folder which contains the project I'm currently working on. This way any builds would be nearly instantaneous (or at least a couple of orders of magnitude faster). The RAM drive would synchronize its contents with the hard disk drive in the background using only idle resources.

A quick Google search revealed nothing, but perhaps I just don't know how to Google. Perhaps someone knows of such software? Preferably free, but reasonable fees might be OK too.

Added: Some solutions have been suggested which I discarded in the very beginning. They would be (in no particular order):

  • Buy a faster hard disk drive (SSD maybe or 10K RPM). I don't want a hardware solution. Not only does software have the potential to be cheaper (freeware, anyone?), but it can also be used in environments where hardware modifications would be unwelcome if not impossible - say, at the office.
  • Let OS/HDD do the caching - it knows better how to use your free RAM. The OS/HDD have generic cache algorithms that cache everything and try to predict which data will be most needed in the future. They have no idea that for me the priority is my project folder. And as we all know quite well - they don't really cache it much anyway. ;)
  • There are plenty of RAM drives around; use one of those. Sorry, that would be reckless. I need my data to be synchronized back to the HDD whenever there is a bit of free time. In the case of a power failure I could bear losing the last five minutes of work, but not everything since my last checkin.

Added 2: An idea that came up - use a normal RAM drive plus a background folder synchronizer (but I do mean background). Is there any such thing?
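
As a rough illustration of the "normal RAM drive plus background synchronizer" idea on a Unix-like system, the sketch below copies only the files modified since the previous pass, using a timestamp file and `find -newer`. The function name `sync_changed`, the stamp-file approach, and the paths in the usage comment are assumptions made for illustration; they do not describe any existing tool.

```shell
#!/bin/sh
# Hypothetical sketch: copy files changed since the last pass from SRC to DST.
# STAMP is a file (kept outside SRC) whose mtime records the previous pass.
# SRC should be given without a trailing slash.
sync_changed() {   # usage: sync_changed SRC DST STAMP
    src=$1; dst=$2; stamp=$3
    mkdir -p "$dst"
    : > "$stamp.new"              # created before scanning, so files changed
                                  # during this pass are picked up next time
    if [ -f "$stamp" ]; then
        find "$src" -type f -newer "$stamp"
    else
        find "$src" -type f       # first pass: copy everything
    fi | while IFS= read -r file; do
        rel=${file#"$src"/}       # path relative to SRC
        mkdir -p "$dst/$(dirname "$rel")"
        cp -p "$file" "$dst/$rel"
    done
    mv "$stamp.new" "$stamp"
}

# Possible background loop (example paths only):
#   while sleep 30; do
#       sync_changed /Volumes/ramspace/someproject "$HOME/code/someproject" \
#                    "$HOME/.ramsync-stamp"
#   done &
```

This sketch does not propagate deletions and does not handle file names containing newlines. It can also miss a change made in the same instant as the stamp on filesystems with coarse timestamps, so a periodic full copy would still be a sensible safety net.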

Added 3: Interesting. I just tried out a simple RAM drive at work. The rebuild time drops from ~14 secs to ~7 secs (not bad), but incremental build is still at ~5 secs - just like on the HDD. Any ideas why? It uses aspnet_compiler and aspnet_merge. Perhaps they do something with other temp files elsewhere?

Added 4: Oh, nice new set of answers! :) OK, I've got a bit more info for all you naysayers. :)

One of the main reasons for this idea is not the above-mentioned software (14 secs build time), but another one that I didn't have access to at the time. This other application has a 100 MB code base, and its full build takes about 5 minutes. Ah yes, it's in Delphi 5, so the compiler isn't too advanced. :) Putting the source on a RAM drive resulted in a BIG difference. I got it below a minute, I think. I haven't measured. So for all those who say that the OS can cache stuff better - I'd beg to differ.

Related Question:

RAM disk for speed up IDE

Note on first link: The question to which it links has been deleted because it was a duplicate. It asked:

What do you do while your code’s compiling?

And the answer by Dmitri Nesteruk to which I linked was:

I compile almost instantly. Partly due to my projects being small, partly due to the use of RAM disks.

18 answers
在下西门庆
#2 · 2019-01-10 06:59

Yep, I've met the same problem. And after fruitless googling I just wrote a Windows Service for lazily backing up the RAM drive (actually - any folder, because a RAM drive can be mounted into, for example, the desktop).

http://bitbucket.org/xkip/transparentbackup You can specify an interval for a full scan (default 5 minutes) and an interval for scanning only notified files (default 30 seconds). A scan detects changed files using the 'archive' attribute (the OS sets it whenever a file is modified, specifically for archiving purposes). Only files flagged that way are backed up.

The service leaves a special marker file to make sure that the target backup is exactly a backup of the source. If the source is empty and does not contain a marker file, the service performs an automatic restore from the backup. So you can easily destroy the RAM drive and create it again, and the data is restored automatically. To make this work transparently, it is best to use a RAM drive that can create its partition at system startup.
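
The marker-file behavior described above might look roughly like this in a POSIX shell. This is only an illustrative sketch, not the actual TransparentBackup code (which is a Windows Service); the marker name `.backup-marker` and the function name `backup_or_restore` are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the marker-file logic; not the real service's code.
MARKER=.backup-marker          # assumed marker file name

backup_or_restore() {          # usage: backup_or_restore SRC BACKUP
    src=$1; backup=$2
    mkdir -p "$src" "$backup"
    if [ -z "$(ls -A "$src")" ] && [ -f "$backup/$MARKER" ]; then
        # Source is empty (e.g. a freshly recreated RAM drive) and a
        # marked backup exists: restore instead of backing up.
        cp -Rp "$backup/." "$src/"
        echo restored
    else
        : > "$src/$MARKER"     # mark the source as the live copy
        cp -Rp "$src/." "$backup/"
        echo backed-up
    fi
}
```

For brevity the sketch copies the whole tree on every backup; the real service is described as copying only the files flagged as changed.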

Another solution that I've recently discovered is SuperSpeed SuperCache.

This company also has a RAM disk, but that is a separate product. SuperCache lets you use extra RAM for block-level caching (which is very different from file caching); another option is to mirror your drive to RAM completely. In either scenario you can specify how often dirty blocks are flushed back to the hard disk drive, which makes writes as fast as on a RAM drive; the mirror scenario also makes reads as fast as from a RAM drive. You can create a small partition, for example 2 GB (using Windows), and map the entire partition to RAM.

One interesting and very useful thing about that solution: you can change the caching and mirroring options at any time with two clicks. For example, if you want your 2 GB back for gaming or a virtual machine, you can stop mirroring instantly and release the memory. Even open file handles do not break - the partition continues to work, but as an ordinary drive.

EDIT: I also highly recommend moving the TEMP folder to the RAM drive, because compilers usually do a lot of work in the temp folder. In my case it gave me another 30% of compilation speed.
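
The advice above is about the Windows TEMP folder. On Unix-like systems the closest equivalent is the `TMPDIR` environment variable, which many tools (GCC among them) consult for temporary files. A minimal sketch, assuming a RAM drive is already mounted; the function name `with_ram_tmp` and the example path are hypothetical:

```shell
#!/bin/sh
# Sketch: run one command with its temporary directory on the RAM drive.
with_ram_tmp() {               # usage: with_ram_tmp RAM_TMP_DIR COMMAND [ARGS...]
    ram_tmp=$1; shift
    mkdir -p "$ram_tmp" || return 1
    env TMPDIR="$ram_tmp" "$@" # TMPDIR is set for this command only
}

# Example (path is an assumption): with_ram_tmp /Volumes/ramspace/tmp make
```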

狗以群分
#3 · 2019-01-10 07:02

There are plenty RAMDrives around, use one of those. Sorry, that would be reckless.

Only if you work entirely on the RAM disk, which is silly.

Pseudo-ish shell script, ramMake:

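# set up locations (example paths; adjust to your system)
ramdrive=/Volumes/ramspace
project="$HOME/code/someproject"

# ..create RAM drive..

# sync project directory to RAM drive (trailing slashes copy the contents)
rsync -av "$project/" "$ramdrive/someproject/"

# build inside the RAM drive copy
cd "$ramdrive/someproject" || exit 1
make

# optional: copy the built data back to the project directory
rsync -a "$ramdrive/someproject/build/" "$project/build/"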

That said, your compiler can possibly do this with no additional scripts. Just change your build output location to a RAM disk; for example, in Xcode it's under Preferences, Building, "Place Build Products in:" and "Place Intermediate Build Files in:".

时光不老,我们不散
#4 · 2019-01-10 07:03
  1. Profile. Make sure you do good measurements of each option. You can even buy things you've already rejected, measure them, and return them, so you know you're working from good data.

  2. Get a lot of RAM. 2 GB DIMMs are very cheap; 4 GB DIMMs are a little over US$100/ea, but that's still not a lot of money compared to what computer parts cost just a few years ago. Whether you end up with a RAM disk or just letting the OS do its thing, this will help. If you're running 32-bit Windows, you'll need to switch to 64-bit to make use of anything over 3 GB or so.

  3. Live Mesh can synchronize from your local RAM drive to the cloud or to another computer, giving you an up-to-date backup.

  4. Move just compiler outputs. Keep your source code on the real physical disk, but direct .obj, .dll, and .exe files to be created on the RAM drive.

  5. Consider a DVCS. Clone from the real drive to a new repository on the RAM drive. "push" your changes back to the parent often, say every time all your tests pass.
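
Item 5 could be sketched with Git as the DVCS (the answer does not name one, so Git is an assumption here, as are the function names and the side-branch name `ramdisk-work`). Pushing to a side branch avoids pushing into the branch the on-disk repository has checked out, which Git refuses by default for non-bare repositories.

```shell
#!/bin/sh
# Sketch only: function names and the branch name "ramdisk-work" are assumptions.

# Clone the on-disk repository into a directory on the RAM drive.
ram_clone() {                  # usage: ram_clone DISK_REPO RAM_DIR
    git clone --quiet "$1" "$2"
}

# Publish the RAM-drive commits to a side branch of the on-disk repository.
ram_push() {                   # usage: ram_push RAM_DIR
    git -C "$1" push --quiet origin HEAD:refs/heads/ramdisk-work
}
```

After `ram_push`, the commits are safe on the physical disk, and the on-disk working copy can merge `ramdisk-work` whenever convenient.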

乱世女痞
#5 · 2019-01-10 07:03

Some ideas off the top of my head:

Use Sysinternals' Process Monitor (not Process Explorer) to check what goes on during a build - this will let you see if %temp% is used, for instance (keep in mind that response files are probably created with FILE_ATTRIBUTE_TEMPORARY which should prevent disk writes if possible, though). I've moved my %TEMP% to a RAM disk, and that gives me minor speedups in general.

Get a RAM disk that supports automatically loading/saving disk images, so you don't have to use boot scripts to do this. Sequential read/write of a single disk image is faster than syncing a lot of small files.

Place your often-used/large header files on the RAM disk, and override your compiler standard paths to use the RAM drive copies. It will likely not give that much of an improvement after first-time builds, though, as the OS caches the standard headers.

Keep your source files on your hard drive, and sync to the RAM disk - not the other way around. Check out MirrorFolder for doing realtime synchronization between folders - it achieves this via a filter driver, so it only synchronizes what is necessary (and only copies changes - a 4 KB write to a 2 GB file will only cause a 4 KB write to the target folder). Figure out how to make your IDE build from the RAM drive although the source files are on your hard disk... and keep in mind that you'll need a large RAM drive for large projects.

The star"
#6 · 2019-01-10 07:04

I had the same idea and did some research. I found the following tools that do what you are looking for:

However, I couldn't get the second one working on 64-bit Windows 7 at all, and it doesn't seem to be maintained at the moment.

The VSuite RAM disk, on the other hand, works very well. Unfortunately I couldn't measure any significant performance boost compared to the SSD already in place.

Bombasti
#7 · 2019-01-10 07:04

What can be super beneficial on even a single-core machine is parallel make. Disk I/O is a pretty large factor in the build process. Spawning two compiler instances per CPU core can actually increase performance. As one compiler instance blocks on I/O the other one can usually jump into the CPU intensive part of compiling.

You need to make sure you've got the RAM to support this (shouldn't be a problem on a modern workstation), otherwise you'll end up swapping and that defeats the purpose.

On GNU make you can just use -j[n] where [n] is the number of simultaneous processes to spawn. Make sure you have your dependency tree right before trying it though or the results can be unpredictable.
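
As a small sketch of the -j advice, the helper below derives a job count from the number of online CPU cores, defaulting to the two jobs per core suggested above. The function name `make_jobs` is an assumption; `getconf _NPROCESSORS_ONLN` is available on Linux and macOS, and the sketch falls back to one core elsewhere.

```shell
#!/bin/sh
# Sketch: pick a job count for "make -j" from the number of online cores.
make_jobs() {                  # usage: make_jobs [JOBS_PER_CORE]
    per_core=${1:-2}
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
    echo $((cores * per_core))
}

# Example: make -j"$(make_jobs)"
```

Whether two jobs per core actually helps depends on how I/O-bound the build is, so it is worth timing a few values.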

Another tool that's really useful (in the parallel make fashion) is distcc. It works a treat with GCC (if you can use GCC or something with a similar command line interface). distcc actually breaks up the compile task by pretending to be the compiler and spawning tasks on remote servers. You call it in the same way as you'd call GCC, and you take advantage of make's -j[n] option to call many distcc processes.

At one of my previous jobs we had a fairly intensive Linux operating system build that was performed almost daily for a while. Adding in a couple of dedicated build machines and putting distcc on a few workstations to accept compile jobs allowed us to bring build times down from half a day to under 60 minutes for a complete OS + userspace build.

There are plenty of other tools for speeding up compiles. You might want to investigate more than just creating RAM disks, which looks like it will bring very little gain since the OS already does disk caching in RAM. OS designers spend a lot of time getting caching right for most workloads; they are (collectively) smarter than you, so I wouldn't try to do better than them.

If you chew up RAM for a RAM disk, the OS has less working RAM to cache data and to run your code, so you'll end up with more swapping and worse disk performance than otherwise (note: you should profile this option before completely discarding it).
