Question:
An answer (see below) to one of the questions right here on Stack Overflow gave me an idea for a great little piece of software that could be invaluable to coders everywhere.
I'm imagining RAM drive software, but with one crucial difference - it would mirror a real folder on my hard drive. More specifically - the folder which contains the project I'm currently working on. This way any builds would be nearly instantaneous (or at least a couple orders of magnitude faster). The RAM drive would synchronize its contents with the hard disk drive in background using only idle resources.
A quick Google search revealed nothing, but perhaps I just don't know how to Google. Perhaps someone knows of such a software? Preferably free, but reasonable fees might be OK too.
Added: Some solutions have been suggested which I discarded in the very beginning. They would be (in no particular order):
- Buy a faster hard disk drive (SSD maybe or 10K RPM). I don't want a hardware solution. Not only does software have the potential to be cheaper (freeware, anyone?), but it can also be used in environments where hardware modifications would be unwelcome if not impossible - say, at the office.
- Let OS/HDD do the caching - it knows better how to use your free RAM. The OS/HDD have generic cache algorithms that cache everything and try to predict which data will be most needed in the future. They have no idea that for me the priority is my project folder. And as we all know quite well - they don't really cache it much anyway. ;)
- There are plenty of RAM drives around; use one of those. Sorry, that would be reckless. I need my data to be synchronized back to the HDD whenever there is a bit of free time. In the case of a power failure I could bear losing the last five minutes of work, but not everything since my last checkin.
Added 2: An idea that came up - use a normal RAM drive plus a background folder synchronizer (but I do mean background). Is there any such thing?
Added 3: Interesting. I just tried out a simple RAM drive at work. The rebuild time drops from ~14 secs to ~7 secs (not bad), but incremental build is still at ~5 secs - just like on the HDD. Any ideas why? It uses aspnet_compiler and aspnet_merge. Perhaps they do something with other temp files elsewhere?
Added 4: Oh, nice new set of answers! :) OK, I've got a bit more info for all you naysayers. :)
One of the main reasons for this idea is not the above-mentioned software (14 secs build time), but another one that I didn't have access to at the time. This other application has a 100 MB code base, and its full build takes about 5 minutes. Ah yes, it's in Delphi 5, so the compiler isn't too advanced. :) Putting the source on a RAM drive resulted in a BIG difference. I got it below a minute, I think. I haven't measured. So for all those who say that the OS can cache stuff better - I'd beg to differ.
Related Question:
RAM disk for speed up IDE
Note on first link:
The question to which it links has been deleted because it was a duplicate. It asked:
What do you do while your code’s compiling?
And the answer by Dmitri Nesteruk to which I linked was:
I compile almost instantly. Partly due to my projects being small, partly due to the use of RAM disks.
Answer 1:
In Linux (you never mentioned which OS you're on, so this could be relevant) you can create block devices from RAM and mount them like any other block device (that is, a HDD).
You can then create scripts that copy to and from that drive on start-up / shutdown, as well as periodically.
For example, you could set it up so you had ~/code and ~/code-real. Your RAM block gets mounted at ~/code on startup, and then everything from ~/code-real (which is on your standard hard drive) gets copied over. On shutdown everything would be copied (rsync'd would be faster) back from ~/code to ~/code-real. You would also probably want that script to run periodically, so you didn't lose much work in the event of a power failure, etc.
I don't do this anymore (I used it for Opera when the 9.5 beta was slow, no need anymore).
Here is how to create a RAM disk in Linux.
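The copy-in, copy-back, and periodic-sync steps described in this answer could be sketched roughly as below. This is only a minimal sketch, not the script the answer's author used: the function names `sync_tree` and `sync_periodically` and the 300-second default interval are made up for illustration, and it assumes the RAM disk is already mounted. Deletions are deliberately not propagated, so an empty RAM disk can never wipe the copy on the hard drive.

```shell
# Hypothetical helper: copy the contents of directory $1 into directory $2.
# Uses rsync when it is installed (only changed files are transferred),
# and falls back to a plain recursive copy otherwise.
# Files deleted in $1 are NOT removed from $2.
sync_tree() {
    local src=$1 dst=$2
    mkdir -p "$dst" || return 1
    if command -v rsync >/dev/null 2>&1; then
        rsync -a "$src"/ "$dst"/
    else
        cp -a "$src"/. "$dst"/
    fi
}

# Hypothetical helper: copy $1 back to $2 every $3 seconds (default 300)
# until the process is killed.
sync_periodically() {
    local ram=$1 real=$2 interval=${3:-300}
    while true; do
        sleep "$interval"
        sync_tree "$ram" "$real"
    done
}
```

With the paths from the example above, a start-up script might call `sync_tree ~/code-real ~/code` followed by `sync_periodically ~/code ~/code-real &`, and a shutdown script would call `sync_tree ~/code ~/code-real` one last time.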
Answer 2:
I'm surprised at how many people suggest that the OS can do a better job at figuring out your caching needs than you can in this specialized case. While I didn't do this for compiling, I did do it for similar processes and I ended up using a RAM disk with scripts that automated the synchronization.
In this case, I think I'd go with a modern source control system. At every compile it would check in the source code (on an experimental branch if needed) automatically so that every compile would result in the data being saved off.
To start development, start the RAM disk and pull the current base line. Do the editing, compile, edit, compile, etc. - all the while the edits are being saved for you.
Do the final check in when happy, and you don't even have to involve your regular hard disk drive.
But there are background synchronizers that will automate things - the issue is that they won't be optimized for programming either and may need to do full directory and file scans occasionally to catch changes. A source code control system is designed for exactly this purpose, so it would likely be lower overhead even though it exists in your build setup.
Keep in mind that a background sync task, in the case of a power outage, is undefined. You would end up having to figure out what was saved and what wasn't saved if things went wrong. With a defined save point (at each compile, or forced by hand) you'd have a pretty good idea that it was at least in a state where you thought you could compile it. Use a VCS and you can easily compare it to the previous code and see what changes you've applied already.
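The "check in at every compile" idea could look something like the sketch below. It assumes Git as the source control system (the answer does not name one), that the current directory is a Git work tree on the RAM disk, and that committing to whatever branch is checked out is acceptable; the function name `checkpoint_and_build` is invented for the example.

```shell
# Hypothetical helper: commit every change in the current Git work tree,
# then run the build command passed as arguments.
checkpoint_and_build() {
    git add -A || return 1
    # Commit only when something is staged, so rebuilding unchanged
    # sources does not pile up empty commits.
    if ! git diff --cached --quiet; then
        git commit -q -m "Checkpoint before build: $(date '+%Y-%m-%d %H:%M:%S')" || return 1
    fi
    "$@"
}
```

Calling `checkpoint_and_build make` instead of `make` gives a defined save point per compile. The commits still live on the RAM disk until they are pushed to a repository on persistent storage, so a `git push` to such a remote would have to follow for them to survive a power failure.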
Answer 3:
See Speeding up emerge with tmpfs (Gentoo Linux wiki).
Speeding up compiles using RAM drives under Gentoo was the subject of a how-to written many eons ago. It provides a concrete example of what has been done. The gist is that all source and intermediate build files are redirected to a RAM disk for compiling, while final binaries are directed to the hard drive for install.
Also, I recommend exploring maintaining your source on the hard drive, but git push your latest source changes to a clone repository that resides on the RAM disk. Compile the clone. Use your favorite script to copy the binaries created.
I hope that helps.
Answer 4:
Your OS will cache things in memory as it works. A RAM disk might seem faster, but that's because you aren't factoring in the "copy to RAMDisk" and "copy from RAMDisk" times. Dedicating RAM to a fixed size ramdisk just reduces the memory available for caching. The OS knows better what needs to be in RAM.
Answer 5:
We used to do this years ago for a 4GL macro-compiler; if you put the macro library and support libraries and your code on a RAM disk, compiling an application (on an 80286) would go from 20 minutes to 30 seconds.
Answer 6:
I don't have exactly what you're looking for, but I'm now using a combination of Ramdisk and DRAM ramdisk. Since this is Windows, I have a hard 3 GB limit for core memory, meaning I cannot use too much memory for a RAM disk. 4 GB extra on the 9010 really rocks it. I let my IDE store all its temporary stuff on the solid state RAM disk and also the Maven repository. The DRAM RAM disk has a battery backup to the flash card. This sounds like an advertisement, but it really is an excellent setup.
The DRAM disk has double SATA-300 ports and comes out with 0.0 ms average seek on most tests ;) Something for the Christmas stocking?
Answer 7:
Use https://wiki.archlinux.org/index.php/Ramdisk to make the RAM disk.
Then I wrote these scripts to move directories to and from the RAM disk. A backup is made in a tar file before moving into the RAM disk. The benefit of doing it this way is that the path stays the same, so none of your configuration files need to change. When you are done, use uramdir to bring the directory back to disk.
Edit: Added C code that runs any command it is given at an interval in the background. I am sending it tar with --update to update the archive if there are any changes.
I believe this general-purpose solution beats making a unique solution to something very simple. KISS
Make sure you change the path to rdbackupd.
ramdir
#!/bin/bash
# May need some more error checking for bad input.
# Convert relative path to absolute; stop if the directory does not exist.
# /bin/pwd gets the real path without symbolic links on my system and pwd
# keeps symbolic links. You may need to change it to suit your needs.
somedir=$(cd "$1" && /bin/pwd) || exit 1
somedirparent=$(dirname "$somedir")
# Back up the directory; stop if the backup fails
/bin/tar cf "$somedir.tar" "$somedir" || exit 1
# Copy; tried move like https://wiki.archlinux.org/index.php/Ramdisk
# suggests, but I got an error.
mkdir -p "/mnt/ramdisk$somedir"
/bin/cp -r "$somedir" "/mnt/ramdisk$somedirparent" || exit 1
# Remove the directory (only reached when the backup and copy succeeded)
/bin/rm -r "$somedir"
# Create symbolic link. It needs to be in the parent of the given folder.
/bin/ln -s "/mnt/ramdisk$somedir" "$somedirparent"
# Run updater
~/bin/rdbackupd "/bin/tar -uf $somedir.tar $somedir" &
uramdir
#!/bin/bash
# Convert relative path to absolute
# (somepath would probably make more sense as a name).
# pwd and not /bin/pwd so we get the symbolic path.
somedir=$(cd "$1" && pwd) || exit 1
# Remove symbolic link
rm "$somedir"
# Copy dir back; stop if the copy fails so the RAM copy is not lost
/bin/cp -r "/mnt/ramdisk$somedir" "$somedir" || exit 1
# Remove from ramdisk
/bin/rm -r "/mnt/ramdisk$somedir"
# Stop the updater
killall rdbackupd
rdbackupd.cpp
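#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

// Set by the SIGALRM handler. The command itself runs in main(),
// because system() is not safe to call from a signal handler.
static volatile sig_atomic_t timer_fired = 0;

static void on_alarm(int sig)
{
    (void)sig;
    timer_fired = 1;
}

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        fprintf(stderr, "rdbackupd: Need command to run\n");
        return 1;
    }
    const char* command = argv[1];

    struct itimerval it;
    it.it_value.tv_sec = 1;      // First run after 1 second (0 would disable the timer)
    it.it_value.tv_usec = 0;
    it.it_interval.tv_sec = 60;  // Then run every 60 seconds
    it.it_interval.tv_usec = 0;

    signal(SIGALRM, on_alarm);
    setitimer(ITIMER_REAL, &it, NULL); // Start

    // Sleep until a signal arrives instead of spinning in while(true),
    // which would keep one CPU core busy the whole time.
    for (;;)
    {
        pause();
        if (timer_fired)
        {
            timer_fired = 0;
            system(command);
        }
    }
    return 0;
}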
Answer 8:
Profile. Make sure you do good measurements of each option. You can even buy things you've already rejected, measure them, and return them, so you know you're working from good data.
Get a lot of RAM. 2 GB DIMMs are very cheap; 4 GB DIMMs are a little over US$100/ea, but that's still not a lot of money compared to what computer parts cost just a few years ago. Whether you end up with a RAM disk or just letting the OS do its thing, this will help. If you're running 32-bit Windows, you'll need to switch to 64-bit to make use of anything over 3 GB or so.
Live Mesh can synchronize from your local RAM drive to the cloud or to another computer, giving you an up-to-date backup.
Move just compiler outputs. Keep your source code on the real physical disk, but direct .obj, .dll, and .exe files to be created on the RAM drive.
Consider a DVCS. Clone from the real drive to a new repository on the RAM drive. "push" your changes back to the parent often, say every time all your tests pass.
Answer 9:
I had the same idea and did some research. I found the following tools that do what you are looking for:
- VSuite RAM disk
- DiskBoost
However, the second one I couldn't manage to get working on 64-bit Windows 7 at all, and it doesn't seem to be maintained at the moment.
The VSuite RAM disk, on the other hand, works very well. Unfortunately, I couldn't measure any significant performance boost compared to the SSD already in place.
Answer 10:
Yep, I've met the same problem. And after fruitless googling I just wrote a Windows Service for lazily backing up the RAM drive (actually, any folder, because a RAM drive can be mounted into, for example, the desktop).
http://bitbucket.org/xkip/transparentbackup
You can specify interval for full scan (default 5 minutes).
And an interval for scanning only notified files (default 30 seconds).
The scan detects changed files using the 'archive' attribute (the OS sets it whenever a file is modified, precisely for archiving purposes). Only files flagged that way are backed up.
The service leaves a special marker file to make sure that target backup is exactly a backup of the source. If the source is empty and does not contain a marker file, the service performs automatic restore from backup. So, you can easily destroy the RAM drive and create it again with automatic data restoration. It is better to use a RAM drive that is able to create a partition on system start up to make it work transparently.
Another solution that I've recently discovered is SuperSpeed SuperCache.
This company also has a RAM disk, but that is a separate product. SuperCache allows you to use extra RAM for block-level caching (which is very different from file caching), and, as another option, to mirror your drive to RAM completely. In any scenario you can specify how often to flush dirty blocks back to the hard disk drive, which makes writes behave as on a RAM drive; the mirror scenario also makes reads behave as on a RAM drive. You can create a small partition, for example 2 GB (using Windows), and map the entire partition to RAM.
One interesting and very useful thing about that solution: you can change caching and mirroring options at any time, instantly, with two clicks. For example, if you want your 2 GB back for gaming or a virtual machine, you can just stop mirroring instantly and release the memory. Even open file handles do not break - the partition continues to work, but as a usual drive.
EDIT: I also highly recommend you move the TEMP folder to the RAM drive, because compilers usually do a lot of work in temp. In my case it gave me another 30% of compilation speed.
Answer 11:
I wonder if you could build something like a software RAID 1 where you have a physical disk/partition as a member, and a chunk of RAM as a member.
I bet with a bit of tweaking and some really weird configuration one could get Linux to do this. I am not convinced that it would be worth the effort though.
Answer 12:
There are plenty RAMDrives around, use one of those. Sorry, that would be reckless.
Only if you work entirely in the RAM disc, which is silly..
Pseudo-ish shell script, ramMake:
# set up locations
ramdrive=/Volumes/ramspace
project=$HOME/code/someproject
# ..create ram drive..
# sync the contents of the project directory to the RAM drive
# (the trailing slash copies the contents, not the directory itself)
rsync -av "$project/" "$ramdrive/"
# build
cd "$ramdrive" || exit 1
make
# optional, copy the built data to the project directory:
rsync -a "$ramdrive/build/" "$project/build/"
That said, your compiler can possibly do this with no additional scripts. Just change your build output location to a RAM disk; for example, in Xcode it's under Preferences, Building, "Place Build Products in:" and "Place Intermediate Build Files in:".
Answer 13:
What can be super beneficial on even a single-core machine is parallel make. Disk I/O is a pretty large factor in the build process. Spawning two compiler instances per CPU core can actually increase performance. As one compiler instance blocks on I/O the other one can usually jump into the CPU intensive part of compiling.
You need to make sure you've got the RAM to support this (shouldn't be a problem on a modern workstation), otherwise you'll end up swapping and that defeats the purpose.
On GNU make you can just use -j[n] where [n] is the number of simultaneous processes to spawn. Make sure you have your dependency tree right before trying it though or the results can be unpredictable.
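As a rough illustration of the "two compiler instances per CPU core" rule mentioned above, the job count can be derived from the number of online cores. This is a sketch under assumptions: the function name `make_jobs` is invented, and `getconf _NPROCESSORS_ONLN` is widely available on Linux and macOS but not guaranteed everywhere, so the sketch falls back to one core when the query fails.

```shell
# Hypothetical helper: print a job count of two processes per CPU core.
make_jobs() {
    local cores
    # Ask the system for the number of online cores; assume 1 on failure.
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
    # Guard against empty or non-numeric output.
    case $cores in
        ''|*[!0-9]*) cores=1 ;;
    esac
    echo $((cores * 2))
}
```

A build would then be started with `make -j"$(make_jobs)"`. Whether a factor of two is actually the best choice depends on the project and the machine, so it is worth timing a few values.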
Another tool that's really useful (in the parallel make fashion) is distcc. It works a treat with GCC (if you can use GCC or something with a similar command line interface). distcc actually breaks up the compile task by pretending to be the compiler and spawning tasks on remote servers. You call it in the same way as you'd call GCC, and you take advantage of make's -j[n] option to call many distcc processes.
At one of my previous jobs we had a fairly intensive Linux operating system build that was performed almost daily for a while. Adding in a couple of dedicated build machines and putting distcc on a few workstations to accept compile jobs allowed us to bring build times down from a half a day to under 60 minutes for a complete OS + userspace build.
There are a lot of other tools for speeding up compiles. You might want to investigate more than just creating RAM disks, which look like they will give very little gain since the OS is already doing disk caching with RAM. OS designers spend a lot of time getting caching right for most workloads; they are (collectively) smarter than you, so I wouldn't like to try and do better than them.
If you chew up RAM for RAM disk, the OS has less working RAM to cache data and to run your code -> you'll end up with more swapping and worse disk performance than otherwise (note: you should profile this option before completely discarding it).
Answer 14:
This sounds like disk caching which your operating system and / or your hard drive will handle for you automatically (to varying degrees of performance, admittedly).
My advice is, if you don't like the speed of your drive, buy a high speed drive purely for compiling purposes. Less labor on your part and you might have the solution to your compiling woes.
Since this question was originally asked, spinning hard disks have become miserable tortoises when compared to SSDs. They are very close to the originally requested RAM disk in a SKU that you can purchase from Newegg or Amazon.
Answer 15:
Some ideas off the top of my head:
Use Sysinternals' Process Monitor (not Process Explorer) to check what goes on during a build - this will let you see if %temp% is used, for instance (keep in mind that response files are probably created with FILE_ATTRIBUTE_TEMPORARY which should prevent disk writes if possible, though). I've moved my %TEMP% to a RAM disk, and that gives me minor speedups in general.
Get a RAM disk that supports automatically loading/saving disk images, so you don't have to use boot scripts to do this. Sequential read/write of a single disk image is faster than syncing a lot of small files.
Place your often-used/large header files on the RAM disk, and override your compiler standard paths to use the RAM drive copies. It will likely not give that much of an improvement after first-time builds, though, as the OS caches the standard headers.
Keep your source files on your harddrive, and sync to the RAM disk - not the other way around. Check out MirrorFolder for doing realtime synchronization between folders - it achieves this via a filter driver, so only synchronizes what is necessary (and only does changes - a 4 KB write to a 2 GB file will only cause a 4 KB write to the target folder). Figure out how to make your IDE build from the RAM drive although the source files are on your harddisk... and keep in mind that you'll need a large RAM drive for large projects.
Answer 16:
The disk slowdown you incur is mainly write, and also possibly due to virus scanners. It can vary greatly between OSes too.
With the idea that writes are slowest, I would be tempted to set up a build where intermediate files (for example, .o files) and binaries get output to a different location such as a RAM drive.
You could then link this bin/intermediate folder to faster media (using a symbolic link or NTFS junction point).
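On a Unix-like system, the symbolic-link variant of this idea might be sketched as follows. The function name `link_build_dir` and the `<project>-<name>` naming of the target directory are assumptions made for the example; on Windows an NTFS junction would play the same role.

```shell
# Hypothetical helper: make <project>/<name> a symbolic link to a directory
# on faster media, e.g.: link_build_dir ~/code/someproject /mnt/ramdisk obj
link_build_dir() {
    local project=$1 fast_root=$2 name=${3:-build}
    local target="$fast_root/$(basename "$project")-$name"
    mkdir -p "$target" || return 1
    # Refuse to replace a real directory, which may still hold build output.
    if [ -e "$project/$name" ] && [ ! -L "$project/$name" ]; then
        echo "link_build_dir: $project/$name exists and is not a symlink" >&2
        return 1
    fi
    # -n: replace an existing link instead of creating a link inside it.
    ln -sfn "$target" "$project/$name"
}
```

The build system keeps writing to the same relative path, so no project settings need to change; only the place where the bytes end up is different. After a reboot the target directory on the RAM drive is gone, so the function would have to be run again before the next build.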
Answer 17:
My final solution to the problem is vmtouch: https://hoytech.com/vmtouch/
This tool locks the current folder into (ram) cache and vmtouch daemonizes into background.
sudo vmtouch -d -L ./
Put this in shell rc for fast access:
alias cacheThis='sudo vmtouch -d -L ./'
I searched for a ready made script for quite a while, because I didn't want to waste a lot of time on writing my own ramdisk-rsync-script. I'm sure I would have missed some edge cases, which would be quite unpleasant if important code was involved. And I never liked the polling approach.
Vmtouch seems like the perfect solution. In addition it doesn't waste memory like a fixed size ramdisk does.
I didn't do a benchmark, because 90% of my 1Gig source+build folder were already cached, but at least it feels faster ;)
Answer 18:
Just as James Curran says, because most programs follow the principle of locality of reference, the set of frequently used code and data pages will be narrowed over time to a manageable size by the OS disk cache.
RAM disks were useful when operating systems were built with limitations such as stupid caches (Win 3.x, Win 95, DOS). Today the RAM disk advantage is near zero, and if you assign it a lot of RAM it will take memory away from the system cache manager, hurting overall system performance. The rule of thumb is: let your kernel do that. This is the same as with "memory defragmentation" or "optimizer" programs: they actually force pages out of cache (so you get more free RAM for the moment), but cause the system to do a lot of page-faulting over time when your loaded programs begin to ask for code/data that was paged out.
So for more performance, get a fast disk I/O hardware subsystem, maybe RAID, faster CPU, better chipset (no VIA!), more physical RAM, etc.