What exactly makes the JVM (in particular, Sun's implementation) slow to get running compared to other runtimes like CPython? My impression was that it mainly has to do with a boatload of libraries getting loaded whether they're needed or not, but that seems like something that shouldn't take 10 years to fix.
Come to think of it, how does the JVM start time compare to the CLR on Windows? How about Mono's CLR?
UPDATE: I'm particularly concerned with the use case of small utilities chained together as is common in Unix. Is Java now suitable for this style? Whatever startup overhead Java incurs, does it add up for every Java process, or does the overhead only really manifest for the first process?
Just to note some solutions:
There are two mechanisms that allow faster JVM startup. The first is class data sharing, which HotSpot has offered since J2SE 5.0 (in Java 6 Update 21 it works only with the HotSpot Client VM, and only with the serial garbage collector, as far as I know).
To activate it, set the -Xshare JVM option (-Xshare:auto or -Xshare:on on HotSpot; IBM's J9 uses -Xshareclasses instead).
To read more about the feature you may visit: Class data sharing
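If you want to see whether class data sharing makes a difference on your machine, here is a rough sketch that launches a child JVM with `-Xshare:off` and with `-Xshare:auto` and prints the wall-clock time of each. The class name `ShareTiming` and its helper methods are made up for illustration, and it assumes a HotSpot JVM with a `bin/java` launcher under `java.home`. A single run is noisy, so treat the numbers as indicative only.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

public class ShareTiming {
    // Builds "<java.home>/bin/java <shareFlag> -version" (hypothetical helper).
    static List<String> buildCommand(String shareFlag) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        return Arrays.asList(javaBin, shareFlag, "-version");
    }

    // Starts a child JVM, waits for it to exit, and returns elapsed wall-clock milliseconds.
    static long timeChildJvm(String shareFlag) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(shareFlag));
        pb.redirectErrorStream(true); // "-version" writes to stderr; merge it into stdout
        long start = System.nanoTime();
        Process child = pb.start();
        // Drain the child's output so it can never block on a full pipe.
        try (InputStream out = child.getInputStream()) {
            byte[] buf = new byte[4096];
            while (out.read(buf) != -1) {
                // discard
            }
        }
        int exitCode = child.waitFor();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (exitCode != 0) {
            throw new IOException("child JVM exited with code " + exitCode);
        }
        return elapsedMs;
    }

    public static void main(String[] args) throws Exception {
        for (String flag : new String[] {"-Xshare:off", "-Xshare:auto"}) {
            System.out.println(flag + ": " + timeChildJvm(flag) + " ms");
        }
    }
}
```

Running it a few times and comparing the typical values gives a better picture than any single measurement, since the first launch also pays for a cold disk cache.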
The second mechanism is Java Quick Starter, which preloads classes during OS startup; see Java Quick Starter for more details.
It really depends on what you are doing during startup. A Hello World application takes 0.15 seconds to run on my machine.
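To get a number for your own machine, one option is to ask the JVM how long it has been up once `main` is reached. This is only a sketch: `StartupProbe` is a made-up name, and the figure covers the in-process part of startup only (not process creation or shutdown), so it will be lower than what an external timer reports. Loading the management classes also adds a little overhead of its own.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class StartupProbe {
    // Milliseconds the JVM has been running, as reported by the runtime MXBean.
    static long jvmUptimeMillis() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return runtime.getUptime();
    }

    public static void main(String[] args) {
        System.out.println("Hello World");
        System.out.println("JVM uptime at this point: " + jvmUptimeMillis() + " ms");
    }
}
```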
However, Java is better suited to running as a client or as a server/service, where startup time matters less than the connection time (about 0.025 ms) or the round-trip response time (<< 0.001 ms).
Here is what Wikipedia has to say on the issue (with some references).
It appears that most of the time is taken just loading data (classes) from disk (i.e. startup time is I/O bound).
In addition to the things already mentioned (loading classes, especially from compressed JARs; running in interpreted mode before HotSpot compiles commonly used bytecode; and HotSpot compilation overhead), there is also quite a bit of one-time initialization done by the JDK classes themselves. Many optimizations favor longer-running systems, where startup speed is less of a concern.
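The interpreted-then-compiled behaviour can usually be observed by timing the same work in repeated batches: the first batches tend to be slower than the later ones. The following is a crude sketch, not a proper benchmark; `WarmupDemo`, `work`, and the batch sizes are arbitrary choices for illustration, and results vary by JVM and machine.

```java
public class WarmupDemo {
    // Arbitrary CPU-bound work; the result is returned so it cannot be optimized away.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i * 31L) ^ (sum >>> 3);
        }
        return sum;
    }

    // Runs the same work in several batches and records the time of each batch in nanoseconds.
    static long[] timeBatches(int batches, int callsPerBatch) {
        long[] nanos = new long[batches];
        long sink = 0;
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                sink += work(1_000);
            }
            nanos[b] = System.nanoTime() - start;
        }
        if (sink == 42) {
            System.out.println(); // uses sink so the loop above stays live
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeBatches(10, 2_000);
        for (int b = 0; b < nanos.length; b++) {
            System.out.println("batch " + b + ": " + nanos[b] / 1_000 + " us");
        }
    }
}
```

A short-lived utility may exit before the compiler ever gets involved, so it pays the interpreter's cost for its whole lifetime.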
As for Unix-style pipelining: you certainly do NOT want to start and restart the JVM multiple times; that is not going to be efficient. Rather, the chaining of tools should happen within the JVM. This cannot easily be intermixed with non-Java Unix tools, except by starting such tools from within the JVM.
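As a rough illustration of chaining within one JVM, each "tool" can be a function from a stream of lines to a stream of lines, composed much like a shell pipeline. The names `InJvmPipeline`, `grep`, `sort`, and `uniq` are invented for this sketch and only mimic the Unix tools loosely.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InJvmPipeline {
    // Keeps only lines containing the given substring (a very loose "grep").
    static Function<Stream<String>, Stream<String>> grep(String needle) {
        return lines -> lines.filter(line -> line.contains(needle));
    }

    // Sorts lines in natural (lexicographic) order.
    static Function<Stream<String>, Stream<String>> sort() {
        return lines -> lines.sorted();
    }

    // Drops duplicate lines; after sort() this matches what Unix uniq would produce.
    static Function<Stream<String>, Stream<String>> uniq() {
        return lines -> lines.distinct();
    }

    // Equivalent in spirit to: grep needle | sort | uniq, but inside a single JVM.
    static List<String> run(List<String> input, String needle) {
        Function<Stream<String>, Stream<String>> pipeline =
                grep(needle).andThen(sort()).andThen(uniq());
        return pipeline.apply(input.stream()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("banana", "apple", "cherry", "apple", "kiwi");
        System.out.println(run(input, "a")); // prints [apple, banana]
    }
}
```

The startup cost is paid once for the whole chain instead of once per stage.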
There are a number of reasons, among them the jars to load.
I'm not sure about the CLR, but I think it is often faster because it caches a native version of assemblies for next time (so it doesn't need to JIT). CPython starts faster because it is an interpreter, and IIRC, doesn't do JIT.
All VMs with a rich type system, such as Java or the CLR, will not be instantaneous when compared to less rich systems such as those found in C or C++. This is largely because a lot is happening in the VM: many classes are initialized and are required by a running system. Snapshots of an initialized system do help, but it still costs time to load that image back into memory.
A simple Hello World-style one-liner class with a main method still requires a lot to be loaded and initialized. Verifying the class requires a lot of dependency checking and validation, all of which costs time and many CPU instructions. A C program, on the other hand, does none of this: it amounts to a few instructions and then a call to the print function.
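To see how much gets loaded for a trivial program, you can ask the JVM for its loaded-class count from inside a Hello World. This is a minimal sketch (`LoadedClasses` is a made-up name); the count differs between JVM versions, and querying the management bean loads a few extra classes itself, so the figure is slightly inflated.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class LoadedClasses {
    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        return classLoading.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Hello World");
        System.out.println("Classes loaded so far: " + loadedClassCount());
    }
}
```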