Sub-processes in Java are very expensive. Each process is usually supported by a number of threads:
- a thread to host the process (as of JDK 1.6 on Linux)
- a thread to read/print/ignore the input stream
- another thread to read/print/ignore the error stream
- one more thread to handle timeouts, monitoring, and killing the sub-process from your application
- the business logic thread, which blocks until the sub-process returns.
The number of threads gets out of control if you have a pool of threads forking sub-processes to do tasks. As a result, there may be more than double the number of concurrent threads at peak.
In many cases, we fork a process just because nobody is able to write JNI to call a native function missing from the JDK (e.g. chmod, ln, ls), or to trigger a shell script, and so on.
Some threads can be saved, but some must run to prevent the worst case (a buffer overrun on the input stream).
How can I reduce the overhead of creating sub-processes in Java to the minimum? I am thinking of using NIO on the stream handles, combining and sharing threads, lowering background thread priority, and re-using processes. But I have no idea whether these are possible.
Are you able to use JNA to write native calls?
See the answer to "How do I programmatically change file permissions?" for a good example of chmod.
Much easier than JNI, and much faster than sub-processes!
NIO won't work, since when you create a process you can only access its streams (InputStream and OutputStream), not a Channel.
You can have 1 thread read multiple InputStreams.
Something like,
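A minimal sketch of the single-reader idea, assuming polling with InputStream.available() so that no read ever blocks. The class name StreamDrainer and its methods are hypothetical, chosen only for illustration, and the sketch simply discards what it reads:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: one thread drains the output of many processes
// by polling InputStream.available(), so a quiet process never stalls the others.
final class StreamDrainer implements Runnable {
    private final List<InputStream> streams = new CopyOnWriteArrayList<InputStream>();
    private final AtomicLong drained = new AtomicLong();
    private volatile boolean running = true;

    void add(InputStream in) { streams.add(in); }
    void remove(InputStream in) { streams.remove(in); }
    void stop() { running = false; }
    long bytesDrained() { return drained.get(); }

    @Override
    public void run() {
        byte[] buf = new byte[4096];
        while (running) {
            boolean idle = true;
            for (InputStream in : streams) {
                try {
                    int n = in.available();          // does not block
                    if (n > 0) {
                        int read = in.read(buf, 0, Math.min(n, buf.length));
                        if (read > 0) {
                            drained.addAndGet(read); // data is discarded, only counted
                            idle = false;
                        }
                    }
                } catch (IOException e) {
                    streams.remove(in);              // stream was closed; stop polling it
                }
            }
            if (idle) {
                // Nothing to read anywhere: back off briefly instead of spinning.
                try { Thread.sleep(10); } catch (InterruptedException e) { return; }
            }
        }
    }
}
```

Register each process's stream with add(process.getInputStream()) and call remove(...) once that process has finished. The trade-off is a small polling delay in exchange for one reader thread instead of one per process.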
You can pair the above class with another class that waits for the Processes to complete, something like,
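One way such a waiting class could look, as a sketch only: a single thread polls Process.exitValue() for every registered process and destroys any that pass their deadline. ProcessWatcher, Listener, and the convention of reporting -1 for a killed process are all assumptions made for this example:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: one thread watches many processes and kills
// those that run past their deadline.
final class ProcessWatcher implements Runnable {
    // Called once per process; exitCode is -1 when the process was killed
    // on timeout (a convention chosen for this sketch).
    interface Listener {
        void finished(Process p, int exitCode, boolean timedOut);
    }

    private final Map<Process, Long> deadlines = new ConcurrentHashMap<Process, Long>();
    private final Listener listener;
    private volatile boolean running = true;

    ProcessWatcher(Listener listener) { this.listener = listener; }

    void watch(Process p, long timeoutMillis) {
        deadlines.put(p, System.currentTimeMillis() + timeoutMillis);
    }

    void stop() { running = false; }

    @Override
    public void run() {
        while (running) {
            for (Map.Entry<Process, Long> e : deadlines.entrySet()) {
                Process p = e.getKey();
                try {
                    int code = p.exitValue();        // throws while still running
                    deadlines.remove(p);
                    listener.finished(p, code, false);
                } catch (IllegalThreadStateException stillRunning) {
                    if (System.currentTimeMillis() > e.getValue()) {
                        p.destroy();                 // past the deadline: kill it
                        deadlines.remove(p);
                        listener.finished(p, -1, true);
                    }
                }
            }
            try { Thread.sleep(50); } catch (InterruptedException ie) { return; }
        }
    }
}
```

With this shape, the business logic thread does not need to block per process; it can react in the listener callback instead.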
As well, you don't need 1 thread for each error stream if you use ProcessBuilder.redirectErrorStream(true), and you don't need 1 thread for the process's standard input (Process.getOutputStream() on the Java side): you can simply ignore it if you are not writing anything to it.
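A short sketch of the merged-stream approach, assuming a Unix-like system for the usage example. The class name MergedOutput is hypothetical; ProcessBuilder.redirectErrorStream(true) is the standard API call mentioned above:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper: runs a command and reads stdout and stderr
// through a single stream, on the calling thread.
final class MergedOutput {
    static String run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);        // stderr is merged into stdout
        Process p = pb.start();
        p.getOutputStream().close();         // we never write to the child's stdin

        StringBuilder out = new StringBuilder();
        BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        } finally {
            reader.close();
        }
        p.waitFor();                         // the stream hit EOF, so this returns promptly
        return out.toString();
    }
}
```

For example, MergedOutput.run("sh", "-c", "echo out; echo err 1>&2") returns text containing both lines, with no reader thread at all.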
I have created an open source library that allows non-blocking I/O between Java and your child processes. The library provides an event-driven callback model. It depends on the JNA library to use platform-specific native APIs, such as epoll on Linux, kqueue/kevent on Mac OS X, or I/O Completion Ports on Windows.
The project is called NuProcess and can be found here:
https://github.com/brettwooldridge/NuProcess
You don't need any extra threads to run a subprocess in Java, although handling timeouts does complicate things a bit:
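A minimal sketch of what this could look like, assuming the calling thread polls both the output stream and the exit status. PollingRunner and its run method are hypothetical names, and the output is simply discarded:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a subprocess with a timeout using only the
// calling thread, by polling instead of blocking.
final class PollingRunner {
    // Returns the exit code, or throws TimeoutException after killing the child.
    static int run(long timeoutMillis, String... command)
            throws IOException, InterruptedException, TimeoutException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);            // only one stream to drain
        Process p = pb.start();
        p.getOutputStream().close();             // nothing to send to the child
        InputStream in = p.getInputStream();
        byte[] buf = new byte[4096];
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (true) {
                // Drain pending output so the child never blocks on a full pipe.
                int n;
                while ((n = in.available()) > 0) {
                    in.read(buf, 0, Math.min(n, buf.length));
                }
                try {
                    return p.exitValue();        // throws while still running
                } catch (IllegalThreadStateException stillRunning) {
                    // fall through and keep polling
                }
                if (System.currentTimeMillis() > deadline) {
                    throw new TimeoutException("process exceeded " + timeoutMillis + " ms");
                }
                Thread.sleep(20);
            }
        } finally {
            p.destroy();                         // kills the child if it is still alive
        }
    }
}
```

The cost of this design is the polling interval: the caller notices output or exit up to 20 ms late, in exchange for using no helper threads.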
You could also play with reflection, as in "Is it possible to read from a InputStream with a timeout?", to get a NIO FileChannel from Process.getInputStream(), but then you'd have to worry about different JDK versions in exchange for getting rid of the polling.
JDK7 will address this issue and provide new API redirectOutput/redirectError in ProcessBuilder to redirect stdout/stderr.
However, the bad news is that they forgot to provide a "Redirect.toNull", which means you will need to do something like "if (*nix) /dev/null else if (win) NUL".
Unbelievable that a NIO/2 API for Process is still missing; but I think redirectOutput plus NIO2's AsynchronousChannel will help.
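The redirect-to-null idea can be sketched as follows, assuming JDK 7 or later. DiscardOutput and nullDevice are hypothetical names, and the OS check is a simple heuristic rather than a definitive test. (Later JDKs, from Java 9 on, add ProcessBuilder.Redirect.DISCARD, which makes the per-OS branch unnecessary.)

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper: runs a command with all of its output sent to the
// platform's null device, so no thread has to drain it.
final class DiscardOutput {
    // Assumption: anything that is not Windows has /dev/null.
    static File nullDevice() {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        return new File(windows ? "NUL" : "/dev/null");
    }

    static int run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);                                 // fold stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.to(nullDevice()));  // then drop both
        Process p = pb.start();
        return p.waitFor();                                           // safe: nothing can fill a pipe
    }
}
```

Because the child writes straight to the null device, waitFor() cannot deadlock on a full pipe buffer, which removes the need for the two stream-reading threads from the question.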
Since you mention chmod, ln, ls, and shell scripts, it sounds like you're trying to use Java for shell programming. If so, you might want to consider a different language that is better suited to that task such as Python, Perl, or Bash. Although it's certainly possible to create subprocesses in Java, interact with them via their standard input/output/error streams, etc., I think you will find a scripting language makes this kind of code less verbose and easier to maintain than Java.