Consider the program:
import java.util.Arrays;

public class Test {
    public static void main(String[] args) {
        if (Arrays.asList(args).contains("--withFoo")) {
            use(new Foo());
        }
    }

    static void use(Foo foo) {
        // do something with foo
    }
}
Is Foo required in the runtime classpath if the program is launched without arguments?
Research
The Java Language Specification is rather vague about when linkage errors are reported:
This specification allows an implementation flexibility as to when linking activities (and, because of recursion, loading) take place, provided that the semantics of the Java programming language are respected, that a class or interface is completely verified and prepared before it is initialized, and that errors detected during linkage are thrown at a point in the program where some action is taken by the program that might require linkage to the class or interface involved in the error.
My tests indicate that LinkageErrors are only thrown when I actually use Foo:
$ rm Foo.class
$ java Test
$ java Test --withFoo
Exception in thread "main" java.lang.NoClassDefFoundError: Foo
at Test.main(Test.java:11)
Caused by: java.lang.ClassNotFoundException: Foo
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 1 more
Can this behaviour be relied upon? Or is there any mainstream JVM that links unused code? If so, how can I isolate unused code such that it is only linked if needed?
You need only small changes to your test code to answer that question.
Change the type hierarchy to
class Bar {}
class Foo extends Bar {}
and the program to
public class Test {
    public static void main(String[] args) {
        if (Arrays.asList(args).contains("--withFoo")) {
            use(new Foo());
        }
    }

    static void use(Bar foo) {
        // don't need actual code
    }
}
Now, the program will fail with an error if Foo is absent, even before entering the main method (with HotSpot). The reason is that the verifier needs the definition of Foo to check whether passing it to a method expecting Bar is valid.
HotSpot takes a short-cut, not loading the type if the types are an exact match or if the target type is java.lang.Object, where the assignment is always valid. That's why your original code does not throw early when Foo is absent.
The bottom line is that the exact point in time at which an error is thrown is implementation-dependent; for example, it may depend on the actual verifier implementation. All that is guaranteed is, as you already cited, that an attempt to perform an action that requires linkage will throw previously detected linkage errors. But it is perfectly possible that your program never gets far enough to make that attempt.
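One way to act on this, sketched under assumptions: keep every bytecode reference to Foo inside a separate helper class, so that verifying the main class never needs Foo's definition. The names LazyLinkDemo, FooBridge and bridgeInitialized are hypothetical, and Bar and Foo are nested stand-ins so that the sketch runs as a single file; in a real setup Foo would be a top-level class that may be missing. The flag only observes class initialization, which the JLS guarantees to be lazy. When FooBridge is linked and verified remains implementation-dependent, although HotSpot appears to defer it until the class is first used.

```java
import java.util.Arrays;

class LazyLinkDemo {
    // Stand-ins for the real types; in practice Foo may be absent at runtime.
    static class Bar {}
    static class Foo extends Bar {}

    // Set by FooBridge's static initializer, so we can observe when it is initialized.
    static boolean bridgeInitialized = false;

    // Hypothetical helper: the only class whose bytecode mentions Foo.
    // The Foo-to-Bar assignability check is therefore part of verifying
    // FooBridge, not part of verifying LazyLinkDemo.
    static class FooBridge {
        static {
            bridgeInitialized = true;
        }

        static String run() {
            return use(new Foo());
        }
    }

    static String use(Bar bar) {
        return "used " + bar.getClass().getSimpleName();
    }

    // Touches FooBridge only when the flag is present.
    static String run(String... args) {
        if (Arrays.asList(args).contains("--withFoo")) {
            return FooBridge.run();
        }
        return "skipped";
    }

    public static void main(String[] args) {
        System.out.println(run(args));
    }
}
```

Launched without arguments, this prints "skipped" and leaves bridgeInitialized false; with --withFoo it prints "used Foo". Whether a missing Foo would stay unnoticed until that point still depends on the JVM, so treat this as a mitigation rather than a guarantee.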
I guess something like this is undefined (sort of; see the bottom). We know how it works for the Oracle VM, but that is an implementation detail of the VM. A VM could also choose to load all classes right away.
You can find this in the VM spec (emphasis mine):
Linking a class or interface involves verifying and preparing that class or interface, its direct superclass, its direct superinterfaces, and its element type (if it is an array type), if necessary. Resolution of symbolic references in the class or interface is an optional part of linking.
This specification allows an implementation flexibility as to when linking activities (and, because of recursion, loading) take place...
And further down:
The Java Virtual Machine instructions anewarray, checkcast, getfield, getstatic, instanceof, invokedynamic, invokeinterface, invokespecial, invokestatic, invokevirtual, ldc, ldc_w, multianewarray, new, putfield, and putstatic make symbolic references to the run-time constant pool. Execution of any of these instructions requires resolution of its symbolic reference.
Resolution is the process of dynamically determining concrete values from symbolic references in the run-time constant pool.
The line use(new Foo()); compiles to:
14: new #5 // class Foo
17: dup
18: invokespecial #6 // Method Foo."<init>":()V
21: invokestatic #7 // Method use:(LFoo;)V
So these would require the resolution of Foo, but nothing else in the program will.
However, it also states (appended to an example, which is why I missed it at first):
Whichever strategy is followed, any error detected during resolution must be thrown at a point in the program that (directly or indirectly) uses a symbolic reference to the class or interface.
So while an error may be found with resolution when the Test class is loaded, the error will only be thrown when the faulty symbolic reference is actually used.
I have to say that in your circumstances, I'd be sorely tempted to use reflection, together with an interface that is always present, to bypass the issue entirely. Something along the lines of:
// This may or may not be present
package path.to.foo;

import path.to.my.code.IFoo;

public class Foo implements IFoo {
    public void doFooStuff() {
        ...
    }
}

// This is always present
package path.to.my.code;

public interface IFoo {
    public void doFooStuff();
}

// Foo may or may not be present at runtime, but this always compiles
package path.to.my.code;

import java.util.Arrays;

public class Test {
    public static void main(String[] args) throws Exception {
        if (Arrays.asList(args).contains("--withFoo")) {
            // Look Foo up by name so that Test carries no symbolic reference to it
            Class<?> fc = Class.forName("path.to.foo.Foo");
            IFoo foo = (IFoo) fc.getDeclaredConstructor().newInstance();
            use(foo);
        }
    }

    static void use(IFoo foo) {
        // do something with foo
    }
}
[EDIT] I know this doesn't directly answer the question, but it seems like a better solution than the direction you are heading in.
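If the class may really be absent, the reflective lookup also needs a fallback. The following is a minimal single-file sketch of that idea, not a definitive implementation: OptionalFooLoader and load are hypothetical names, IFoo and Foo are nested stand-ins for the types above, and doFooStuff returns a String here only so that the result can be printed.

```java
import java.util.Optional;

class OptionalFooLoader {
    // Always-present interface (stand-in for IFoo above).
    interface IFoo {
        String doFooStuff();
    }

    // Stand-in implementation; in the real setup this class may be missing.
    static class Foo implements IFoo {
        public String doFooStuff() {
            return "foo stuff done";
        }
    }

    // Looks the class up by name and returns an empty Optional if it is
    // missing, cannot be linked, does not implement IFoo, or cannot be
    // instantiated through a no-argument constructor.
    static Optional<IFoo> load(String className) {
        try {
            Class<? extends IFoo> type = Class.forName(className).asSubclass(IFoo.class);
            return Optional.of(type.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // In a real setup this would be a literal such as "path.to.foo.Foo".
        String present = OptionalFooLoader.class.getName() + "$Foo";
        System.out.println(load(present).map(IFoo::doFooStuff).orElse("Foo unavailable"));
        System.out.println(load("path.to.foo.Missing").map(IFoo::doFooStuff).orElse("Foo unavailable"));
    }
}
```

Because the class name appears only as a string, the calling class holds no symbolic reference to Foo, so neither the verifier nor resolution can stumble over it. Catching LinkageError as well covers the case where Foo is present but one of its own dependencies is not.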