PMD tells me:
A switch with less than 3 branches is inefficient, use a if statement instead.
Why is that? Why 3? How do they define efficiency?
Because a switch statement is compiled to one of two special JVM instructions, lookupswitch or tableswitch. They are useful when working with a lot of cases, but they add overhead when you have just a few branches.
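An if/else statement is instead compiled into a typical chain of compare-and-jump instructions (je, jne, ... or, in JVM bytecode, instructions such as IF_ICMPNE). Such a chain is faster for a few branches but requires many more comparisons when the chain of branches is long.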
You can see the difference by looking at the bytecode. In any case, I wouldn't worry about these issues: if anything becomes a problem, the JIT will take care of it.
Practical example:
switch (i)
{
case 1: return "Foo";
case 2: return "Baz";
case 3: return "Bar";
default: return null;
}
is compiled into:
L0
LINENUMBER 21 L0
ILOAD 1
TABLESWITCH
1: L1
2: L2
3: L3
default: L4
L1
LINENUMBER 23 L1
FRAME SAME
LDC "Foo"
ARETURN
L2
LINENUMBER 24 L2
FRAME SAME
LDC "Baz"
ARETURN
L3
LINENUMBER 25 L3
FRAME SAME
LDC "Bar"
ARETURN
L4
LINENUMBER 26 L4
FRAME SAME
ACONST_NULL
ARETURN
While
if (i == 1)
return "Foo";
else if (i == 2)
return "Baz";
else if (i == 3)
return "Bar";
else
return null;
is compiled into:
L0
LINENUMBER 21 L0
ILOAD 1
ICONST_1
IF_ICMPNE L1
L2
LINENUMBER 22 L2
LDC "Foo"
ARETURN
L1
LINENUMBER 23 L1
FRAME SAME
ILOAD 1
ICONST_2
IF_ICMPNE L3
L4
LINENUMBER 24 L4
LDC "Baz"
ARETURN
L3
LINENUMBER 25 L3
FRAME SAME
ILOAD 1
ICONST_3
IF_ICMPNE L5
L6
LINENUMBER 26 L6
LDC "Bar"
ARETURN
L5
LINENUMBER 28 L5
FRAME SAME
ACONST_NULL
ARETURN
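The switch above yields a TABLESWITCH because its case labels are dense. As a rough sketch of when the other instruction shows up, the class below pairs a dense switch with a sparse one. The class name `SparseSwitchDemo` and its method names are made up for illustration. Disassembling it with `javap -c` should show a tableswitch for the first method and, typically, a lookupswitch for the second, though the exact choice is up to the compiler.

```java
// Hypothetical demo class (names are illustrative, not from the original post).
// Compile with `javac SparseSwitchDemo.java`, then run `javap -c SparseSwitchDemo`
// to inspect the bytecode of each method.
class SparseSwitchDemo {

    // Dense labels (1, 2, 3): javac typically emits a tableswitch,
    // which indexes directly into a table of jump targets.
    static String dense(int i) {
        switch (i) {
            case 1: return "Foo";
            case 2: return "Baz";
            case 3: return "Bar";
            default: return null;
        }
    }

    // Widely spaced labels: a table would be mostly empty, so javac
    // typically emits a lookupswitch, which searches sorted key/target pairs.
    static String sparse(int i) {
        switch (i) {
            case 1: return "Foo";
            case 1000: return "Baz";
            case 1000000: return "Bar";
            default: return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(dense(2));
        System.out.println(sparse(1000));
    }
}
```

Note that `javap` prints a different textual format from the listings above, which look like the output of an ASM-based bytecode viewer, but the instructions are the same.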
Although there are minor efficiency gains when using a switch compared to using an if-statement, those gains would be negligible under most circumstances. And any source code scanner worth its salt would recognize that micro-optimizations are secondary to code clarity.
They are saying that an if statement is both simpler to read and takes up fewer lines of code than a switch statement if the switch is sufficiently short.
From the PMD website:
TooFewBranchesForASwitchStatement: Switch statements are intended to be used to support complex branching behaviour. Using a switch for only a few cases is ill-advised, since switches are not as easy to understand as if-then statements. In these cases use the if-then statement to increase code readability.
Why is that?
Different sequences of instructions are used when the code is (finally) compiled to native code by the JIT compiler. A switch is implemented by a sequence of native instructions that perform an indirect branch. (The sequence typically loads an address from a table and then branches to that address.) An if / else is implemented as instructions that evaluate the condition (probably a compare instruction) followed by a conditional branch instruction.
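The two strategies can be mimicked in plain Java to make the difference concrete. This is only a conceptual sketch: Java has no raw code addresses, so an array of results stands in for the table of branch targets, and the class `DispatchSketch` and its methods are invented for illustration. It does not show what the JIT actually emits.

```java
// Conceptual sketch only (hypothetical names): models the shape of the
// two dispatch strategies, not the native code the JIT generates.
class DispatchSketch {

    // Stand-in for the table of branch targets: slot k serves case k + 1.
    private static final String[] TABLE = {"Foo", "Baz", "Bar"};

    // Switch-like dispatch: one range check, then one indexed load,
    // regardless of how many cases the table holds.
    static String tableDispatch(int i) {
        int index = i - 1;
        if (index < 0 || index >= TABLE.length) {
            return null; // default case
        }
        return TABLE[index];
    }

    // if/else-like dispatch: one compare and conditional branch per case,
    // so the work grows with the number of cases tested before a match.
    static String chainDispatch(int i) {
        if (i == 1) return "Foo";
        if (i == 2) return "Baz";
        if (i == 3) return "Bar";
        return null;
    }
}
```

The table version pays its fixed cost (range check plus indexed load) even for a single case, which is the overhead the rule is presumably alluding to.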
Why 3?
It is an empirical observation, I assume based on analysing the generated native code instructions and/or benchmarking. (Or possibly not. To be absolutely sure, you would need to ask the author(s) of that PMD rule how they derived that number.)
How do they define efficiency?
Time taken to execute the instructions.
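If you want to measure that time yourself, a naive harness along the following lines gives a first impression. It is a rough sketch with invented names (`NaiveTiming`, `viaSwitch`, `viaIf`, `run`), and hand-rolled timing loops on the JVM are easily distorted by JIT warm-up and dead-code elimination, so a proper harness such as JMH would be the better tool for any serious comparison.

```java
// Naive timing sketch (hypothetical names). Results vary by JVM and machine
// and should not be read as a reliable benchmark.
class NaiveTiming {

    static int viaSwitch(int i) {
        switch (i) {
            case 1: return 10;
            case 2: return 20;
            default: return 0;
        }
    }

    static int viaIf(int i) {
        if (i == 1) return 10;
        else if (i == 2) return 20;
        else return 0;
    }

    // Runs one variant `iterations` times and returns a checksum, which
    // makes it harder for the JIT to discard the work as unused.
    static long run(boolean useSwitch, int iterations) {
        long sum = 0;
        for (int n = 0; n < iterations; n++) {
            int arg = n % 3;
            sum += useSwitch ? viaSwitch(arg) : viaIf(arg);
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = 50_000_000;

        // Warm-up passes so both variants are likely JIT-compiled before timing.
        run(true, iterations);
        run(false, iterations);

        long t0 = System.nanoTime();
        long a = run(true, iterations);
        long t1 = System.nanoTime();
        long b = run(false, iterations);
        long t2 = System.nanoTime();

        System.out.println("switch:  " + (t1 - t0) / 1_000_000 + " ms (checksum " + a + ")");
        System.out.println("if/else: " + (t2 - t1) / 1_000_000 + " ms (checksum " + b + ")");
    }
}
```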
I'd personally take issue with this rule ... or more precisely with the message. I think it should say that an if / else statement is simpler and more readable than a switch with 2 cases. The efficiency issue is secondary, and probably irrelevant.
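To illustrate the readability point, here is the same two-way decision written both ways. The class `TwoBranchDemo` and its methods are made up for this comparison, and I have not checked exactly how PMD counts branches, so treat the switch version simply as an example of a very short switch.

```java
// Hypothetical comparison (names are illustrative): the same logic as a
// short switch and as an if/else.
class TwoBranchDemo {

    // A switch with two branches: one case label plus default.
    static String withSwitch(int i) {
        switch (i) {
            case 1:
                return "Foo";
            default:
                return null;
        }
    }

    // The equivalent if/else states the single condition directly.
    static String withIf(int i) {
        if (i == 1) {
            return "Foo";
        } else {
            return null;
        }
    }
}
```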
I believe it has to do with the way that a switch and an if/else compile down.
Say it takes 5 computations to process a switch statement, and 2 computations for each branch of an if statement. With fewer than 3 options, the ifs cost at most 2 * 2 = 4 computations versus 5 for the switch. However, the overhead of a switch remains constant, so with 3 choices the ifs cost 3 * 2 = 6 computations versus still 5 for the switch.
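That back-of-the-envelope model can be written out as a few lines of code. The costs of 5 and 2 are the made-up figures from the paragraph above, not measurements, and the class and method names are invented for the sketch.

```java
// Toy cost model using the illustrative figures from the text
// (5 per switch, 2 per if branch). These are assumptions, not measurements.
class CostModel {

    static final int SWITCH_COST = 5;        // assumed fixed cost of a switch
    static final int IF_COST_PER_BRANCH = 2; // assumed cost of each if test

    // Cost of an if/else chain grows linearly with the number of branches.
    static int ifCost(int branches) {
        return IF_COST_PER_BRANCH * branches;
    }

    // Cost of a switch is taken as constant, whatever the branch count.
    static int switchCost(int branches) {
        return SWITCH_COST;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " branches: if = " + ifCost(n)
                    + ", switch = " + switchCost(n));
        }
    }
}
```

Under these assumed numbers the if chain is cheaper up to two branches and the switch wins from three onward, which is one way to arrive at a threshold of 3.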
The gains, even over millions of computations, are negligible. It's more a matter of "this is the better way to do it" than anything that might affect you. It would only matter for a function that is called millions of times in quick succession.