The “Why” behind PMD's rules

2019-02-05 15:35发布

问题:

Is there a good resource which describes the "why" behind PMD rule sets? PMD's site has the "what" - what each rule does - but it doesn't describe why PMD has that rule and why ignoring that rule can get you in trouble in the real world. In particular, I'm interested in knowing why PMD has the AvoidInstantiatingObjectsInLoops and OnlyOneReturn rules (the first seems necessary if you need to create a new object corresponding to each object in a collection, the second seems like it is a necessity in many cases that return a value based on some criteria), but what I'm really after is a link somewhere describing the "why" behind a majority of PMD's rules, since this comes up often enough.

Just to be clear, I know that I can disable these and how to do that, I'm just wondering why they are there in the first place. Sorry if there's something obvious I missed out there, but I did a Google search and SO search before posting this. I also understand that these issues are often a matter of "taste" - what I'm looking for is what the argument for the rules are and what alternatives there are. To give a concrete example, how are you supposed to implement one object corresponding to every object in a loop (which is a common operation in Java) without instantiating each object in a loop?

回答1:

In each case, the rule can be a matter of specific circumstances or just "taste".

Instantiating an Object in a loop should be avoided if there are a large number of iterations and the instantiation is expensive. If you can move the code out of the loop, you will avoid many object instantiations, and therefore improve performance. Having said that, this isn't always possible, and in some cases it just doesn't matter to the overall performance of the code. In these cases, do whichever is clearer.

For OnlyOneReturn, there are several ways to view this (with vehement supporters behind each), but they all basically boil down to taste.

For your example, the OnlyOneReturn proponents want code like:

public int performAction(String input) {
    int result;
    if (input.equals("bob")) {
        result = 1;
    } else {
        result = 2;
    }
    return result;
}

Rather than:

public int performAction(String input) {
    if (input.equals("bob")) {
        return 1;
    } else {
        return 2;
    }
}

As you can see, the additional clarity of ReturnOnlyOnce can be debated.

Also see this SO question that relates to instantiation within loops.



回答2:

This article, A Comparison of Bug Finding Tools for Java, "by Nick Rutar, Christian Almazan, and Jeff Foster, compares several bug checkers for Java..."—FindBugs Documents and Publications. PMD is seen to be rather more verbose.

Addendum: As the authors suggest,

"all of the tools choose different tradeoffs between generating false positives and false negatives."

In particular, AvoidInstantiatingObjectsInLoops may not be a bug at all if that is the intent. It's included to help Avoid creating unnecessary objects. Likewise OnlyOneReturn is suggestive in nature. Multiple returns represent a form of goto, sometimes considered harmful, but reasonably used to improve readability.

My pet peeve is people who mandate the use of such tools without understanding the notion of false positives.

As noted here, more recent versions of PMD support improved customization when integrated into the build process.



回答3:

You can look at the PMD-homepage, the rules are explained here in detail and often with a why. The site is structured for the rules-groups, here the link to basic-rules: http://pmd.sourceforge.net/rules/basic.html



回答4:

Each rule is in a PMD Rule Set, which can give you a clue to the reasoning behind the rule (if it isn't explained in detail on the Rule Set page itself).

In the case of AvoidInstantiatingObjectsInLoops, it can be expensive to instantiate a similar object again and again. However it is frequently necessary. On my own project, I have disable this rule, since it is flagging too many false positives.

In the case of OnlyOneReturn, note that it is in a Rule Set called Controversial, which is a hint that these rules are debatable, and depend on the case. I have disabled this entire Rule Set as well.