Writing a detector to search for uses of “System.o

2019-02-13 22:46发布

问题:

I am trying to write a bug detector to find instances of the method call "System.out.println" using Findbugs.

I understand that "System.out.println" in bytecode is compiled to a call to GETSTATIC, which pushes "System.out" onto the stack. A call to INVOKEVIRTUAL pops "System.out" off the stack and calls the method.

I have prepared some code (found below) which finds the correct GETSTATIC and INVOKEVIRTUAL calls, but have been unable to link the two together. I suspect I may need to use OpcodeStack in some way, but am having trouble in understanding how I can use it. Any help would be appreciated.

    @Override 
    public void sawOpcode(int seen) { 
            // if opcode is getstatic 
            if (seen == GETSTATIC) { 
                    String clsName = getClassConstantOperand(); 
                    if ("java/lang/System".equals(clsName)) { 
                            String fldName = getNameConstantOperand(); 
                            if ("out".equals(fldName)) { 
                                    System.out.println("SYSTEM.OUT here"); 
                            } 
                    } 
            } 

            // if opcode is invokevirtual 
            if (seen == INVOKEVIRTUAL) { 
                    String cls = getDottedClassConstantOperand(); 
                    if ("java.io.PrintStream".equals(cls)) { 
                            String methodName = getNameConstantOperand(); 
                            if ("println".equals(methodName)) { 
                                    bugReporter.reportBug(new BugInstance("SYSTEM_OUT_PRINTLN", 
                                                    NORMAL_PRIORITY).addClassAndMethod(this) 
                                                    .addSourceLine(this)); 
                            } 
                    } 
            } 

    }

回答1:

I found that, for my use case, it is enough to determine that System.out or System.err are used at all - in 99% of cases, these will be used for calling .print or .println later in the block. My detector detects GET_STATIC opcodes that load System.err or System.out. The code is below, showing 3 alternatives of determining that this occurs.

package my.findbugs.detectors.forbiddencalls;

import org.apache.log4j.Logger; // it is not trivial to use a logger with FindBugs in Eclipse, leave it out if there are problems

import my.findbugs.detectors.util.DetectorUtil;

import edu.umd.cs.findbugs.BugInstance;
import edu.umd.cs.findbugs.BugReporter;
import edu.umd.cs.findbugs.bcel.OpcodeStackDetector;
import edu.umd.cs.findbugs.classfile.ClassDescriptor;
import edu.umd.cs.findbugs.classfile.FieldDescriptor;


public class CallToSystemOutPrintlnDetector2 extends OpcodeStackDetector {

    private static final Logger LOGGER = Logger.getLogger(CallToSystemOutPrintlnDetector2.class);
    private BugReporter bugReporter;


    public CallToSystemOutPrintlnDetector2(BugReporter bugReporter) {
        super();
        this.bugReporter = bugReporter;
        LOGGER.debug("Instantiated.");
    }


    public void sawOpcode(int seen) {

        // find occurrences of:  
        //2:   getstatic       #54; //Field java/lang/System.out:Ljava/io/PrintStream;
//2:   getstatic       #54; //Field java/lang/System.out:Ljava/io/PrintStream;

        if (seen == GETSTATIC){

            try {
//              LOGGER.debug(operand); // static java.lang.System.out Ljava/io/PrintStream;
//              LOGGER.debug(operand.getClass()); // class edu.umd.cs.findbugs.classfile.analysis.FieldInfo
//              LOGGER.debug(operand.getName()); // err
//              LOGGER.debug(operand.getClassDescriptor()); // java/lang/System
//              LOGGER.debug(operand.getSignature()); // Ljava/io/PrintStream;

                FieldDescriptor operand = getFieldDescriptorOperand();
                ClassDescriptor classDescriptor = operand.getClassDescriptor();
                if ("java/lang/System".equals(classDescriptor.getClassName()) && 
                        ("err".equals(operand.getName())||"out".equals(operand.getName()))) {
                    reportBug();
                }
            } catch (Exception e) {
                //ignore
            }

            // could be used
//          try {
//              MethodDescriptor operand = getMethodDescriptorOperand();
//              LOGGER.debug(operand); // java.lang.System.outLjava/io/PrintStream;
//              LOGGER.debug(operand.getClass()); // class edu.umd.cs.findbugs.classfile.MethodDescriptor
//              LOGGER.debug(operand.getName()); // err 
//              LOGGER.debug(operand.getClassDescriptor()); // java/lang/System
//              LOGGER.debug(operand.getSignature()); // Ljava/io/PrintStream;
//          } catch (Exception e) {
//              //ignore
//          }

            // could be used
//          try {
//              String operand = getRefConstantOperand();
//              LOGGER.debug(operand); // java.lang.System.out : Ljava.io.PrintStream;
//              if (operand != null && (
//                  operand.startsWith("java.lang.System.out :") || operand.startsWith("java.lang.System.err :"))) {
//                  reportBug();
//              }
//          } catch (Exception e) {
//              //ignore
//          }
        }
    }

    private void reportBug(){
        this.bugReporter.reportBug(getBugInstance());
    }


    private BugInstance getBugInstance() {
        return new BugInstance(this, "MY_CALL_TO_SYSTEM_OUT_BUG", DetectorUtil.MY_PRIORITY)
            .addClassAndMethod(this)
            .addSourceLine(this);
    }

}


回答2:

Your task is a bit more complicated than it seems. A simple case:

System.out.println("abc");

Is translated into a simple bytecode as well:

getstatic   #2; //java/lang/System.out
ldc #3; //String abc
invokevirtual   #4; //Calling java/io/PrintStream.println(String)

However if you are trying to print anything except simple constant/known value it gets harder:

int x = 42;
System.out.println(x + 17);

Will be translated to:

bipush  42
istore_1  //x = 42
getstatic   #2; //java/lang/System.out
iload_1  //x
bipush  17
iadd  //x + 17 on the stack
invokevirtual   #5; //Calling java/io/PrintStream.println(int)

But wait, it can get worse:

System.out.println("x times 27 is " + x * 27);

What? StringBuilder: ?

new #6; //new java/lang/StringBuilder()
dup
invokespecial   #7; //Calling java/lang/StringBuilder()
ldc #8; //String x times 2 is
invokevirtual   #9; //Calling java/lang/StringBuilder.append(String)
iload_1  //x
bipush  27
imul  //x * 27 on the stack
invokevirtual   #10; //Calling java/lang/StringBuilder.append:(int) with 'x * 27' argument
invokevirtual   #11; //Calling java/lang/StringBuilder.toString:()
invokevirtual   #4; //Calling java/io/PrintStream.println(String)

Interestingly, the original code was translated to (which is a known Java 5 (?) optimization):

System.out.println(
  new StringBuilder().
    append("x times 27 is ").
    append(x * 27).
    toString()
  );

Solution

So indeed - you will need a stack and you'll have to keep track of every push/pop operation as defined in bytecode instruction. A lot of work for such a simple task...

But if you go this path, solving the problem is quite simple: when you encounter INVOKEVIRTUAL the top of the stack should contain some value and the value below the top should be a "java/lang/System.out".

That being said I'm 100% sure Findbugs already implemented this and probably you can use some FindBugs API to make your life easier.



回答3:

Use the class OpcodeStack.

When you see a GETSTATIC, and you realize you have 'out', set the user value on the generated OpcodeStack.Item to Boolean.TRUE. to do this do

try {
     //process opcodes
} finally {
    super.sawOpcode(seen);
    if (pseudocode-saw System.out.println) {
        OpcodeStack.Item item = stack.getStackItem(0);
        item.setUserValue(Boolean.TRUE);
}

then when you process the println call, look at the tos, and if the user value is set to Boolean.TRUE, you know you are in the state to report the bug.



回答4:

One solution I came up with in the past finds System.out.println calls where there is "not too much calculation inside the parentheses", i.e. where a maximum of MAX_WILDCARDS instructions lie between them.

I extended ByteCodePatternDetector, my pattern being the following:

ByteCodePattern pattern = new ByteCodePattern();
// as this is a pattern, I cannot specify here that this is supposed to be System.out
pattern.add(new Load(SYSO_FIELD, "sysoResult")); 
pattern.add(new Wild(MAX_WILDCARDS)); 
pattern.add(new Invoke("java.io.PrintStream", "println", "/.*", Invoke.INSTANCE, null).label(LABEL_PRINT));

I later make sure the byte codes for the Load and Invoke operation are the correct ones, and compare the class and field name I retrieve from match.getBindingSet().lookup(...) to make sure it is System.out being called.

I have seen the existing answers, and I consider changing my solution. I just added this for the sake of completeness.