Retrieve method source code from class source code

Posted 2019-08-03 04:46

I have a String that contains the source code of a class, and another String that contains the full name of a method in this class. The method name is, e.g.:

public void (java.lang.String test)

Now I want to retrieve the source code of this method from the string with the class's source code. How can I do that? With String#indexOf(methodName) I can find the start of the method's source code, but how do I find the end?

====EDIT====

I used the count curly-braces approach:

internal void retrieveSourceCode()
{
    int startPosition = parentClass.getSourceCode().IndexOf(this.getName());
    if (startPosition != -1)
    {
        String subCode = parentClass.getSourceCode().Substring(startPosition, parentClass.getSourceCode().Length - startPosition);

        for (int i = 0; i < subCode.Length; i++)
        {
            String c = subCode.Substring(0, i);
            int open = c.Split('{').Count() - 1;
            int close = c.Split('}').Count() - 1;

            if (open == close && open != 0)
            {
                sourceCode = c;
                break;
            }
        }
    }
    Console.WriteLine("SourceCode for " + this.getName() + "\n" + sourceCode);
}

This works more or less fine. However, it fails if a method is defined without a body. Any hints on how to solve that?

Tags: java parsing
3 answers
闹够了就滚
#2 · 2019-08-03 05:35

You will probably have to know the order in which the methods are listed in the code file. That way, you can look for the method's closing brace }, which should be the last one before the start of the next method.

So your code might look like:

nStartOfMethod = sourceCode.indexOf(methodName)
nStartOfNextMethod = sourceCode.indexOf(nextMethodName)

Then use lastIndexOf(yourMethodTerminator /* probably a } */, ...) to find the last terminator between nStartOfMethod and nStartOfNextMethod.

In this case, if you don't know the order of the methods, you might end up skipping over a method in between while looking for the ending brace.
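The idea above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name NextMethodBoundary and the helper extractUpToNextMethod are hypothetical, both names are assumed to occur in the source in declaration order, and no '}' is assumed to appear between the end of the method and the next method's name (a Javadoc comment or annotation containing a brace would break it).

```java
final class NextMethodBoundary {
    /**
     * Hypothetical helper: returns the text from the first occurrence of
     * methodName up to and including the last '}' found before
     * nextMethodName, or null if that cannot be determined.
     */
    static String extractUpToNextMethod(String source, String methodName, String nextMethodName) {
        int start = source.indexOf(methodName);
        if (start < 0) {
            return null; // method name not found
        }
        // Look for the next method only after the current method's name.
        int nextStart = source.indexOf(nextMethodName, start + methodName.length());
        if (nextStart < 0) {
            return null; // next method name not found
        }
        // lastIndexOf searches backwards, starting at nextStart.
        int end = source.lastIndexOf('}', nextStart);
        if (end < start) {
            return null; // no closing brace in between, e.g. a method without a body
        }
        return source.substring(start, end + 1);
    }
}
```

Calling extractUpToNextMethod(classSource, "void first()", "void second()") would return the declaration of first() including its closing brace, provided the assumptions above hold.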

做个烂人
#3 · 2019-08-03 05:44

Counting braces and stopping when the count decreases to 0 is indeed the way to go. Of course, you need to take into account braces that appear as literals and should thus not be counted, e.g. braces in comments and strings.

Overall this is kind of a thankless endeavour, comparable in complexity to, say, building a command-line parser, if you want it to work really reliably. If you know you can get away with it, you could cut some corners and just count all the braces, although I do not recommend it.

Update:

Here's some sample code to do the brace counting. As I said, this is a thankless job and there are tons of details you have to get right (in essence, you're writing a mini-lexer). It's in C#, as this is the closest to Java I can write code in with confidence.

The code below is not complete and probably not 100% correct (for example, I believe verbatim strings in C# do not allow spaces between the @ and the opening quote, but I'm not sure whether I know that for a fact).

// sourceCode is a string containing all the source file's text
var sourceCode = "...";

// methodStartIndex is the index of the char AFTER the opening brace
// of the method we are interested in
var methodStartIndex = 42;

var openBraces = 1;
var insideLiteralString = false;
var insideVerbatimString = false;
var insideBlockComment = false;
var lastChar = ' '; // White space is ignored by the C# parser,
                    // so a space is a good "neutral" character

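// i is declared outside the loop so that it is still in scope afterwards
var i = methodStartIndex;
for (; openBraces > 0 && i < sourceCode.Length; ++i) {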
    var ch = sourceCode[i];

    switch (ch) {
        case '{':
            if (!insideBlockComment && !insideLiteralString && !insideVerbatimString) {
                ++openBraces;
            }
            break;
        case '}':
            if (!insideBlockComment && !insideLiteralString && !insideVerbatimString) {
                --openBraces;
            }
            break;
        case '"':
            if (insideBlockComment) {
                break; // leave the switch only, so lastChar is still updated
            }
            if (insideLiteralString) {
                // "Step out" of the string unless this quote is escaped
                // (naive check: a quote right after an escaped backslash is misread)
                insideLiteralString = lastChar == '\\';
            }
            else if (insideVerbatimString) {
                // If this quote is part of a two-quote pair, do NOT step out
                // (it means the string contains a literal quote)

                // This can throw, but only for source files with syntax errors.
                // I'm ignoring this possibility here...
                var nextCh = sourceCode[i + 1]; 

                if (nextCh == '"') {
                    ++i; // skip that next quote
                }
                else {
                    insideVerbatimString = false;
                }
            }
            else {
                if (lastChar == '@') {
                    insideVerbatimString = true;
                }
                else {
                    insideLiteralString = true;
                }
            }
            break;
        case '/':
            if (insideLiteralString || insideVerbatimString) {
                break; // leave the switch only, so lastChar is still updated
            }

            // TODO: parse this
            // It can start a line comment, if followed by /
            // It can start a block comment, if followed by *
            // It can end a block comment, if preceded by *

            // Line comments are intended to be handled by just incrementing i
            // until you see a CR and/or LF, hence no insideLineComment flag.
            break;
    }

    lastChar = ch;
}

// Once openBraces reaches 0, i is the index just past the method's closing brace, so
// sourceCode.Substring(methodStartIndex, i - methodStartIndex - 1) is the method body
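
Since the question is tagged java, here is a rough Java sketch of the same brace-counting idea with the comment handling filled in. It is an illustration, not a complete lexer: the names MethodBodyScanner and findMethodEnd are hypothetical, scanning is assumed to start at the method's signature (before its opening brace), and Java text blocks (""") get no special treatment. It also returns -1 when a ';' is reached before any '{', which is one way to detect a method declared without a body.

```java
final class MethodBodyScanner {
    /**
     * Hypothetical helper: scans forward from 'from' and returns the index
     * just past the closing brace of the first balanced { ... } block.
     * Returns -1 if a ';' appears before any '{' (no method body) or if the
     * braces never balance.
     */
    static int findMethodEnd(String src, int from) {
        int depth = 0;
        int n = src.length();
        int i = from;
        while (i < n) {
            char ch = src.charAt(i);
            if (ch == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                // Line comment: skip to the start of the next line.
                int eol = src.indexOf('\n', i);
                i = (eol < 0) ? n : eol + 1;
                continue;
            }
            if (ch == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                // Block comment: skip past the closing "*/".
                int close = src.indexOf("*/", i + 2);
                i = (close < 0) ? n : close + 2;
                continue;
            }
            if (ch == '"' || ch == '\'') {
                // String or char literal: skip to the matching unescaped quote.
                i++;
                while (i < n && src.charAt(i) != ch) {
                    if (src.charAt(i) == '\\') {
                        i++; // skip the escaped character as well
                    }
                    i++;
                }
                i++; // step past the closing quote
                continue;
            }
            if (ch == '{') {
                depth++;
            } else if (ch == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1; // index just past the closing brace
                }
                if (depth < 0) {
                    return -1; // closing brace without a matching opening brace
                }
            } else if (ch == ';' && depth == 0) {
                return -1; // declaration ends before any body starts
            }
            i++;
        }
        return -1; // input ended before the braces balanced
    }
}
```

With start = source.indexOf(methodName), a result end >= 0 means source.substring(start, end) is the method's text including its closing brace.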
【Aperson】
#4 · 2019-08-03 05:44

Have a look at: Parser for C#

It recommends using NRefactory to parse and tokenise source code. You should be able to use that to navigate your class source and pick out methods.
