Regex to match a variable declaration in Java

Posted 2020-07-10 04:56

I want to parse a variable declaration statement and get the variable name. This is what I have:

String var = "private   String   ipaddress;";

I am using the regex pattern below to match the above string:

.*private\\s+([a-z]*)\\s+([a-z0-9_]*);

It does not work; it reports that no match was found. Can anyone please help?

Tags: java regex
5 answers
Viruses.
#2 · 2020-07-10 05:44

Since a variable declaration in Java can have more than three words before the variable name, I would suggest that you do not limit your search and use this instead:

String var = "private   String   ipaddress;";
//String var2 = "private static final int test=13;";

Pattern p = Pattern.compile(".+\\s(.+?)(;|=)");
Matcher m = p.matcher(var);

while(m.find()){
    System.out.println(m.group(1));
}

It looks for a name that is preceded by whitespace and followed by either ";" or "=". This is a more general way to search for the variable name.

EDIT: This one actually got me thinking, since the following is also a legal declaration in Java:

private
static
volatile
String
s , t1 = "";

The following could probably be improved, as it was written quickly.

import java.util.LinkedList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableNames {

    public static void main(String[] args) {
        String var0 = "private static final int test,test2;";
        String var1 = "private \n static \n final \n int \n testName \n =\n   5 \n";
        String var2 = "private \n static \n final \n String \n testName \n =\n  \" aaa           = bbbb   \" \n";
        String var3 = "private \n static \n final \n String \n testName,testName2 \n =\n  \" aaa           = bbbb   \" \n";

        String var4 = "int i;";
        String var5 = "String s ;";
        String var6 = "final String test ;  ";
        String var7 = "public int go = 23;";
        String var8 = "public static final int value,valu2 ; ";
        String var9 = "public static final String t,t1,t2 = \"23\";";
        String var10 = "public \n static \n final \n String s1,s2,s3 = \" aaa , bbb, fff, = hhh = , kkk \";";
        String var11 = "String myString=\"25\"";

        LinkedList<String> input = new LinkedList<String>();
        input.add(var0); input.add(var1); input.add(var2); input.add(var3); input.add(var4); input.add(var5);
        input.add(var6); input.add(var7); input.add(var8); input.add(var9); input.add(var10);
        input.add(var11);

        LinkedList<String> result = parametersNames(input);
        for (String param : result) {
            System.out.println(param);
        }
    }

    // Returns the variable names declared in each input statement.
    private static LinkedList<String> parametersNames(LinkedList<String> input) {
        LinkedList<String> result = new LinkedList<String>();
        for (String var : input) {
            // Join multi-line declarations into a single line.
            var = var.replace("\n", "").trim();

            Matcher m;
            if (var.contains("=")) {
                // Drop the initializer; the names are the last token before the first "=".
                var = var.substring(0, var.indexOf("=")).trim();
                m = Pattern.compile(".+\\s(.+)$").matcher(var);
            } else {
                // No initializer; the names are the last token before ";".
                m = Pattern.compile(".+\\s(.+?)(;|=)").matcher(var);
            }

            if (m.find()) {
                // One declaration may list several comma-separated names.
                for (String token : m.group(1).split(",")) {
                    result.add(token.trim());
                }
            }
        }
        return result;
    }
}
#3 · 2020-07-10 05:46

Have a look at Checkstyle regex patterns for naming conventions (types, methods, packages etc). More info here.

一夜七次
#4 · 2020-07-10 05:50

First of all, remove that dot from the beginning of the regex, since it requires a character before "private" for a match.

Second, your regex is case sensitive and won't match the capital S in "String". Either use [a-zA-Z] or make the expression case insensitive ((?i) at the start, IIRC).

Btw, [a-zA-Z0-9_] would be the same as \w.

Another thing: your expression would also catch illegal variable names as well as miss legal ones. Variables are not allowed to start with a number but they could also contain dollar signs. Thus the name expression should be something like ([a-zA-Z_$][\w$]*) meaning the first character must be a letter, underscore or dollar sign followed by any number of word characters or dollar signs.
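As a minimal sketch of the name expression described above: the class and method names here (IdentifierCheck, looksLikeIdentifier) are made up for illustration. The pattern is ASCII-only, so it is stricter than Java itself, which also accepts many Unicode letters in identifiers.

```java
import java.util.regex.Pattern;

public class IdentifierCheck {
    // First character: letter, underscore or dollar sign.
    // Remaining characters: word characters (\w) or dollar signs.
    private static final Pattern NAME = Pattern.compile("[a-zA-Z_$][\\w$]*");

    // Hypothetical helper: true if the whole string matches the name expression.
    public static boolean looksLikeIdentifier(String s) {
        return NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("ipaddress")); // true
        System.out.println(looksLikeIdentifier("$tmp_1"));    // true
        System.out.println(looksLikeIdentifier("1abc"));      // false, starts with a digit
    }
}
```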

A last note: depending on what you do with those declarations, keep in mind that you might have to check for those reserved words. The adjusted expression would still match "private String private", for example.
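One possible way to do the reserved-word check is sketched below. It assumes the declaration always starts with private, as in the question, and uses javax.lang.model.SourceVersion.isKeyword from the JDK; the class and method names (ReservedWordCheck, fieldName) are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.lang.model.SourceVersion;

public class ReservedWordCheck {
    // Group 1 is the type, group 2 is the candidate variable name.
    private static final Pattern DECL =
            Pattern.compile("private\\s+(\\w+)\\s+([a-zA-Z_$][\\w$]*)\\s*;");

    // Hypothetical helper: returns the declared name, or null if nothing
    // matches or the captured name is a Java keyword.
    public static String fieldName(String declaration) {
        Matcher m = DECL.matcher(declaration);
        if (!m.find()) {
            return null;
        }
        String name = m.group(2);
        return SourceVersion.isKeyword(name) ? null : name;
    }

    public static void main(String[] args) {
        System.out.println(fieldName("private String ipaddress;")); // ipaddress
        System.out.println(fieldName("private String private;"));   // null
    }
}
```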

Another last note: keep in mind that there might be more modifiers than private for a variable, e.g. public, protected, static etc. - or none at all.
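A rough sketch of how optional modifiers could be handled: the modifier list and the names ModifierAwareMatch and variableName are assumptions for illustration, and the type is still restricted to a single \w+ token, so generics and arrays are not covered.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModifierAwareMatch {
    // Zero or more modifiers, then a type (group 1), then the name (group 2),
    // terminated by ";" or "=".
    private static final Pattern DECL = Pattern.compile(
            "^\\s*(?:(?:public|protected|private|static|final|transient|volatile)\\s+)*"
            + "(\\w+)\\s+([a-zA-Z_$][\\w$]*)\\s*[;=]");

    // Hypothetical helper: returns the variable name, or null if there is no match.
    public static String variableName(String declaration) {
        Matcher m = DECL.matcher(declaration);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(variableName("private   String   ipaddress;"));     // ipaddress
        System.out.println(variableName("public static final int value = 1;")); // value
        System.out.println(variableName("int i;"));                             // i
    }
}
```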

Edit:

Now that you have the asterisk after the first dot, that shouldn't be a problem for your special case. However, a dot matches almost any character and thus would match fooprivate as well. Depending on what you want to achieve either remove the dot or add a \s+ after the .*.

聊天终结者
#5 · 2020-07-10 05:50

.*private\\s+(\\w*)\\s+(\\w*);
Use this pattern. [a-z] matches only lowercase letters, but "String" in your text starts with an uppercase S. \\w is a word character; it is the same as [a-zA-Z0-9_].
It seems that your text will look like "private <type> <field name>;". If so, the type can contain uppercase and lowercase letters, digits or underscores, so \\w is a good solution.
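To see the difference, the sketch below runs both the pattern from the question and the \\w version against the question's string. The class and helper names (FieldNameDemo, fieldName) are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldNameDemo {
    // Hypothetical helper: returns capture group 2 (the field name), or null if no match.
    public static String fieldName(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String var = "private   String   ipaddress;";

        // Original pattern: [a-z]* cannot consume the uppercase S in "String".
        System.out.println(fieldName(".*private\\s+([a-z]*)\\s+([a-z0-9_]*);", var)); // null

        // \w also covers uppercase letters, so the type and the name both match.
        System.out.println(fieldName(".*private\\s+(\\w*)\\s+(\\w*);", var)); // ipaddress
    }
}
```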

爷的心禁止访问
#6 · 2020-07-10 05:56

You should use this regex:

^(?s)\\s*private\\s+(\\w+)\\s+(\\w+)\\s*;\\s*$

This will make sure to match:

  • Uppercase and lowercase letters in the type and the name (the keyword private itself must be lowercase)
  • Multi-line declarations
  • Whitespace at the start, at the end and in the middle
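The anchored pattern above can be exercised with a short sketch like this one; AnchoredMatch and fieldName are hypothetical names. In this pattern the line breaks are consumed by \\s, which matches newline characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchoredMatch {
    private static final Pattern DECL =
            Pattern.compile("^(?s)\\s*private\\s+(\\w+)\\s+(\\w+)\\s*;\\s*$");

    // Hypothetical helper: returns the field name (group 2), or null if no match.
    public static String fieldName(String declaration) {
        Matcher m = DECL.matcher(declaration);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Declaration spread over several lines, with extra whitespace.
        System.out.println(fieldName("  private\n String\n ipaddress ;\n")); // ipaddress

        // The keyword itself is matched literally, so uppercase PRIVATE is rejected.
        System.out.println(fieldName("PRIVATE String x;")); // null
    }
}
```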