Why does the JVM allow us to name a function starting with a digit in bytecode?

Identifiers are well defined by The Java Language Specification, Java SE 7 Edition (§3.8):

An identifier is an unlimited-length sequence of Java letters and Java digits, the
first of which must be a Java letter.

As far as I know, since a method name is an identifier, it should be impossible to name a method starting with a digit in Java, and javac respects this rule.

So why does the Java Virtual Machine seem not to respect this rule, allowing a method name that starts with a digit in bytecode?
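The character rule quoted above can be checked with the standard `Character` methods. This is only a minimal sketch: the class and method names (`IdentifierCheck`, `looksLikeIdentifier`) are made up for illustration, and the check deliberately ignores the fact that keywords and literals such as `class` or `true` are not identifiers either.

```java
public class IdentifierCheck {
    // Returns true if s follows the character rule of JLS §3.8:
    // a Java letter first, then only Java letters or Java digits.
    // Assumption: keywords and literals ("class", "true", ...) are NOT rejected here.
    public static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk the remaining code points, which may be letters or digits.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("f99")); // true
        System.out.println(looksLikeIdentifier("99"));  // false: starts with a digit
    }
}
```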

This simple snippet will actually print the f99() method name and the value of its parameter.

public class Test {
    public static void main(String[] args) {
        Test t = new Test();
        System.out.println(t.f99(100));
    }

    public int f99(int i){
        // Print the name of the currently executing method, as recorded in the class file.
        System.out.println(Thread.currentThread().getStackTrace()[1].getMethodName());
        return i;
    }
}

Compilation and execution:

$ javac Test.java
$ java Test

Output:

f99
100

Once the code is compiled, it is possible to disassemble it and rename every occurrence of f99 to 99 (with the help of a tool like reJ). Running the modified class:

$ java Test

Output:

99
100

So, is the name of the method actually "99"?

Tags: java jvm bytecode identifier

2 Answers

傲

#2 · 2020-03-26 07:37

The Java Language Specification restricts the characters in valid method names so as to help make parsing the Java language unambiguous.

The JVM was designed to support languages other than just Java. As such, its restrictions should not be the same as Java's, unless we wanted to force every non-Java language to share them. The restrictions chosen for the JVM are the minimal set that permits unambiguous parsing of method signatures, a format that appears in the JVM spec and not in the JLS.

Taken from the JVM spec:

a name must not contain any of the ASCII characters . ; [ / < > :

For example, [Lcom/foo/Bar; is a valid JVM signature, and the special characters it uses have been excluded from method names.

The characters < and > were further reserved to distinguish special JVM methods from application methods, specifically <init> and <clinit>, neither of which the JLS permits as a method name.
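As a rough illustration, the restriction described above could be written as a small check. This sketch only mirrors the characters listed in the excerpt quoted in this answer and treats <init> and <clinit> as the two permitted special names; the class and method names are hypothetical, and rejecting the empty name is an extra assumption.

```java
public class JvmMethodName {
    // Characters taken from the JVM spec excerpt quoted in this answer.
    private static final String FORBIDDEN = ".;[/<>:";

    // Returns true if the name passes the JVM-level restriction sketched here.
    public static boolean isValidJvmMethodName(String name) {
        // The two special method names are the only ones allowed to use < and >.
        if (name.equals("<init>") || name.equals("<clinit>")) {
            return true;
        }
        // Assumption: an empty name is rejected.
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidJvmMethodName("99"));     // true: digits are fine for the JVM
        System.out.println(isValidJvmMethodName("f99"));    // true
        System.out.println(isValidJvmMethodName("<init>")); // true: special method name
        System.out.println(isValidJvmMethodName("a/b"));    // false: contains '/'
    }
}
```

Note that nothing in this check looks at whether the first character is a digit, which is the whole difference from the Java language rule.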


霸刀☆藐视天下

#3 · 2020-03-26 07:47

So, is the name of the method actually "99"?

Real programmers don't use parsers, they use sed:

javac Test.java
sed -i 's/\d003f99/\d00299/' Test.class
java Test

Output:

99
100

This works because the method name is stored in the constant pool as plain text in a Utf8 entry, and the JVMS says that Utf8 entries have the form:

CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

so we had something like:

01 | 00 03 | 'f' '9' '9'

(identifier 3 bytes long) and the sed command replaced 03 | 'f' '9' '9' with 02 | '9' '9' (now 2 bytes long).
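For readers without GNU sed, the same byte-level idea can be sketched in Java. This is only an illustration under some assumptions: the names `Utf8EntryPatch`, `utf8Entry` and `replaceFirst` are made up, the encoder handles ASCII names only, and the naive search could in principle match the same byte pattern somewhere other than the intended constant pool entry.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8EntryPatch {
    // Encodes an ASCII name as a CONSTANT_Utf8_info entry: tag | u2 length | bytes.
    public static byte[] utf8Entry(String asciiName) {
        byte[] text = asciiName.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(1);                   // tag 1 = CONSTANT_Utf8
        out.write(text.length >>> 8);   // u2 length, high byte
        out.write(text.length & 0xFF);  // u2 length, low byte
        out.write(text, 0, text.length);
        return out.toByteArray();
    }

    // Replaces the first occurrence of 'from' in 'data' with 'to'.
    // Returns 'data' unchanged if 'from' is not found.
    public static byte[] replaceFirst(byte[] data, byte[] from, byte[] to) {
        for (int i = 0; i + from.length <= data.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + from.length), from)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                out.write(data, 0, i);
                out.write(to, 0, to.length);
                int rest = i + from.length;
                out.write(data, rest, data.length - rest);
                return out.toByteArray();
            }
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] before = utf8Entry("f99");
        System.out.println(Arrays.toString(before)); // [1, 0, 3, 102, 57, 57]
        byte[] after = replaceFirst(before, utf8Entry("f99"), utf8Entry("99"));
        System.out.println(Arrays.toString(after));  // [1, 0, 2, 57, 57]
    }
}
```

To patch a real class file, the same `replaceFirst` call would be applied to the bytes read from Test.class and the result written back. Shrinking the entry by one byte is harmless here because the constant pool is parsed entry by entry rather than by fixed offsets.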

I later checked with javap -v Test.class that sed did what I wanted it to do. Before:

#18 = Utf8               f99

After:

#18 = Utf8               99

Having manually edited, run, decompiled and compared the .class to the JVMS, I can only conclude that the method name must be 99 :-)

So it's just a Java language restriction not present in bytecode.

Why does Java prevent such names?

Likely to make the syntax look like C.

Not starting with digits makes it easier to differentiate identifiers from integer literals for both humans and parsers.
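A toy sketch of why this helps a parser: with the rule in place, the first character alone tells a lexer whether it is looking at a number or a name. The class `FirstCharLexer` and its labels are invented for illustration and bear no relation to how javac is actually implemented.

```java
public class FirstCharLexer {
    // Classifies a token by looking only at its first character.
    public static String classify(String token) {
        if (token.isEmpty()) {
            return "EMPTY";
        }
        char first = token.charAt(0);
        if (first >= '0' && first <= '9') {
            return "NUMBER";      // e.g. 99, 99L, 0x1F
        }
        if (Character.isJavaIdentifierStart(first)) {
            return "IDENTIFIER";  // e.g. f99, _tmp
        }
        return "OTHER";
    }

    public static void main(String[] args) {
        System.out.println(classify("f99"));  // IDENTIFIER
        System.out.println(classify("99"));   // NUMBER
        System.out.println(classify("99L"));  // NUMBER
        System.out.println(classify("0x1F")); // NUMBER
    }
}
```

If identifiers could start with a digit, tokens such as 99L or 0x1F would be ambiguous between a literal and a name, and one character of lookahead would no longer be enough.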
