How can JVM implementations like Jython and JRuby

Posted 2019-02-18 09:09

Question:

I was watching this video here, where Robert Nicholson discusses P8, an implementation of PHP on the JVM. At some point he mentions that they aim to surpass native PHP in performance some time in the future.

He mentions JRuby and Jython, which started out slower than their native counterparts but eventually surpassed them. Quercus, another PHP interpreter on the JVM, claims to be 4x faster than mod_php and is also worth noting.

Does that mean that the general idea that the JVM is slower than C is wrong, or are there flaws in the original C implementations?

Answer 1:

> Does that mean that the general idea that the JVM is slower than C is wrong, or are there flaws in the original C implementations?

A bit of both.

The JVM has been around for a long time and has made significant progress in efficiency. Its garbage collection, JIT compilation, caching and other areas are more advanced than those of 'reference' implementations such as PHP's.

Anyone taking a look under the hood of PHP will understand why efficiency gains are easy to achieve.

I am personally doubtful that the JVM can outperform CPython, but I could be wrong ... and it seems I am: this is down to the JVM's GC being faster, and the same goes for IronPython. Part of the performance improvement may also come from not relying on the C call stack, as in Stackless Python. The Jython site states:

> Jython is approximately as fast as CPython--sometimes faster, sometimes slower. Because most JVMs--certainly the fastest ones--do [JIT compilation], long running, hot code will run faster over time.

Which I can appreciate as fact: the JVM will approach C performance levels once its caches are populated and hot code has been compiled, which largely negates the overhead of the higher-level parts of the VM implementation (a large part of which is written in C anyway).

Many interpreted languages such as PHP and Python are largely bridges to equivalent C calls that dive into machine code. In the JVM, the JIT compiler performs a similar function by reducing the bytecode to machine-code equivalents. Eventually, the intermediate representations (the high-level syntax and the bytecode) are usually reduced to CPU operations at C speed or faster anyway, so it is all the same, just with more intermediate steps, and those only affect how long newly loaded code takes to reach full efficiency. There comes a point, once the code is in RAM, where you ask "what is the real difference?" The answer is only the process that gets it there, plus the final representation, which determines the speed of stack winding and unwinding, garbage collection, register usage, and logic such as arithmetic.
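To make the "intermediate representation" step a little more concrete, here is a minimal sketch using CPython's standard `dis` module. It only shows that a Python function is turned into bytecode before it runs; the function `add_squares` is a made-up example, and nothing here says anything about how PHP or the JVM lay out their own bytecode.

```python
import dis


def add_squares(a, b):
    # Plain Python source: the "high-level syntax" layer.
    return a * a + b * b


# CPython compiles the function body to bytecode when the function is defined.
code = add_squares.__code__
print(type(code.co_code))  # the raw bytecode is stored as a bytes object

# Human-readable listing of the bytecode instructions.
dis.dis(add_squares)

# Opcode names differ between CPython versions, so none are hard-coded here.
opnames = [ins.opname for ins in dis.get_instructions(add_squares)]
print(len(opnames), "instructions")
```

CPython then interprets these instructions one by one, whereas a JIT-based VM would go a step further and translate the frequently executed ones into machine code.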



Answer 2:

It's not too hard. If you write your implementation in C, you have to write your own GC, JIT and more (to be fast and efficient). To do that really well, you need really smart people with a lot of experience, and you need to give them a lot of time.

I will go out on a limb here and say that the current implementation of PHP is not state of the art (this is based not on knowledge of its inner workings but on the benchmarks I have seen and on what people who know more about PHP have told me). Facebook is trying to address this, but in an uncommon way (because of their special needs and the typical use of PHP; see http://www.stanford.edu/class/ee380/Abstracts/100505.html).

Summary: if somebody implements PHP in Java (or on any fast VM), they don't need to write a super GC or JIT to be fast, "only" a compiler (which can be simple).
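As a rough analogy for the "only a compiler" point, here is a toy sketch in Python. The tuple-based expression language and the function names are invented for illustration, and CPython stands in for the fast host VM (it is not the JVM and has nothing to do with how Quercus or P8 work). The front end just translates the toy language into something the host already knows how to run, and the host's own compiler, memory management and runtime are reused as they are.

```python
import ast

# Hypothetical toy language, encoded as nested tuples:
#   ("num", 3), ("var", "x"), ("add", left, right), ("mul", left, right)
BINARY_OPS = {"add": ast.Add, "mul": ast.Mult}


def to_python_ast(expr):
    """Translate one toy expression into the host VM's own AST nodes."""
    kind = expr[0]
    if kind == "num":
        return ast.Constant(value=expr[1])
    if kind == "var":
        return ast.Name(id=expr[1], ctx=ast.Load())
    if kind in BINARY_OPS:
        return ast.BinOp(
            left=to_python_ast(expr[1]),
            op=BINARY_OPS[kind](),
            right=to_python_ast(expr[2]),
        )
    raise ValueError(f"unknown expression kind: {kind!r}")


def compile_for_host_vm(expr):
    """The whole 'compiler': build a host AST and hand it to the host."""
    tree = ast.Expression(body=to_python_ast(expr))
    ast.fix_missing_locations(tree)
    return compile(tree, "<toy>", "eval")


# 2 * x + 1, with x supplied by the caller at run time.
code = compile_for_host_vm(("add", ("mul", ("num", 2), ("var", "x")), ("num", 1)))
result = eval(code, {"x": 20})
print(result)  # 41
```

Everything after `compile_for_host_vm` returns (execution, memory management, any optimization) is the host's job, which is the point of targeting an existing VM.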



Answer 3:

There are some hints about what the virtual machine does here. For example, it looks like the Java Virtual Machine first checks which parts of the bytecode are executed most often and then compiles those parts into native code (which should then execute at a speed similar to that of, e.g., compiled C code).
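The "count what runs often, then compile it" idea can be sketched as a toy in Python. This is only an illustration of the control flow, not how HotSpot is implemented: the expression language, `HOT_THRESHOLD` and `TieredFunction` are all made up, and "compiling" here means building Python closures rather than emitting native code.

```python
HOT_THRESHOLD = 3  # hypothetical value; real JVM thresholds are far higher and tunable


def interpret(expr, env):
    """Slow path: walk the expression tree and dispatch on its tag every time."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    left = interpret(expr[1], env)
    right = interpret(expr[2], env)
    if kind == "add":
        return left + right
    if kind == "mul":
        return left * right
    raise ValueError(f"unknown expression kind: {kind!r}")


def compile_expr(expr):
    """Fast path: resolve the tags once and return a closure (stand-in for native code)."""
    kind = expr[0]
    if kind == "num":
        value = expr[1]
        return lambda env: value
    if kind == "var":
        name = expr[1]
        return lambda env: env[name]
    left, right = compile_expr(expr[1]), compile_expr(expr[2])
    if kind == "add":
        return lambda env: left(env) + right(env)
    if kind == "mul":
        return lambda env: left(env) * right(env)
    raise ValueError(f"unknown expression kind: {kind!r}")


class TieredFunction:
    """Interprets until the call counter marks the code as hot, then switches."""

    def __init__(self, expr):
        self.expr = expr
        self.calls = 0
        self.compiled = None

    def __call__(self, env):
        if self.compiled is not None:
            return self.compiled(env)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # Hot enough: pay the compilation cost once, reuse it from the next call on.
            self.compiled = compile_expr(self.expr)
        return interpret(self.expr, env)


# x * x + 1, called five times; the last two calls take the compiled path.
f = TieredFunction(("add", ("mul", ("var", "x"), ("var", "x")), ("num", 1)))
results = [f({"x": i}) for i in range(5)]
print(results)  # [1, 2, 5, 10, 17]
```

Code that runs only once or twice never pays the compilation cost, which is why a VM built this way tends to start slower and get faster as the program keeps running.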

By the way, does PHP compile into bytecode, or is it just interpreted using an in-memory data structure? By first translating PHP into bytecode executable by the Java virtual machine, one automatically benefits from the existing (language-agnostic) optimizations of bytecode execution.