While going through Julia, I wanted to have functionality similar to Python's dis
module.
Searching around the net, I found out that the Julia community has worked on this issue and provided these (https://github.com/JuliaLang/julia/issues/218):
finfer -> code_typed
methods(function, types) -> code_lowered
disassemble(function, types, true) -> code_native
disassemble(function, types, false) -> code_llvm
I have tried these personally using the Julia REPL, but I find them quite hard to understand.
In Python, I can disassemble a function like this.
>>> import dis
>>> dis.dis(lambda x: 2*x)
1 0 LOAD_CONST 1 (2)
3 LOAD_FAST 0 (x)
6 BINARY_MULTIPLY
7 RETURN_VALUE
>>>
Can anyone who has worked with these help me understand them more? Thanks.
The standard CPython implementation of Python parses source code and does some pre-processing and simplification of it – aka "lowering" – transforming it to a machine-friendly, easy-to-interpret format called "bytecode". This is what is displayed when you "disassemble" a Python function. This code is not executable by the hardware – it is "executable" by the CPython interpreter. CPython's bytecode format is fairly simple, partly because that's what interpreters tend to do well with – if the bytecode is too complex, it slows down the interpreter – and partly because the Python community tends to put a high premium on simplicity, sometimes at the cost of high performance.
Julia's implementation is not interpreted, it is just-in-time (JIT) compiled. This means that when you call a function, it is transformed to machine code which is executed directly by the native hardware. This process is quite a bit more complex than the parsing and lowering to bytecode that Python does, but in exchange for that complexity, Julia gets its hallmark speed. (The PyPy JIT for Python is also much more complex than CPython but also typically much faster – increased complexity is a fairly typical cost for speed.) The four levels of "disassembly" for Julia code give you access to the representation of a Julia method implementation for particular argument types at different stages of the transformation from source code to machine code. I'll use the following function which computes the next Fibonacci number after its argument as an example:
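A minimal definition consistent with the operations discussed below (one(n), b < n, a + b) would look something like this:

function nextfib(n)
    # start from the first two Fibonacci numbers, using one(n) so the
    # arithmetic stays in the argument's type
    a, b = one(n), one(n)
    while b < n
        # step forward: each new value is the sum of the previous two
        a, b = b, a + b
    end
    return b
end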
Lowered code. The @code_lowered macro displays code in a format that is the closest to Python bytecode, but rather than being intended for execution by an interpreter, it's intended for further transformation by a compiler. This format is largely internal and not intended for human consumption. The code is transformed into "static single assignment" (SSA) form, in which "each variable is assigned exactly once, and every variable is defined before it is used". Loops and conditionals are transformed into gotos and labels using a single unless/goto construct (this is not exposed in user-level Julia). Here's our example code in lowered form (in Julia 0.6.0-pre.beta.134, which is just what I happen to have available):
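(Exact output varies between Julia versions; the label and SSAValue numbers below are illustrative.)

julia> @code_lowered nextfib(123)
CodeInfo(:(begin
        nothing
        SSAValue(0) = (Main.one)(n)
        SSAValue(1) = (Main.one)(n)
        a = SSAValue(0)
        b = SSAValue(1) # line 3:
        7:
        unless b < n goto 16 # line 4:
        SSAValue(2) = b
        SSAValue(3) = a + b
        a = SSAValue(2)
        b = SSAValue(3)
        14:
        goto 7
        16:  # line 6:
        return b
    end))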
You can see the SSAValue nodes and unless/goto constructs and label numbers. This is not that hard to read, but again, it's also not really meant to be easy for human consumption. Lowered code doesn't depend on the types of the arguments, except insofar as they determine which method body to call – as long as the same method is called, the same lowered code applies.

Typed code. The @code_typed macro presents a method implementation for a particular set of argument types after type inference and inlining. This incarnation of the code is similar to the lowered form, but with expressions annotated with type information and some generic function calls replaced with their implementations. For example, here is the typed code for our example function:
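(Again, the exact printout is version-dependent; this is roughly what Julia 0.6 shows.)

julia> @code_typed nextfib(123)
CodeInfo(:(begin
        a = 1
        b = 1 # line 3:
        4:
        unless (Base.slt_int)(b, n)::Bool goto 13 # line 4:
        SSAValue(2) = b
        SSAValue(3) = (Base.add_int)(a, b)::Int64
        a = SSAValue(2)
        b = SSAValue(3)
        11:
        goto 4
        13:  # line 6:
        return b
    end))=>Int64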
Calls to one(n) have been replaced with the literal Int64 value 1 (on my system the default integer type is Int64). The expression b < n has been replaced with its implementation in terms of the slt_int intrinsic ("signed integer less than") and the result of this has been annotated with return type Bool. The expression a + b has also been replaced with its implementation in terms of the add_int intrinsic and its result type annotated as Int64. And the return type of the entire function body has been annotated as Int64.

Unlike lowered code, which depends only on argument types to determine which method body is called, the details of typed code depend on argument types:
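(Approximate output; the exact formatting varies by version.)

julia> @code_typed nextfib(Int128(100))
CodeInfo(:(begin
        a = (Base.sext_int)(Int128, 1)::Int128
        b = (Base.sext_int)(Int128, 1)::Int128 # line 3:
        5:
        unless (Base.slt_int)(b, n)::Bool goto 14 # line 4:
        SSAValue(2) = b
        SSAValue(3) = (Base.add_int)(a, b)::Int128
        a = SSAValue(2)
        b = SSAValue(3)
        12:
        goto 5
        14:  # line 6:
        return b
    end))=>Int128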
This is the typed version of the nextfib function for an Int128 argument. The literal 1 must be sign extended to Int128 and the result types of operations are of type Int128 instead of Int64. The typed code can be quite different if the implementation of a type is considerably different. For example, nextfib for BigInts is significantly more involved than for simple "bits types" like Int64 and Int128:
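You can try it yourself; the output is far too long to usefully reproduce here:

julia> @code_typed nextfib(big(100))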
This reflects the fact that operations on BigInts are pretty complicated and involve memory allocation and calls to the external GMP library (libgmp).

LLVM IR. Julia uses the LLVM compiler framework to generate machine code. LLVM defines an assembly-like language which it uses as a shared intermediate representation (IR) between different compiler optimization passes and other tools in the framework. There are three isomorphic forms of LLVM IR:

1. A compact binary representation ("bitcode").
2. A textual representation, the assembly-like language, which is verbose and somewhat human-readable.
3. An in-memory representation, generated and consumed through LLVM's APIs.
Julia uses LLVM's C++ API to construct LLVM IR in memory (form 3) and then calls some LLVM optimization passes on that form. When you do @code_llvm you see the LLVM IR after generation and some high-level optimizations. Here's LLVM code for our ongoing example:
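(This is approximately what Julia 0.6 emits on x86-64; the function name's numeric suffix and the register and label numbers will differ on your machine.)

julia> @code_llvm nextfib(123)

define i64 @julia_nextfib_62047(i64) #0 {
top:
  br label %L4

L4:                                               ; preds = %L4, %top
  %storemerge1 = phi i64 [ 1, %top ], [ %storemerge, %L4 ]
  %storemerge = phi i64 [ 1, %top ], [ %2, %L4 ]
  %1 = icmp slt i64 %storemerge, %0
  %2 = add i64 %storemerge, %storemerge1
  br i1 %1, label %L4, label %L13

L13:                                              ; preds = %L4
  ret i64 %storemerge
}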
This is the textual form of the in-memory LLVM IR for the nextfib(123) method implementation. LLVM is not easy to read – it's not intended to be written or read by people most of the time – but it is thoroughly specified and documented. Once you get the hang of it, it's not hard to understand. This code jumps to the label L4 and initializes the "registers" %storemerge1 and %storemerge with the i64 (LLVM's name for Int64) value 1 (their values are derived differently when jumped to from different locations – that's what the phi instruction does). It then does an icmp slt comparing %storemerge with register %0 – which holds the argument untouched for the entire method execution – and saves the comparison result into the register %1. It does an add i64 on %storemerge and %storemerge1 and saves the result into register %2. If %1 is true, it branches back to L4 and otherwise it branches to L13. When the code loops back to L4, the register %storemerge1 gets the previous value of %storemerge and %storemerge gets the previous value of %2.
Native code. Since Julia executes native code, the last form a method implementation takes is what the machine actually executes. This is just binary code in memory, which is rather hard to read, so long ago people invented various forms of "assembly language" which represent instructions and registers with names and have some amount of simple syntax to help express what instructions do. In general, assembly language remains in close to one-to-one correspondence with machine code; in particular, one can always "disassemble" machine code into assembly code. Here's our example:
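(Representative output from Julia 0.6 on an x86-64 machine; the exact registers, labels, and padding will differ across machines and versions.)

julia> @code_native nextfib(123)
        .text
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $1, %ecx
        movl    $1, %edx
L16:
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rax, %rcx
        cmpq    %rdi, %rax
        jl      L16
        popq    %rbp
        retq
        nopw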
This is on an Intel Core i7, which is in the x86_64 CPU family. It only uses standard integer instructions, so it doesn't matter beyond that what the architecture is, but you can get different results for some code depending on the specific architecture of your machine, since JIT code can be different on different systems. The pushq and movq instructions at the beginning are a standard function preamble, saving registers to the stack; similarly, popq restores the registers and retq returns from the function; nopw is a 2-byte instruction that does nothing, included just to pad the length of the function. So the meat of the code is just this:
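(Register names and the L16 label are again just what my machine happens to produce.)

        movl    $1, %ecx
        movl    $1, %edx
L16:
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rax, %rcx
        cmpq    %rdi, %rax
        jl      L16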
The movl instructions at the top initialize registers with 1 values. The movq instructions move values between registers and the addq instruction adds registers. The cmpq instruction compares two registers and jl either jumps back to L16 or continues on to return from the function. This handful of integer machine instructions in a tight loop is exactly what executes when your Julia function call runs, presented in a slightly more pleasant human-readable form. It's easy to see why it runs fast.

If you're interested in JIT compilation in general as compared to interpreted implementations, Eli Bendersky has a great pair of blog posts where he goes from a simple interpreter implementation of a language to a (simple) optimizing JIT for the same language.