Julia: invoke a function by a given string

Posted 2020-02-03 07:20

Question:

Does Julia support reflection the way Java does?

What I need is something like this:

str = ARGS[1] # str is a string
# invoke the function str()

Answer 1:

The Good Way

The recommended way to do this is to convert the function name to a symbol and then look up that symbol in the appropriate namespace:

julia> fn = "time"
"time"

julia> Symbol(fn)
:time

julia> getfield(Main, Symbol(fn))
time (generic function with 2 methods)

julia> getfield(Main, Symbol(fn))()
1.448981716732318e9

You can change Main here to any module to look up functions in that module only. This lets you restrict the set of callable functions to those available in that module. You can use a "bare module" to create a namespace that contains only the functions you populate it with, without importing all names from Base by default.
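As a rough sketch of the bare-module idea, the snippet below builds a small namespace and looks names up in it. The module name Allowed and the helper call_by_name are hypothetical names chosen for illustration, not part of Julia or of the original answer. Note that names exported by Core should still be visible inside a baremodule, so treat this as a narrowed namespace rather than a hardened sandbox.

```julia
# Hypothetical whitelist module: only the names imported here (plus Core) are visible.
baremodule Allowed
using Base: time, deg2rad
end

# Hypothetical helper: look up `name` in `mod` and call it with `args`.
function call_by_name(mod::Module, name::AbstractString, args...)
    sym = Symbol(name)
    # Reject anything that is not bound in the chosen module.
    isdefined(mod, sym) || throw(ArgumentError("unknown function: $name"))
    f = getfield(mod, sym)
    # Reject bindings that exist but are not functions (e.g. submodules).
    f isa Function || throw(ArgumentError("$name is not a function"))
    return f(args...)
end

half_turn = call_by_name(Allowed, "deg2rad", 180)  # same as deg2rad(180)
```

A name that was never imported into Allowed, such as "println", fails the isdefined check and raises an ArgumentError instead of being called.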

The Bad Way

A different approach that is not recommended but which many people seem to reach for first is to construct a string for code that calls the function and then parse that string and evaluate it. For example:

julia> eval(parse("$fn()")) # NOT RECOMMENDED
1.464877410113412e9

While this is temptingly simple, it's not recommended since it is slow, brittle and dangerous. Parsing and evaling code is inherently much more complicated and thus slower than doing a name lookup in a module – name lookup is essentially just a hash table lookup. In Julia, where code is just-in-time compiled rather than interpreted, eval is much slower and more expensive since it doesn't just involve parsing, but also generating LLVM code, running optimization passes, emitting machine code, and then finally calling a function. Parsing and evaling a string is also brittle since all intended meaning is discarded when code is turned into text. Suppose, for example, someone accidentally provides an empty function name – then the fact that this code is intended to call a function is completely lost by accidental similarity of syntaxes:

julia> fn = ""
""

julia> eval(parse("$fn()"))
()

Oops. That's not what we wanted at all. In this case the behavior is fairly harmless but it could easily be much worse:

julia> fn = "println(\"rm -rf /important/directory\"); time"
"println(\"rm -rf /important/directory\"); time"

julia> eval(parse("$fn()"))
rm -rf /important/directory
1.448981974309033e9

If the user's input is untrusted, this is a massive security hole. Even if you trust the user, it is still possible for them to accidentally provide input that will do something unexpected and bad. The name lookup approach avoids these issues:

julia> getfield(Main, Symbol(fn))()
ERROR: UndefVarError: println("rm -rf /important/directory"); time not defined
 in eval(::Module, ::Any) at ./boot.jl:225
 in macro expansion at ./REPL.jl:92 [inlined]
 in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:46

The intent of looking up a name and then calling it as a function is explicit, instead of implicit in the generated string syntax, so at worst one gets an error about a strange name being undefined.

Performance

If you're going to call a dynamically specified function in an inner loop or as part of some recursive computation, you will want to avoid doing a getfield lookup every time you call the function. In this case all you need to do is make a const binding to the dynamically specified function before defining the iterative/recursive procedure that calls it. For example:

fn = "deg2rad" # converts angles in degrees to radians

const f = getfield(Main, Symbol(fn))

function fast(n)
    t = 0.0
    for i = 1:n
        t += f(i)
    end
    return t
end

julia> @time fast(10^6) # once for JIT compilation
  0.010055 seconds (2.97 k allocations: 142.459 KB)
8.72665498661791e9

julia> @time fast(10^6) # now it's fast
  0.003055 seconds (6 allocations: 192 bytes)
8.72665498661791e9

julia> @time fast(10^6) # see?
  0.002952 seconds (6 allocations: 192 bytes)
8.72665498661791e9

The binding f must be constant for optimal performance, since otherwise the compiler can't know that you won't change f to point at another function at any time (or even something that's not a function), so it has to emit code that looks f up dynamically on every loop iteration – effectively the same thing as if you manually call getfield in the loop. Here, since f is const, the compiler knows f can't change so it can emit fast code that just calls the right function directly. But the compiler can sometimes do even better than that – in this case it actually inlines the implementation of the deg2rad function, which is just a multiplication by pi/180:

julia> @code_llvm fast(100000)

define double @julia_fast_51089(i64) #0 {
top:
  %1 = icmp slt i64 %0, 1
  br i1 %1, label %L2, label %if.preheader

if.preheader:                                     ; preds = %top
  br label %if

L2.loopexit:                                      ; preds = %if
  br label %L2

L2:                                               ; preds = %L2.loopexit, %top
  %t.0.lcssa = phi double [ 0.000000e+00, %top ], [ %5, %L2.loopexit ]
  ret double %t.0.lcssa

if:                                               ; preds = %if.preheader, %if
  %t.04 = phi double [ %5, %if ], [ 0.000000e+00, %if.preheader ]
  %"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
  %2 = add i64 %"#temp#.03", 1
  %3 = sitofp i64 %"#temp#.03" to double
  %4 = fmul double %3, 0x3F91DF46A2529D39         ; deg2rad(x) = x*(pi/180)
  %5 = fadd double %t.04, %4
  %6 = icmp eq i64 %"#temp#.03", %0
  br i1 %6, label %L2.loopexit, label %if
}
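To see the effect of the const binding for yourself, one option is to define the same loop twice, once over a non-const global and once over a const one, and compare them with @time. This is only a sketch: the names g, h, sum_nonconst and sum_const are made up for the comparison, and the lookup uses Base rather than Main so that it does not depend on where the code is run. No timings are shown because they vary by machine and Julia version.

```julia
fn = "deg2rad"

g = getfield(Base, Symbol(fn))        # non-const global: may be rebound at any time
const h = getfield(Base, Symbol(fn))  # const global: the binding cannot change

# Loop over the non-const binding; the compiler cannot assume what g is.
function sum_nonconst(n)
    t = 0.0
    for i in 1:n
        t += g(i)
    end
    return t
end

# Same loop over the const binding; the call target is known at compile time.
function sum_const(n)
    t = 0.0
    for i in 1:n
        t += h(i)
    end
    return t
end

a = sum_nonconst(1000)
b = sum_const(1000)
```

Both functions return the same value. Running @time sum_nonconst(10^6) and @time sum_const(10^6) a second time each (after compilation) should show the difference in speed and allocations.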

If you need to do this with many different dynamically specified functions and you're using Julia 0.5 or later, you can simply pass the function to be called as an argument:

function fast(f,n)
    t = 0.0
    for i = 1:n
        t += f(i)
    end
    return t
end

julia> @time fast(getfield(Main, Symbol(fn)), 10^6)
  0.007483 seconds (1.70 k allocations: 76.670 KB)
8.72665498661791e9

julia> @time fast(getfield(Main, Symbol(fn)), 10^6)
  0.002908 seconds (6 allocations: 192 bytes)
8.72665498661791e9

This generates the same fast code as single-argument fast above, but will generate a new version for every different function f that you call it with.
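A small sketch of that pattern with several names follows. The helper sum_over and the list fn_names are hypothetical, and the lookup is done in Base instead of Main to keep the snippet self-contained. Each name is resolved once, outside the loop, and the resulting function is passed in as an argument.

```julia
# Same loop as the two-argument fast above, under a different (hypothetical) name.
function sum_over(f, n)
    t = 0.0
    for i in 1:n
        t += f(i)
    end
    return t
end

fn_names = ["deg2rad", "sqrt", "sin"]

# Resolve each name once, then call the specialized loop with the function itself.
results = Dict(name => sum_over(getfield(Base, Symbol(name)), 10) for name in fn_names)
```

Each distinct function compiles its own specialization of sum_over on first use, so the first call per name pays the compilation cost and later calls with the same function do not.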