How to efficiently implement closures in LLVM IR?

2020-05-22 04:40发布

问题:

I started adding closures (lambdas) to my language that uses LLVM as the backend. I have implemented them for simple cases where they can be always inlined i.e. code for the closure definition itself doesn't need to be generated, as it is inlined where used.

But how to generate the code for a closure in case the closure isn't always inlined (for example, it is passed to another function that isn't inlined). Preferably, the call sites shouldn't care whether they are passed regular functions or closures and would call them as normal functions.

I could generate a function with a synthetic name, but it would have to take the referencing environment as an extra argument and that function couldn't just be passed to another function that doesn't know about the needed extra argument.

I have thought of one possible solution using LLVM's trampoline intrinsics, which "excise" a single parameter from a function, returning a pointer to a trampoline function that takes one less parameter. In this case, if the function generated for the closure took the referencing environment as the first parameter, I could excise it and get back a function that takes exactly as many parameters as the closure actually declares. Does this sound doable? Efficient? Are there any better solutions?

Code example:

def applyFunctionTo(value: Int, f: (Int) -> Int) = f(value)

def main() = {
  val m := 4;
  val n := 5;
  val lambda := { (x: Int) => x + m + n };
  applyFunctionTo(3, lambda)
}

Now, lets imagine that this wouldn't get inlined to def main() = 3 + 4 + 5, and that applyFunctionTo would possibly be compiled separately, and we can't change the call site there. With trampolining, I imagine the generated code would be something like this (expressed in pseudocode, * means pointer):

def main$lambda$1(env: {m: Int, n: Int}*, x: Int) = x + env.m + env.n
def main() = {
  m = 4
  n = 5
  env* = allocate-space-for {Int, Int}
  env = {m, n}
  tramp* = create-trampoline-for(main$lambda$1*, env*)
  return applyFunctionTo(3, tramp*)
  // release memory for env and trampoline if the lambda didn't escape
}

Does this seem right?

回答1:

Sounds doable and efficient.

The alternative way, that does not need trampolines, is to define closure type as a pair of function pointer and pointer to environment ie stack pointer. In C calling convention the extra arguments are ignored so if you provide environment as last argument you can even use (function_ptr, null) as callback for regular function.



回答2:

A dumb idea would be that for each closure you generate a thread local structure to hold the required data (could be just a pointer to a local structure, or several pointers).

The creator of the closure is the responsible for setting the TLS variables and "saving" the state they had (to allow recursive call).

The user then calls the function normally, it's executed and use the environemnt.

After the call, the creator of the closure "restores" the original values into the TLS variables.