SECOND EDIT: This pull request on GitHub will fix the issue. As long as one is running Julia v0.5+, anonymous functions will be just as fast as regular functions. So case closed.
EDIT: I've updated the question and function definitions to a more general case.
In a simple example, the Julia compiler does not appear to optimize code when a function is passed as an argument to another function, or when a function is assigned to a local variable inside another function. This surprises me, as this pattern must be very common in optimization packages. Am I correct, or am I doing something stupid? A simple example follows:
f(a::Int, b::Int) = a - b #A simple function

function g1(N::Int, fIn::Function) #Case 1: Passing in a function
    z = 0
    for n = 1:N
        z += fIn(n, n)
    end
    return(z)
end

function g2(N::Int) #Case 2: Function assigned to a local variable inside a function
    fAnon = f
    z = 0
    for n = 1:N
        z += fAnon(n, n)
    end
    return(z)
end

function g3(N::Int) #Case 3: Function called directly by name
    z = 0
    for n = 1:N
        z += f(n, n)
    end
    return(z)
end
Then I run the following code to time the three cases:
#Run each function once to force compilation before timing
g1(10, f)
g2(10)
g3(10)
@time g1(100000000, f)
@time g2(100000000)
@time g3(100000000)
And the timings are:
elapsed time: 5.285407555 seconds (3199984880 bytes allocated, 33.95% gc time)
elapsed time: 5.424531599 seconds (3199983728 bytes allocated, 32.59% gc time)
elapsed time: 2.473e-6 seconds (80 bytes allocated)
Lots of memory allocation and garbage collection for the first two cases. Could anyone explain why?
So a fun thing to do is to use @code_warntype in Julia 0.4, which shows the following:
julia> @code_warntype g1(10, f)
Variables:
N::Int64
fIn::F
z::Any
#s1::Int64
n::Int64
Body:
begin # none, line 2:
z = 0 # line 3:
... snip ....
z = z + (fIn::F)(n::Int64,n::Int64)::Any::Any
So the problem is with inference of the return type of f, which really could be anything. The problem (as I understand it) is that Julia compiles a specialized method for each combination of argument types, and here the code is generated for any Function, so anything could come back from the call. It'd be neat if Function were parametric on the return type, because then we could do something smarter like Function{T<:Any,Int}.
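To see why, note that in Julia 0.4 every generic function shares the same type, so the argument type tells inference nothing about what the call will return. A small illustration of my own (not part of the original code), assuming the definitions above are loaded:
typeof(f)                         #Function -- the same type as every other generic function in 0.4
Base.return_types(f, (Int, Int))  #[Int64]  -- f by itself infers just fine
#but inside g1 the compiler only knows fIn::Function, so fIn(n, n) is inferred as Any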
My solution was to change the accumulation to z += fIn(n, n)::Int, which allows z to always be an Int, but I still see
(top(typeassert))((fIn::F)(n::Int64,n::Int64)::Any,Int)::Int64
in the @code_warntype output, which makes sense, because the call really does still return an Any; I'm just making sure that doesn't pollute the rest of the function. I think it still has to generate code to check that the result is in fact an Int, though. Let's call this new version g1A:
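For reference, g1A is just g1 with that assertion added (this definition is my reconstruction from the description above, not code taken verbatim from the original post):
function g1A(N::Int, fIn::Function) #g1 with a return-type assertion on the call
    z = 0
    for n = 1:N
        z += fIn(n, n)::Int #assert the result is an Int so z stays concretely typed
    end
    return(z)
end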
julia> @time g1(1000000, f)
elapsed time: 0.124437357 seconds (30 MB allocated, 2.82% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.121653131 seconds (30 MB allocated, 2.51% gc time in 2 pauses with 0 full sweep)
elapsed time: 0.120805345 seconds (30 MB allocated, 1.17% gc time in 1 pauses with 0 full sweep)
julia> @time g1A(1000000, f)
elapsed time: 0.085875439 seconds (30 MB allocated, 5.20% gc time in 1 pauses with 0 full sweep)
elapsed time: 0.074592531 seconds (30 MB allocated, 4.67% gc time in 2 pauses with 0 full sweep)
elapsed time: 0.078681071 seconds (30 MB allocated, 4.75% gc time in 1 pauses with 0 full sweep)
So some gains, but not ideal. This is a known issue that goes deep into Julia's inner workings. Related discussion:
- #1090 return type declarations
- #210 function types
- #1864 anonymous function calls have a huge overhead
- #9863 Someone needs to write a FAQ entry on functions-as-variables
This is fixed in Julia v0.5: all three cases should now give the same performance as g3.
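A quick way to check this on v0.5 or later (a minimal sketch of my own, assuming the definitions from the question are loaded and the BenchmarkTools package is installed):
using BenchmarkTools

@btime g1(10^6, $f)  #passing the function as an argument
@btime g2(10^6)      #function bound to a local variable inside the caller
@btime g3(10^6)      #calling f directly
#on v0.5+ all three should report similar times and essentially no allocation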