accumulating functions and closures in R

Posted 2019-08-28 03:33

Question:

I am building an approximating function recursively (AdaBoost). I would like to build the resulting learned function along the way, that is, keep the function that produces the approximation rather than apply the approximation directly to my test data.

Unfortunately, it seems that R resolves the value a variable name refers to long after the name is used.

#defined in plyr as well
id <- function(x) {x}

#my first classifier 
modelprevious <- function(inputx, k) { k(0)}

#one step of my superb model
modelf <- function(x) 2*x #for instance

#I update my classifier
modelCurrent <- function(inputx, k) {
  modelprevious(inputx, function(res) { k(res + modelf(inputx)) })
}

#it works
modelCurrent(2,id) #4

#Problem
modelf <- function(x) 3*x
modelCurrent(2,id) #6 WTF !! 

The same function with the same arguments returns something different, which is quite annoying!

So how can I capture the value that modelf represents at binding time, so that the resulting function depends only on its arguments and not on some global state?


Given this problem, I don't see how one can build a function recursively in R if one cannot touch local variables, short of ugly hacks with quote/parse.

Answer 1:

You need a factory, that is, a function that builds and returns your model function:

modelCurrent = function(mf){
  return(function(inputx,k){
    modelprevious(
      inputx,
      function(res){
        k(res+mf(inputx))
      } # function(res)
      ) # modelprevious
  } # inner function
         ) # return
} # top function

Now you use the factory to create models with the modelf function that you want it to use:

> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
> modelf <- function(x) 3*x
> m1(2,id) # no change.
[1] 4

You can always make them on an ad-hoc basis:

> modelCurrent(modelf)(2,id)
[1] 6

and there you can see the factory created a function using the current definition of modelf, so it multiplied by three.

There's one last ginormous WTF!?! that will hit you. Watch carefully:

> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
>
> m1 = modelCurrent(modelf) # create a function using the 2* modelf
> modelf <- function(x) 3*x # change modelf...
> m1(2,id) # WTF?!
[1] 6

This is because R evaluates function arguments lazily: when the factory is called, mf isn't evaluated, since nothing in the factory body uses it. mf is only evaluated the first time the inner function is called, and by then modelf has been redefined.
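The delayed evaluation described above can be reproduced in a smaller setting. This is only a minimal sketch; make_adder, n0 and add are hypothetical names that are not part of the original question:

```r
# Hypothetical factory: the argument n is not evaluated when make_adder runs.
make_adder <- function(n) {
  function(x) x + n   # n is first looked up when this inner function runs
}

n0 <- 1
add <- make_adder(n0)  # n is still unevaluated at this point
n0 <- 10               # redefine n0 before the first call to add()

first <- add(1)        # n is evaluated now, so it sees n0 == 10
n0 <- 100
second <- add(1)       # n was fixed at the first call; this change is ignored

first   # 11
second  # 11
```

Once the argument has been evaluated its value is kept, which is why the second reassignment has no effect.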

The trick is to force evaluation of the mf in the outer function, typically using force:

modelCurrent = function(mf){
  force(mf)
  return(function(inputx,k){
    modelprevious(
      inputx,
      function(res){
        k(res+mf(inputx))
      } # function(res)
      ) # modelprevious
  } # inner function
         ) # return
} # top function
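As a quick check, the forced factory can be run against the redefinition that caused trouble earlier. The second half is a sketch of how the same idea might be chained to accumulate steps, as the question asks; add_step, a_i and the coefficients 2 and 3 are hypothetical, chosen only for illustration:

```r
id <- function(x) x
modelprevious <- function(inputx, k) k(0)

modelCurrent <- function(mf) {
  force(mf)  # evaluate mf now, while modelf still has its current value
  function(inputx, k) {
    modelprevious(inputx, function(res) k(res + mf(inputx)))
  }
}

modelf <- function(x) 2 * x
m1 <- modelCurrent(modelf)
modelf <- function(x) 3 * x
r1 <- m1(2, id)   # 4: the redefinition of modelf no longer leaks in

# Hypothetical helper: wrap an existing model with one more step.
add_step <- function(previous, mf) {
  force(previous)
  force(mf)
  function(inputx, k) previous(inputx, function(res) k(res + mf(inputx)))
}

model <- function(inputx, k) k(0)
for (a in c(2, 3)) {
  # local() gives each step its own copy of the coefficient
  step <- local({ a_i <- a; function(x) a_i * x })
  model <- add_step(model, step)
}
r2 <- model(2, id)  # 0 + 2*2 + 3*2 = 10
```

Forcing both previous and mf matters here: inside a loop, model and step are reassigned on every iteration, so an unforced argument could end up pointing at a later value.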

This has led me to premature baldness: if you forget this, suspect some odd bug, and stick print(mf) in to see what's going on, you'll be evaluating mf and thus getting the behaviour you wanted. By inspecting the data, you changed it! A Heisenbug!
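The Heisenbug can be seen side by side in a small sketch. lazy_factory and debug_factory are hypothetical names used only for this comparison:

```r
# No force(): mf is evaluated on the first call of the returned function.
lazy_factory <- function(mf) {
  function(x) mf(x)
}

# Same factory with a debugging print(): printing mf evaluates it,
# which has the same effect as force(mf).
debug_factory <- function(mf) {
  print(mf)
  function(x) mf(x)
}

f <- function(x) 2 * x
g_lazy  <- lazy_factory(f)
g_debug <- debug_factory(f)

f <- function(x) 3 * x  # redefine f before either function is called

r_lazy  <- g_lazy(2)    # 6: picks up the redefined f
r_debug <- g_debug(2)   # 4: print() already fixed mf to the 2 * x version
```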



Tags: r closures