I have some R code:
time.read = system.time(df <- data.frame(fread(f)))
print(class(time.read))
#[1] "proc_time"
print(class(df))
#[1] "data.frame"
Somehow when this is executed, in the main R environment/scope:
- time.read has a value
- df exists and contains the correct data.frame
I thought variables created inside a function were not available outside of the function's scope? How does this work? And why after running the following does y not exist in the main R environment?
fx <- function(z){return(1)}
out = fx(y <- 300)
print(out)
#[1] 1
print(y)
#Error in print(y) : object 'y' not found
Thanks!
Great question! R does something peculiar with its arguments, which causes a lot of confusion but is also very useful.
When you pass an argument into a function in R, it doesn’t get evaluated until it’s actually used inside the function. Before that, the argument just sits around in a special container called a promise. Promises hold an expression and the environment in which they are supposed to be evaluated – for arguments, that’s the caller’s environment.
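A minimal sketch of that deferral, assuming a hypothetical function ignores_arg (an illustrative name, not something from R or from the question) that never touches its argument. Because the promise is never evaluated, neither an error nor an assignment written in the call takes effect:

```r
# Hypothetical function that never touches its argument.
ignores_arg <- function(z) {
  "done"
}

# The argument expression would raise an error if it were evaluated,
# but the promise is never forced, so the call succeeds.
result <- ignores_arg(stop("this is never evaluated"))

# Likewise, the assignment inside the call never runs, so no variable
# named `lazy_value` is created in the calling environment.
ignores_arg(lazy_value <- 300)
was_created <- exists("lazy_value", inherits = FALSE)
```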
But as soon as you use the argument inside the function, its value is computed. This is how system.time works. In simplified terms, the function records the time before looking at its argument. Then it looks at its argument, which causes its evaluation, and then it records the time elapsed. But remember that the evaluation of the argument happens in the caller’s scope, so in your case the target of the assignment (df) is also visible in the parent scope.
In your second example, your function fx never looks at its argument, so it never gets evaluated. You can easily change that, forcing the evaluation of its argument, simply by using it, for instance by mentioning z in the function body before the return. In fact, R has a special function, force, for this purpose. But force is simply syntactic sugar: its definition is simply to return its argument.
The fact that R doesn’t evaluate its arguments immediately is useful because you can also retrieve the unevaluated form inside the function. This is known as non-standard evaluation, and it’s sometimes used to evaluate the expression in a different scope (using the eval function with its argument envir specified), or to retrieve information about the unevaluated expression.
Many functions use this, most prominently plot, which guesses default axis labels based on the plotted variables/expressions. For a call such as plot(x, sin(x)), the axis labels are x and sin(x). The plot function knows this because inside it, it can look at the unevaluated expressions of its function arguments: substitute retrieves the unevaluated expression, and deparse converts it into a string representation.
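The simplified view of system.time and the forced variant of fx described above can be sketched roughly as follows. The names my_system_time, fx_forced and timed_value are made up for illustration, and my_system_time is only a stand-in for the idea, not the actual implementation of system.time:

```r
# Simplified, hypothetical stand-in for system.time (not the real implementation).
my_system_time <- function(expr) {
  start <- proc.time()   # record the time before touching the argument
  expr                   # using the argument forces its evaluation
  proc.time() - start    # time elapsed since the start
}

# The assignment is evaluated in the caller's environment, so `timed_value`
# exists there after the call.
elapsed <- my_system_time(timed_value <- sum(1:1000))

# Variant of the question's fx that forces its argument before returning.
fx_forced <- function(z) {
  force(z)
  1
}
out <- fx_forced(y <- 300)
```

After running this, both timed_value and y exist in the calling environment, because each function touched its argument.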
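As a rough illustration of the substitute/deparse mechanism and of eval with envir, here is a small sketch. The helper label_of and the variables scope and total are hypothetical names chosen for this example; this is not how plot itself is written:

```r
# Hypothetical helper that reports the expression it was called with.
label_of <- function(x) {
  deparse(substitute(x))
}

x <- seq(0, 2 * pi, length.out = 50)
x_label <- label_of(x)        # the string "x"
y_label <- label_of(sin(x))   # the string "sin(x)"

# Evaluating an unevaluated expression in a different scope via eval/envir.
expr <- quote(a + b)
scope <- list2env(list(a = 1, b = 2))
total <- eval(expr, envir = scope)
```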