Why is this simplistic cpp function version slower

2019-09-03 15:55发布

问题:

Consider this comparison:

require(Rcpp)
require(microbenchmark)

cppFunction('int cppFun (int x) {return x;}')
RFun = function(x) x

x=as.integer(2)
microbenchmark(RFun=RFun(x),cppFun=cppFun(x),times=1e5)

Unit: nanoseconds
   expr  min   lq      mean median   uq      max neval cld
   RFun  302  357  470.2047    449  513   110152 1e+06  a 
 cppFun 2330 2598 4045.0059   2729 2879 68752603 1e+06   b

cppFun seems slower than RFun. Why is it so? Do the times for calling the functions differ? Or is it the function itself that differ in running time? Is it the time for passing and returning arguments? Is there some data conversion or data copying I am unaware of when the data are passed to (or returned from) cppFun?

回答1:

This simply is not a well-posed or thought-out question as the comments above indicate.

The supposed baseline of an empty function simply is not one. Every function created via cppFunction() et al will call one R function interfacing to some C++ function. So this simply cannot be equal.

Here is a slightly more meaningful comparison. For starters, let's make the R function complete with curlies. Second, let's call another compiler (internal) function:

require(Rcpp)
require(microbenchmark)

cppFunction('int cppFun (int x) {return x;}')
RFun1 <- function(x) { x }
RFun2 <- function(x) { .Primitive("abs")(x) }

print(microbenchmark(RFun1(2L), RFun2(2L), cppFun(2L), times=1e5))

On my box, I see a) a closer gap between versions 1 and 2 (or the C++ function) and b) little penalty over the internal function. But calling ANY compiled function from R has cost.

Unit: nanoseconds
       expr min   lq     mean median   uq     max neval
  RFun1(2L) 171  338  434.136    355  408 2659984 1e+05
  RFun2(2L) 683  937 1334.046   1257 1341 7605628 1e+05
 cppFun(2L) 721 1131 1416.091   1239 1385 8544656 1e+05

As we say in the real world: there ain't no free lunch.