Applying the optim function in R in C++ with Rcpp

2020-06-27 15:32发布

问题:

I am trying to call R function optim() in Rcpp. I saw an example in Calling R's optim function from within C++ using Rcpp, but I am unable to modify it correctly for my use case. Basically, the objective function depends on the x and y but I want to optimize it with respect to b.

Here is the R code that does what I want:

example_r = function(b, x, y) {
  phi = rnorm(length(x))

  tar_val = (x ^ 2 + y ^ 2) * b * phi

  objftn_r = function(beta, x, y) {
    obj_val = (x ^ 2 + y ^ 2) * beta

    return(obj_val)
  }

  b1 = optim(b, function(beta) {
    sum((objftn_r(beta, x, y) - tar_val) ^ 2)
  }, method = "BFGS")$par

  result = (x ^ 2 + y ^ 2) * b1

  return(b1)
}

Here's is my attempt to translate it to _RcppArmadillo:

#include <RcppArmadillo.h>
using namespace Rcpp;

// [[Rcpp::depends(RcppArmadillo)]]

arma::vec example_rcpp(arma::vec b, arma::vec x, arma::vec y){

  arma::vec tar_val = pow(x,2)%b-pow(y,2);

  return tar_val;
}

// [[Rcpp::export]]
arma::vec optim_rcpp(const arma::vec& init_val, arma::vec& x, arma::vec& y){

  Rcpp::Environment stats("package:stats"); 
  Rcpp::Function optim = stats["optim"];

  Rcpp::List opt_results = optim(Rcpp::_["par"]    = init_val,
                                 Rcpp::_["fn"]     = Rcpp::InternalFunction(&example_rcpp),
                                 Rcpp::_["method"] = "BFGS");

  arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);

  return out;
} 

However, this code is returning:

> optim_rcpp(1:3,2:4,3:5)
Error in optim_rcpp(1:3, 2:4, 3:5) : not compatible with requested type

I'm not sure what the error is here.

回答1:

Before we begin, I have a few remarks:

  1. Please show all of your attempt.
    • In particular, make sure your example is a minimal reproducible example
  2. Do not delete or shorten code unless asked.
  3. Keep the scope of your question narrow.
    • Using optim from R in C++ is very different than using in C++ the underlying C++ code for opt() from nlopt.
  4. Avoid spamming questions.
    • If you find yourself asking more than 3 questions in rapid succession, please read the documentation or talk in person with someone familiar with the content.

I've cleaned up your question as a result... But, in the future, this likely will not happen.

Data Generation Process

The data generation process seems to be done in 2 steps: First, outside of the example_r function, and, then inside the function.

This should be simplified so that it is done outside of the optimization function. For example:

generate_data = function(n, x_mu = 0, y_mu = 1, beta = 1.5) {

  x = rnorm(n, x_mu)
  y = rnorm(n, y_mu)

  phi = rnorm(length(x))

  tar_val = (x ^ 2 + y ^ 2) * beta * phi

  simulated_data = list(x = x, y = y, beta = beta, tar_val = tar_val)
  return(simulated_data)
}

Objective Functions and R's optim

Objective functions must return a single value, e.g. a scalar, in R. Under the posted R code, there was effectively two functions designed to act as an objective function in sequence, e.g.

objftn_r = function(beta, x, y) {
  obj_val = (x ^ 2 + y ^ 2) * beta

  return(obj_val)
}

b1 = optim(b, function(beta) {
  sum((objftn_r(beta, x, y) - tar_val) ^ 2)
}, method = "BFGS")$par

This objective function should therefore be re-written as:

objftn_r = function(beta_hat, x, y, tar_val) {

  # The predictions generate will be a vector
  est_val = (x ^ 2 + y ^ 2) * beta_hat

  # Here we apply sum of squares which changes it
  # from a vector into a single "objective" value
  # that optim can work with.
  obj_val = sum( ( est_val  - tar_val) ^ 2)

  return(obj_val)
}

From there, the calls should align as:

sim_data = generate_data(10, 1, 2, .3)

b1 = optim(sim_data$beta, fn = objftn_r, method = "BFGS",
           x = sim_data$x, y = sim_data$y, tar_val = sim_data$tar_val)$par

RcppArmadillo Objective Functions

Having fixed the scope and behavior of the R code, let's focus on translating it into RcppArmadillo.

In particular, notice that the objection function defined after the translation returns a vector and not a scalar into optim, which is not a single value. Also of concern is the lack of a tar_val parameter in the objective function. With this in mind, the objective function will translate to:

// changed function return type and 
// the return type of first parameter
double obj_fun_rcpp(double& beta_hat, 
                    arma::vec& x, arma::vec& y, arma::vec& tar_val){

  // Changed from % to * as it is only appropriate if  
  // `beta_hat` is the same length as x and y.
  // This is because it performs element-wise multiplication
  // instead of a scalar multiplication on a vector
  arma::vec est_val = (pow(x, 2) - pow(y, 2)) * beta_hat;

  // Compute objective value
  double obj_val = sum( pow( est_val - tar_val, 2) );

  // Return a single value
  return obj_val;
}

Now, with the objective function set, let's address the Rcpp call into R for optim() from C++. In this function, the parameters of the function must be explicitly supplied. So, x, y, and tar_val must be present in the optim call. Thus, we will end up with:

// [[Rcpp::export]]
arma::vec optim_rcpp(double& init_val,
                     arma::vec& x, arma::vec& y, arma::vec& tar_val){

  // Extract R's optim function
  Rcpp::Environment stats("package:stats"); 
  Rcpp::Function optim = stats["optim"];

  // Call the optim function from R in C++ 
  Rcpp::List opt_results = optim(Rcpp::_["par"]    = init_val,
                                 // Make sure this function is not exported!
                                 Rcpp::_["fn"]     = Rcpp::InternalFunction(&obj_fun_rcpp),
                                 Rcpp::_["method"] = "BFGS",
                                 // Pass in the other parameters as everything
                                 // is scoped environmentally
                                 Rcpp::_["x"] = x,
                                 Rcpp::_["y"] = y,
                                 Rcpp::_["tar_val"] = tar_val);

  // Extract out the estimated parameter values
  arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);

  // Return estimated values
  return out;
}

All together

The full functioning code can be written in test_optim.cpp and compiled via sourceCpp() as:

#include <RcppArmadillo.h>

// [[Rcpp::depends(RcppArmadillo)]]

// changed function return type and 
// the return type of first parameter
// DO NOT EXPORT THIS FUNCTION VIA RCPP ATTRIBUTES
double obj_fun_rcpp(double& beta_hat, 
                    arma::vec& x, arma::vec& y, arma::vec& tar_val){

  // Changed from % to * as it is only appropriate if  
  // `beta_hat` is the same length as x and y.
  // This is because it performs element-wise multiplication
  // instead of a scalar multiplication on a vector
  arma::vec est_val = (pow(x, 2) - pow(y, 2)) * beta_hat;

  // Compute objective value
  double obj_val = sum( pow( est_val - tar_val, 2) );

  // Return a single value
  return obj_val;
}


// [[Rcpp::export]]
arma::vec optim_rcpp(double& init_val,
                     arma::vec& x, arma::vec& y, arma::vec& tar_val){

  // Extract R's optim function
  Rcpp::Environment stats("package:stats"); 
  Rcpp::Function optim = stats["optim"];

  // Call the optim function from R in C++ 
  Rcpp::List opt_results = optim(Rcpp::_["par"]    = init_val,
                                 // Make sure this function is not exported!
                                 Rcpp::_["fn"]     = Rcpp::InternalFunction(&obj_fun_rcpp),
                                 Rcpp::_["method"] = "BFGS",
                                 // Pass in the other parameters as everything
                                 // is scoped environmentally
                                 Rcpp::_["x"] = x,
                                 Rcpp::_["y"] = y,
                                 Rcpp::_["tar_val"] = tar_val);

  // Extract out the estimated parameter values
  arma::vec out = Rcpp::as<arma::vec>(opt_results[0]);

  // Return estimated values
  return out;
}

Test Case

# Setup some values
beta = 2
x = 2:4
y = 3:5

# Set a seed for reproducibility
set.seed(111)

phi = rnorm(length(x))

tar_val = (x ^ 2 + y ^ 2) * beta * phi

optim_rcpp(beta, x, y, tar_val)
#          [,1]
# [1,] 2.033273

Note: If you would like to avoid a matrix of size 1 x1 from being returned please use double as the return parameter of optim_rcpp and switch Rcpp::as<arma::vec> to Rcpp::as<double>