Install R Packages in Azure ML

Posted 2019-02-19 19:22

Question:

It's my first time using Azure ML and I am having a rough time. I need to install multiple R packages that are not provided by default in Azure ML. To keep it simple, let's assume that I only need to install the forecast package.

Based on what is written here, I also need to plan the installation of the dependencies of the forecast package. However, based on the documentation, the forecast package has almost a dozen dependencies. Furthermore, these dependencies probably have dependencies that are not installed by default in Azure ML. In addition, it does not seem quite right to upload a zip file in Azure ML and to try to make all the dependencies work out.

Is there any other way to install the forecast package that is easier and simpler than what I found online? What do companies do? Uploading a zip file does not seem viable!

Answer 1:

You can use miniCRAN (https://cran.r-project.org/web/packages/miniCRAN/index.html) to build a zip file that includes all dependencies, then upload that zip file and install your required packages from it. miniCRAN also lets you choose the target platform (type = "win.binary") and the R version (Rversion = "3.1"), both of which are crucial when using Azure ML. There is a tutorial here (http://blog.revolutionanalytics.com/2015/10/using-minicran-in-azure-ml.html) that outlines the steps.
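The local side of this approach might look roughly like the sketch below. It is not taken from the tutorial: the function name `build_repo_zip`, the folder and file names, and the CRAN mirror URL are all assumptions for illustration. It needs the miniCRAN package and network access, and a mirror may no longer serve Windows binaries for an R version as old as 3.1.

```r
# Sketch only: build a local mini repository for `pkgs` and zip it for upload.
# All names and defaults here are hypothetical, not part of Azure ML or the tutorial.
build_repo_zip <- function(pkgs,
                           repo_dir = file.path(tempdir(), "miniCRAN"),
                           zip_file = file.path(getwd(), "packages.zip"),
                           cran = "https://cloud.r-project.org",
                           r_version = "3.1") {
  if (!requireNamespace("miniCRAN", quietly = TRUE)) {
    stop("Install miniCRAN first: install.packages('miniCRAN')")
  }
  force(zip_file)  # resolve the default path before the working directory changes
  dir.create(repo_dir, recursive = TRUE, showWarnings = FALSE)

  # Resolve the dependency tree (Suggests excluded to keep the bundle small).
  all_pkgs <- miniCRAN::pkgDep(pkgs, repos = cran, type = "win.binary",
                               suggests = FALSE, Rversion = r_version)

  # Download Windows binaries for the chosen R version into the local repo.
  miniCRAN::makeRepo(all_pkgs, path = repo_dir, repos = cran,
                     type = "win.binary", Rversion = r_version)

  # Zip the repo contents; utils::zip() needs an external zip program on the PATH.
  old_wd <- setwd(repo_dir)
  on.exit(setwd(old_wd), add = TRUE)
  utils::zip(zip_file, files = list.files(".", recursive = TRUE))

  invisible(all_pkgs)
}
```

Calling `build_repo_zip("forecast")` would then produce a single archive to upload as a dataset. How the archive is unpacked and installed inside the Execute R Script module is covered by the tutorial linked above.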



Answer 2:

Unfortunately, yes. You can do one of two things.

  1. First, figure out which of the dependencies are already installed in Azure ML. See this blog post.

    Use the Execute R Script module in Azure ML Studio and paste in the script below:

    out <- data.frame(installed.packages(fields = "Description"))
    maml.mapOutputPort("out")
    
  2. Collect all the packages your package depends on (Imports and LinkingTo), add them to the zip file in the correct order, and follow the instructions in the blog post you linked to.

I use option 1, since it limits the number of packages needed. But be aware of version differences between Azure ML and CRAN.
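Once you have the list of packages installed on Azure ML, working out what is still missing, and in which order to install it, can be done locally. The sketch below is one way to do that. The function name `missing_in_install_order` is hypothetical, and the small `toy_db` matrix is made-up data that only mimics the shape of `available.packages()` output; it does not reflect the real dependencies of any package.

```r
# Return the packages that still need uploading for `pkg`, dependencies first.
# `db` is a character matrix shaped like utils::available.packages() output;
# `installed` holds the package names already present on Azure ML.
missing_in_install_order <- function(pkg, db, installed = character()) {
  fields <- c("Depends", "Imports", "LinkingTo")
  direct_deps <- function(p) {
    tools::package_dependencies(p, db = db, which = fields)[[p]]
  }

  seen <- character()     # guards against visiting a package twice
  ordered <- character()  # result, in a valid install order

  visit <- function(p) {
    if (p %in% seen || p %in% installed) return(invisible(NULL))
    seen <<- c(seen, p)
    for (d in direct_deps(p)) visit(d)
    ordered <<- c(ordered, p)  # appended only after all of its dependencies
  }

  visit(pkg)
  ordered
}

# Toy dependency table (illustrative data only).
toy_db <- rbind(
  c("quanteda", NA, "ca, chron", NA),
  c("ca",       NA, NA,          NA),
  c("chron",    NA, NA,          NA)
)
colnames(toy_db) <- c("Package", "Depends", "Imports", "LinkingTo")

to_upload <- missing_in_install_order("quanteda", toy_db, installed = "chron")
print(to_upload)
```

For real use, `db` would be `available.packages()` and `installed` the `Package` column exported from the Execute R Script module. Include base packages such as stats and methods in `installed`, otherwise they will show up in the result.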



Answer 3:

There is another simple way to upload a custom package to Azure ML Studio. I have used quanteda as an example. First, empty the folder of installed packages. This avoids confusion between packages that were already in the local environment and those that have just been installed. Next, install the package. During installation, it is important to keep track of the packages being installed and the order of installation (as these packages may also have their own dependencies).

 - package ‘chron’ successfully unpacked and MD5 sums checked
 - package ‘RColorBrewer’ successfully unpacked and MD5 sums checked
 - ...
 - package ‘quanteda’ successfully unpacked and MD5 sums checked

Find all the relevant packages in the folder mentioned above: C:\Users\\Documents\R\win-library\. Each package then needs to be compressed separately. To save time, you can create a simple batch file that zips each folder in the directory using 7-Zip (for /d %%X in (*) do "c:\Program Files\7-Zip\7z.exe" a "%%X.zip" "%%X\").

Then put all the archives into a single archive and upload it to the Azure ML environment. NOTE: most of the packages that quanteda depends on are already installed on the Azure virtual machine, so there is no need to install them manually. The others, however, must be installed manually before installing quanteda. You can either compare the list of dependencies with the list of available packages, or upload everything, add the packages step by step, and look through the output log. For instance, installing quanteda directly without its dependencies (install.packages("src/quanteda.zip", lib = ".", repos = NULL, verbose = TRUE)) will generate the following error:

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
there is no package called 'ca'

This shows that all the dependencies that come before ‘ca’ are already pre-installed, so installing ‘ca’ first solves the issue. Thus, the following commands are needed to install quanteda:

install.packages("src/ca.zip", lib = ".", repos = NULL, verbose = TRUE)
install.packages("src/quanteda.zip", lib = ".", repos = NULL, verbose = TRUE)
library(quanteda, lib.loc=".", verbose=TRUE)

You are now able to use your custom packages.
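If more than one dependency turns out to be missing, the same install calls can be wrapped in a loop. This is a minimal sketch that assumes each archive is named after its package and sits in the src folder, as in the commands above. The helper name `install_local_zips` is hypothetical.

```r
# Install zipped packages from `src_dir` into `lib`, in the order given.
# Stops at the first missing archive or failed install so the log stays readable.
install_local_zips <- function(pkgs, src_dir = "src", lib = ".") {
  for (pkg in pkgs) {
    zip_path <- file.path(src_dir, paste0(pkg, ".zip"))
    if (!file.exists(zip_path)) {
      stop("Missing archive: ", zip_path)
    }
    install.packages(zip_path, lib = lib, repos = NULL, verbose = TRUE)

    # install.packages() only warns on failure, so check the result explicitly.
    if (!pkg %in% rownames(installed.packages(lib.loc = lib))) {
      stop("Installation failed for: ", pkg)
    }
  }
  invisible(pkgs)
}
```

With the example from this answer, `install_local_zips(c("ca", "quanteda"))` followed by `library(quanteda, lib.loc = ".")` would replace the three commands shown earlier. Dependencies must come first in the vector.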