I am trying to install the latest version of Sparkling Water that is compatible with my versions of h2o and Spark, from the following link: Sparkling Water Nightly Bleeding Edge.
I'm trying the following code:
install.packages("https://s3.amazonaws.com/h2o-release/sparkling-water/master/259_nightly/sparkling-water-2.3.259_nightly.zip",repos = NULL, type = "win.binary")
#install.packages('C:/Users/USER/Downloads/sparkling-water-2.3.259_nightly.zip',repos = NULL, type = "win.binary")
But it throws the following error:
Warning in install.packages : cannot open compressed file 'sparkling-water-2.3.258_nightly/DESCRIPTION', probable reason 'No such file or directory'
Error in install.packages : cannot open the connection
The latest stable version of rsparkling on CRAN can be installed as follows:
install.packages("rsparkling")
The installation works, but it is apparently not compatible with my version of h2o and/or Spark, because the as_h2o_frame function from rsparkling does not work.
What can I do to use rsparkling with my version of h2o?
Note:
- R version: 3.4.4
- packageVersion("sparklyr") is ‘0.8.0’
- packageVersion("h2o") is ‘3.21.0.4359’
I solved this issue after quite a bit of trial and error.
First, make sure that you have the right version of Java installed on your computer. Specifically, Java versions 9 and 10 can be problematic (see here). I have Java SE Development Kit 8u172 installed. To check which Java version is installed and active, type the following in your terminal:
java -version
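If you would rather run this check from within R, something like the following might do. It is only a rough sketch: java_major is a hypothetical helper I made up for illustration (not part of any package), and it assumes the usual output format of java -version, where Java 8 and earlier report themselves as "1.8.0_172" and Java 9+ as "9.0.4", "10.0.1", and so on.

```r
# Hypothetical helper: extract the major Java version from a line of
# `java -version` output, e.g. 'java version "1.8.0_172"' gives 8.
java_major <- function(version_line) {
  if (is.na(version_line) || !grepl('version "[^"]+"', version_line)) {
    return(NA_integer_)
  }
  ver <- sub('.*version "([^"]+)".*', "\\1", version_line)
  parts <- strsplit(ver, "[^0-9]+")[[1]]
  # Legacy scheme "1.x" (Java 8 and earlier) vs. the newer "x.y.z" scheme
  if (parts[1] == "1" && length(parts) > 1) as.integer(parts[2]) else as.integer(parts[1])
}

# `java -version` prints to stderr, so capture both streams.
out <- tryCatch(
  suppressWarnings(system2("java", "-version", stdout = TRUE, stderr = TRUE)),
  error = function(e) character(0)
)
major <- java_major(grep('version "', out, value = TRUE)[1])

if (is.na(major)) {
  message("Could not determine the Java version; is Java on your PATH?")
} else if (major >= 9) {
  message("Java ", major, " detected; versions 9 and 10 can be problematic here.")
} else {
  message("Java ", major, " detected.")
}
```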
Next, based on the table given here, I found a compatible combination: h2o version 3.18.0.11, Spark version 2.3.0, and Sparkling Water version 2.3.6.
So, install the following packages:
- for h2o (version 3.18.0.11):
install.packages("https://cran.r-project.org/src/contrib/Archive/h2o/h2o_3.18.0.11.tar.gz", repos=NULL, type="source")
- for sparklyr (version 0.8.4) and rsparkling (version 0.2.5):
install.packages(c("sparklyr","rsparkling"))
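Since install.packages pulls whatever is current on CRAN, it may be worth confirming what actually got installed before going further. Here is a minimal sketch: the expected versions are simply the ones from this answer, and installed_version is a hypothetical helper name, so adjust both to your own setup.

```r
# Versions taken from this answer; change them to match your own combination.
expected <- c(h2o = "3.18.0.11", sparklyr = "0.8.4", rsparkling = "0.2.5")

# Hypothetical helper: installed version as a string, or NA if the package is missing.
installed_version <- function(pkg) {
  tryCatch(as.character(utils::packageVersion(pkg)),
           error = function(e) NA_character_)
}

installed <- vapply(names(expected), installed_version, character(1))

report <- data.frame(
  package   = names(expected),
  expected  = unname(expected),
  installed = unname(installed),
  matches   = unname(!is.na(installed) & installed == expected),
  stringsAsFactors = FALSE
)
print(report)
```

A FALSE in the matches column does not necessarily mean things will break, only that you are running a different combination from the one described here.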
Then set the appropriate version of Sparkling Water before calling library(rsparkling). The rest of the code becomes:
options(rsparkling.sparklingwater.version = "2.3.6")
library(rsparkling)
library(sparklyr)
library(h2o)
Now you should be able to install Spark version 2.3.0, connect to it, and convert a Spark table to an H2O frame:
spark_install(version = "2.3.0")
sc <- spark_connect(master = "local", version = "2.3.0")
mtcars_tbl <- copy_to(sc, mtcars, "mtcars")
mtcars_h2o <- as_h2o_frame(sc, mtcars_tbl, strict_version_check = FALSE)
Hope this works for you too!