I am working with the MovieLens dataset. I have an m × n matrix with user IDs as rows and movie IDs as columns, and I have applied dimensionality reduction and matrix factorization to reduce the sparse matrix to m × k, where k < n. I want to evaluate the performance using the k-nearest neighbors algorithm (my own code, not a library). I am using SparkR 1.6.2. I don't know how to split my dataset into training and test data in SparkR. I have tried native R functions (sample, subset, caret), but they are not compatible with Spark DataFrames. Please suggest how to perform cross-validation and train the classifier using my own function written in SparkR.
The sparklyr (https://spark.rstudio.com/) package provides simple functionality for partitioning data. For example, if we have a data frame called df in Spark, we could create a copy of it with compute() and then partition it with sdf_partition(), assigning the result to df_part. df_part would then refer to the partitioned data in Spark (sdf_partition() returns one Spark DataFrame reference per named partition). We could use collect() to copy a Spark DataFrame into an R data frame.
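The steps above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the local Spark connection, the toy ratings data frame, its column names, the 80/20 split, and the seed are all assumptions made for illustration.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation that sparklyr can connect to.
sc <- spark_connect(master = "local")

# Hypothetical stand-in for the reduced (m x k) user-by-factor matrix.
ratings <- data.frame(user_id = 1:100, f1 = rnorm(100), f2 = rnorm(100))
df <- copy_to(sc, ratings, "ratings", overwrite = TRUE)

# Materialise a copy in Spark, then split it into named partitions.
# The fractions and the seed are illustrative choices.
df_part <- df %>%
  compute("df_part") %>%
  sdf_partition(train = 0.8, test = 0.2, seed = 2017)

# df_part is an R list with one Spark DataFrame reference per partition.
# collect() copies a partition into an ordinary R data frame.
train_local <- collect(df_part$train)
test_local <- collect(df_part$test)

spark_disconnect(sc)
```

Spark splits rows at random, so the partition sizes are only approximately 80% and 20%. Passing more named fractions yields more partitions, which could serve as a starting point for cross-validation folds. Recent sparklyr releases also offer sdf_random_split() for the same purpose.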