I have the following data frame:
> my.df
x y
1 0.4597406 0.8439140
2 0.4579697 0.7461805
3 0.5593259 0.6646701
4 0.3607346 0.7792931
5 0.8377520 1.0445919
6 0.5597406 1.0445919
I want to create all possible combinations of x and y, like this:
> my.df
x y
1 0.4597406 0.8439140
2 0.4597406 0.7461805
3 0.4597406 0.6646701
4 0.4597406 0.7792931
5 0.4597406 1.0445919
6 0.4597406 1.0445919
7 0.4579697 0.8439140
8 0.4579697 0.7461805
9 0.4579697 0.6646701
...
(Not all combinations are shown here; this only illustrates the format I would like the resulting data frame to have.)
The following call didn't give me exactly these combinations:
expand.grid(my.df)
What's the best way to generate all possible combinations?
Maybe we can use expand.grid in the following way:
expand.grid(x = my.df$x, y = my.df$y)
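One possible reason the result does not look like the expected output is row order: expand.grid() varies its first argument fastest, whereas the desired output holds x fixed while y changes. A minimal sketch, assuming that ordering is what matters, passes y first and then restores the column order (the name res is just for illustration):

```r
# Data from the question
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

# expand.grid() varies its first argument fastest, so pass y first ...
res <- expand.grid(y = my.df$y, x = my.df$x)
# ... then put the columns back in x, y order
res <- res[, c("x", "y")]

head(res, 7)  # first six rows share the first x value, row 7 moves to the next x
nrow(res)     # 36
```

The set of rows is the same as with expand.grid(x = my.df$x, y = my.df$y); only the order differs.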
We can just use expand.grid on the data frame itself:
res <- expand.grid(my.df)
dim(res)
#[1] 36 2
Or with data.table (note that CJ() returns the combinations sorted by default):
library(data.table)
setDT(my.df)[,CJ(x,y)]
A cross join is helpful in this situation. Since you didn't provide a reproducible example, I have created my own dataset.
df=data.frame(x=runif(5), y=runif(5))
xx=data.frame(df$x)
yy=data.frame(df$y)
library(sqldf)
sqldf("SELECT * FROM xx CROSS JOIN yy")
expand.grid() will give you all possible combinations, including duplicates when a column contains repeated values. If you need only the unique combinations, you can use a function like this:
unique_comb <- function(data){
  # Distinct values in each column
  x.cur <- unique(data$x)
  y.cur <- unique(data$y)
  n.x <- length(x.cur)
  n.y <- length(y.cur)
  # One row per (x, y) pair
  matrix.com <- matrix(0,ncol=2,nrow=n.x*n.y)
  ind <- 1
  # seq_len() is safe even when a column is empty
  for(i in seq_len(n.x)){
    for(j in seq_len(n.y)){
      matrix.com[ind,] <- c(x.cur[i],y.cur[j])
      ind <- ind+1
    }
  }
  return(matrix.com)
}
Or, as JTT points out, this can be done in one line with:
expand.grid(unique(data$x),unique(data$y))
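To make the difference between "all" and "unique" combinations concrete, here is a small sketch using the data from the question. The names all.comb and uniq.comb are illustrative only. Note that the loop version above returns a matrix with y varying fastest, while expand.grid() returns a data frame with x varying fastest, so the two contain the same pairs in a different order.

```r
# Data from the question; y contains 1.0445919 twice
my.df <- data.frame(
  x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.8377520, 0.5597406),
  y = c(0.8439140, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)
)

# Every pairing of the 6 x values with the 6 y values
all.comb <- expand.grid(x = my.df$x, y = my.df$y)

# Pairings of distinct values only (6 distinct x, 5 distinct y)
uniq.comb <- expand.grid(x = unique(my.df$x), y = unique(my.df$y))

nrow(all.comb)          # 36
nrow(uniq.comb)         # 30
nrow(unique(all.comb))  # 30, dropping duplicate rows afterwards also works
```

Naming the arguments (x = ..., y = ...) keeps the original column names; without names, expand.grid() labels the columns Var1 and Var2.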
You could use the merge function this way. With by=NULL there are no key columns, so merge returns the Cartesian product of its two inputs:
dat <- cars[1:6,1:2]
dat
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
merge(dat$speed,dat$dist,by=NULL)
x y
1 4 2
2 4 2
3 7 2
4 7 2
5 8 2
6 9 2
7 4 10
8 4 10
9 7 10
10 7 10
11 8 10
12 9 10
13 4 4
14 4 4
15 7 4
16 7 4
17 8 4
18 9 4
19 4 22
20 4 22
21 7 22
22 7 22
23 8 22
24 9 22
25 4 16
26 4 16
27 7 16
28 7 16
29 8 16
30 9 16
31 4 10
32 4 10
33 7 10
34 7 10
35 8 10
36 9 10
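As the output shows, passing bare vectors makes merge name the columns x and y, and repeated values (speed 4 and 7, dist 10) produce repeated rows. A hedged variant, assuming you would rather keep the original column names, passes one-column data frames instead and optionally drops the duplicates (res is an illustrative name):

```r
# First six rows of the built-in cars dataset, as above
dat <- cars[1:6, 1:2]

# One-column data frames keep their column names in the cross join
res <- merge(dat["speed"], dat["dist"], by = NULL)

names(res)         # "speed" "dist"
nrow(res)          # 36
nrow(unique(res))  # 20 (4 distinct speeds x 5 distinct distances)
```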
I know everyone's throwing expand.grid() at you, so here's another option...
my.df <- structure(list(x = c(0.4597406, 0.4579697, 0.5593259, 0.3607346, 0.837752, 0.5597406),
y = c(0.843914, 0.7461805, 0.6646701, 0.7792931, 1.0445919, 1.0445919)),
.Names = c("x", "y"), row.names = c(NA, -6L), class = "data.frame")
my.df
#> x y
#> 1 0.4597406 0.8439140
#> 2 0.4579697 0.7461805
#> 3 0.5593259 0.6646701
#> 4 0.3607346 0.7792931
#> 5 0.8377520 1.0445919
#> 6 0.5597406 1.0445919
tidyr has a complete() function which "completes" your data combinations, which I believe is what you're after.
tidyr::complete(my.df, x, y)
#> # A tibble: 30 x 2
#> x y
#> <dbl> <dbl>
#> 1 0.3607346 0.6646701
#> 2 0.3607346 0.7461805
#> 3 0.3607346 0.7792931
#> 4 0.3607346 0.8439140
#> 5 0.3607346 1.0445919
#> 6 0.4579697 0.6646701
#> 7 0.4579697 0.7461805
#> 8 0.4579697 0.7792931
#> 9 0.4579697 0.8439140
#> 10 0.4579697 1.0445919
#> # ... with 20 more rows
Note: this produces only the unique combinations (6 distinct x values times 5 distinct y values, hence 30 rows). Rows 5 and 6 of your expected output are identical.