I have 4 dataframes, each an element of a list. I would like to combine them all together into one dataframe. In set language from mathematics, it would make most sense for this to be the union on the rownames. So I might have something like this:
U <- union(dfSub[[1]], dfSub[[2]], dfSub[[3]], dfSub[[4]])
The problem with the union function is that it operates only on vectors. How can I get this to work on dataframes?
- How can I translate this into R?
- Is there a better way of achieving the desired result?
EDIT: How can I preserve rownames after the union?
First, bind them together:
df.cat <- rbind(dfSub[[1]], dfSub[[2]], dfSub[[3]], dfSub[[4]])
or better:
df.cat <- do.call(rbind, dfSub[1:4])
This first step requires that all data.frames have the same column names. If that is not the case, you might be interested in the rbind.fill function from the plyr package:
library(plyr)
df.cat <- rbind.fill(dfSub[1:4])
Then, if needed, remove duplicate rows (as a set union would):
df.union <- unique(df.cat)
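On the EDIT about rownames: as far as I know, rbind keeps rownames but renames any that clash, and unique keeps the rowname of the first copy of each duplicated row (rbind.fill, as far as I can tell, does not carry rownames over). A minimal sketch with made-up data; the toy dfSub below and its rownames r1 to r3 are assumptions for illustration, not your actual data:

```r
# Hypothetical toy data: two data frames that share the row "r2"
dfSub <- list(
  data.frame(x = c(1, 2), y = c("a", "b"),
             row.names = c("r1", "r2"), stringsAsFactors = FALSE),
  data.frame(x = c(2, 3), y = c("b", "c"),
             row.names = c("r2", "r3"), stringsAsFactors = FALSE)
)

# Stack the rows; rbind renames the second "r2" so rownames stay unique
df.cat <- do.call(rbind, dfSub)

# Drop duplicated rows; the first occurrence (and its rowname) is kept
df.union <- unique(df.cat)

rownames(df.union)
```

Note that unique compares column values only, so two rows with the same rowname but different values would both survive, one of them under a renamed rowname.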
You can combine dataframes with the merge function, which joins them on the columns they have in common. Since you have multiple dataframes, you can use Reduce to merge them all in one call.
merged.data <- Reduce(function(...) merge(...), list(dfSub[[1]], dfSub[[2]], dfSub[[3]], dfSub[[4]]))
As an example:
> people <- c('Bob', 'Jane', 'Pat')
> height <- c(72, 64, 68)
> weight <- c(220, 130, 150)
> age <- c(45, 32, 35)
> height.data <- data.frame(people, height)
> weight.data <- data.frame(people, weight)
> age.data <- data.frame(people, age)
> height.data
  people height
1    Bob     72
2   Jane     64
3    Pat     68
> weight.data
  people weight
1    Bob    220
2   Jane    130
3    Pat    150
> age.data
  people age
1    Bob  45
2   Jane  32
3    Pat  35
> Reduce(function(...) merge(...), list(height.data, weight.data, age.data))
  people height weight age
1    Bob     72    220  45
2   Jane     64    130  32
3    Pat     68    150  35
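Two caveats if you want a true union keyed on rownames: by default merge keeps only the keys that appear in every dataframe, and it joins on columns rather than rownames. Here is a sketch of one way around both, using all = TRUE and by = "row.names". The helper name merge_by_rownames and the toy data are made up for illustration:

```r
# Hypothetical helper: full outer join of two data frames on their rownames
merge_by_rownames <- function(a, b) {
  m <- merge(a, b, by = "row.names", all = TRUE)
  # merge puts the keys in a "Row.names" column; move them back to rownames
  rownames(m) <- as.character(m$Row.names)
  m$Row.names <- NULL
  m
}

# Toy data keyed by rowname, with only partial overlap between data frames
height.data <- data.frame(height = c(72, 64), row.names = c("Bob", "Jane"))
weight.data <- data.frame(weight = c(130, 150), row.names = c("Jane", "Pat"))
age.data    <- data.frame(age = c(45, 35), row.names = c("Bob", "Pat"))

# Fold the pairwise join over the whole list
res <- Reduce(merge_by_rownames, list(height.data, weight.data, age.data))
res
```

Rows missing from one of the inputs are filled with NA in that input's columns, which is what you would expect from a union on the rownames.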