Dynamic data frame creation in R with custom names

Posted 2019-09-08 17:17

Question:

I'd like to create data frames dynamically and assign custom names to them.

I have a master data set like this:

ID    grp    val1    val2
1      a      32       9
1      b      21       31
1      c      43       76
2      a      23       67
2      b      5        45
2      c      65       76
3      a      43       34
3      b      43       7
3      c      12       87
4      a      43       35
4      b      65       87
4      c      21       55
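
For anyone who wants to run the code in this thread, here is a minimal sketch that rebuilds the table above as a data frame. The column types (integer ID, character grp, numeric values) are an assumption, since the post only shows the printed table.

```r
# Rebuild the example table from the question.
# Column types are assumed; the post only shows printed values.
masterData <- data.frame(
  ID   = rep(1:4, each = 3),
  grp  = rep(c("a", "b", "c"), times = 4),
  val1 = c(32, 21, 43, 23, 5, 65, 43, 43, 12, 43, 65, 21),
  val2 = c(9, 31, 76, 67, 45, 76, 34, 7, 87, 35, 87, 55),
  stringsAsFactors = FALSE
)

# Inspect the structure: 12 rows, 4 columns
str(masterData)
```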

I'd like to create data frames like

data1:
ID    grp    val1    val2
1      a      32       9
1      b      21      31
1      c      43      76

data2:
ID    grp    val1    val2
2      a      23       67
2      b      5        45
2      c      65       76

and so on...

I have tried some things like:

myID<-1:4
df <- paste('data',myID, sep ='')
ll <- sapply(df, function(x)
{
  data.frame ()
  df<-masterData[which(masterData$ID==myID),]
})

Another try without desired results:

sapply(myID,function(x) df<-as.data.frame(masterData[which(masterData$ID==myID,]))

I guess subset will not do it for multiple values:

myframes<-list(subset(masterData,masterData$ID==myID))

Answer 1:

I would just use split and keep them all in a list:

split(masterData, masterData$ID)
# $`1`
#   ID grp val1 val2
# 1  1   a   32    9
# 2  1   b   21   31
# 3  1   c   43   76
# 
# $`2`
#   ID grp val1 val2
# 4  2   a   23   67
# 5  2   b    5   45
# 6  2   c   65   76
# 
# $`3`
#   ID grp val1 val2
# 7  3   a   43   34
# 8  3   b   43    7
# 9  3   c   12   87
# 
# $`4`
#    ID grp val1 val2
# 10  4   a   43   35
# 11  4   b   65   87
# 12  4   c   21   55
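
Keeping the pieces in a list makes it easy to look one up by ID or to run the same operation on each. The following is only a sketch: the names `pieces` and `sums` are made up for illustration, and the data is rebuilt from the table in the question.

```r
# Example data rebuilt from the question (column types assumed)
masterData <- data.frame(
  ID   = rep(1:4, each = 3),
  grp  = rep(c("a", "b", "c"), times = 4),
  val1 = c(32, 21, 43, 23, 5, 65, 43, 43, 12, 43, 65, 21),
  val2 = c(9, 31, 76, 67, 45, 76, 34, 7, 87, 35, 87, 55),
  stringsAsFactors = FALSE
)

# One data frame per ID; list names are the ID values as strings
pieces <- split(masterData, masterData$ID)
names(pieces)

# Pull out a single piece by its name
pieces[["2"]]

# Apply the same computation to every piece, e.g. the sum of val1 per ID
sums <- sapply(pieces, function(d) sum(d$val1))
sums
```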

If you really want to litter your workspace with lots of data.frames, instead of keeping everything in a tidy list, you can use list2env:

X <- split(masterData, masterData$ID)
names(X) <- paste0("data", names(X))
list2env(X, envir=.GlobalEnv)
# <environment: R_GlobalEnv>

ls(pattern = "^data[0-9]$")             ## What did that create?
# [1] "data1" "data2" "data3" "data4"
data1
#   ID grp val1 val2
# 1  1   a   32    9
# 2  1   b   21   31
# 3  1   c   43   76
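
A possible middle ground, if you want named objects but not in the global workspace, is to pass list2env a separate environment. This is just a sketch of that variation; `env` and `back` are illustrative names, and the data is rebuilt from the question.

```r
# Example data rebuilt from the question (column types assumed)
masterData <- data.frame(
  ID   = rep(1:4, each = 3),
  grp  = rep(c("a", "b", "c"), times = 4),
  val1 = c(32, 21, 43, 23, 5, 65, 43, 43, 12, 43, 65, 21),
  val2 = c(9, 31, 76, 67, 45, 76, 34, 7, 87, 35, 87, 55),
  stringsAsFactors = FALSE
)

X <- split(masterData, masterData$ID)
names(X) <- paste0("data", names(X))

# Copy the list elements into a dedicated environment instead of .GlobalEnv
env <- new.env()
invisible(list2env(X, envir = env))

ls(env)                       # names of the objects created in env
get("data3", envir = env)     # fetch a single data frame by name

# Collect several of them back into a named list
back <- mget(paste0("data", 1:4), envir = env)
```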


Answer 2:

@Ananda Mahto's solution is compact and clean. You can also modify your code to get the result:

 setNames(lapply(seq_along(df), function(i) masterData[masterData$ID==myID[i],]),df)

and then use list2env to copy the list elements into the workspace.
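
As far as I can tell, the attempts in the question fail because the function argument x is never used and masterData$ID==myID compares against the whole myID vector, which R recycles along ID. Here is a runnable sketch of the corrected version; `frames` is an illustrative name and the data is rebuilt from the question.

```r
# Example data rebuilt from the question (column types assumed)
masterData <- data.frame(
  ID   = rep(1:4, each = 3),
  grp  = rep(c("a", "b", "c"), times = 4),
  val1 = c(32, 21, 43, 23, 5, 65, 43, 43, 12, 43, 65, 21),
  val2 = c(9, 31, 76, 67, 45, 76, 34, 7, 87, 35, 87, 55),
  stringsAsFactors = FALSE
)

myID <- 1:4
df <- paste0("data", myID)

# Comparing ID with the full vector recycles myID (1,2,3,4,1,2,...),
# so only rows that happen to line up are selected
which(masterData$ID == myID)

# Compare against one ID at a time and name the results
frames <- setNames(
  lapply(myID, function(id) masterData[masterData$ID == id, ]),
  df
)

names(frames)
frames$data3
```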



Answer 3:

Another option is plyr's dlply function:

library(plyr)
dlply(masterData, .(ID))

$`1`
ID grp val1 val2
1  1   a   32    9
2  1   b   21   31
3  1   c   43   76

$`2`
ID grp val1 val2
1  2   a   23   67
2  2   b    5   45
3  2   c   65   76

$`3`
ID grp val1 val2
1  3   a   43   34
2  3   b   43    7
3  3   c   12   87

$`4`
ID grp val1 val2
1  4   a   43   35
2  4   b   65   87
3  4   c   21   55