Apply a function to a List of dataframes in R

Posted 2019-07-05 04:58

Question:

I need help working with lists iteratively.

I have the following list, which is composed of several data frames with the same columns but possibly different numbers of rows. Example:

[[1]]
  id InpatientDays ERVisits OfficeVisits Narcotics
1  a             0        0           18         1
2  b             1        1            6         1
3  c             0        0            5         3
4  d             0        1           19         0
5  e             8        2           19         3
6  f             2        0            9         2

[[2]]
    id InpatientDays ERVisits OfficeVisits Narcotics
7   a            16        1            8         1
8   b             2        0            8         0
9   c             2        1            4         3
10  d             4        2            0         2
11  e             6        5           20         2
12  a             0        0            7         4

I would like to apply a function that returns all possible pairwise combinations of the unique id values for each data frame in the list.

I tried something like lapply(list1, function(x) combn(unique(list1[x]$id))), which of course does not work. I expect to get something like:

[[1]]  
    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] "a"  "a"  "a"  "a"  "a"  "b"  "b"  "b"  "b"  "c"   "c"   "c"   "d"   "d"   "e"  
[2,] "b"  "c"  "d"  "e"  "f"  "c"  "d"  "e"  "f"  "d"   "e"   "f"   "e"   "f"   "f"  

[[2]] 
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "a"  "a"  "a"  "a"  "b"  "b"  "b"  "c"  "c"  "d"  
[2,] "b"  "c"  "d"  "e"  "c"  "d"  "e"  "d"  "e"  "e" 

Is this possible? I know this works for a single data frame df:

  combn(unique(df$id),2) 

Answer 1:

We need to use unique(x$id) inside the anonymous function:

 lapply(list1, function(x) combn(unique(x$id),2))
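As a quick check, here is a minimal reproducible sketch. It assumes list1 is rebuilt from the id column of the two data frames shown in the question (the InpatientDays column is included only for illustration, and the name res is hypothetical):

```r
# Rebuild the two example data frames; only the id column matters for combn()
list1 <- list(
  data.frame(id = c("a", "b", "c", "d", "e", "f"),
             InpatientDays = c(0, 1, 0, 0, 8, 2),
             stringsAsFactors = FALSE),
  data.frame(id = c("a", "b", "c", "d", "e", "a"),
             InpatientDays = c(16, 2, 2, 4, 6, 0),
             stringsAsFactors = FALSE)
)

# x is each data frame in turn, so x$id is that data frame's id column
res <- lapply(list1, function(x) combn(unique(x$id), 2))

dim(res[[1]])  # 6 unique ids -> 2 rows, 15 columns
dim(res[[2]])  # 5 unique ids (the repeated "a" is dropped) -> 2 rows, 10 columns
```

Each element of res is a character matrix with one pair per column, matching the layout of the expected output in the question.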

The OP's code loops over 'list1' with lapply. The anonymous function (function(x)) receives each element of the list in turn, i.e. 'x' is the data.frame itself, not an index. So we just need to call x$id (or x[['id']]) to extract the 'id' column. The original attempt also omitted the second argument of combn, the number of elements per combination (2 here). If we do need to subset by index, we have to loop over the sequence of 'list1' instead (or, if the list elements are named, over its names):

lapply(seq_along(list1), function(i) combn(unique(list1[[i]]$id), 2))
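For the named case mentioned above, a sketch might look like the following. The list named_list and its element names first and second are hypothetical and are not part of the question's data:

```r
# Hypothetical named list of data frames (names chosen for illustration only)
named_list <- list(
  first  = data.frame(id = c("a", "b", "c"), stringsAsFactors = FALSE),
  second = data.frame(id = c("a", "b", "a"), stringsAsFactors = FALSE)
)

# Loop over the names and subset the list by name inside the function;
# setNames() carries the names over to the result
by_name <- setNames(
  lapply(names(named_list), function(nm) combn(unique(named_list[[nm]]$id), 2)),
  names(named_list)
)

by_name$first   # 2 x 3 matrix: pairs from "a", "b", "c"
by_name$second  # 2 x 1 matrix: the single pair "a", "b"
```

Note that lapply(named_list, function(x) ...) already keeps the names, so looping over the names is mainly useful when the function needs the name itself.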