In R, how to easily combine many vectors into a da

Posted 2019-04-16 02:45

Question:

So, I know that I can use the data.frame command to combine multiple vectors into one dataframe, like so:

my.data <- data.frame(name1, age1, name20, age20, name38, age38)

(I know, the variable names don't make much sense, why would I want name1, name20 and name38 in three different columns? Please ignore that: my actual variable names are different -- this is only for illustration purposes).

The problem is that I have about 40 such vectors, and I have to combine many vectors in other parts of my code as well, so I would rather not copy-paste a huge chunk of code every time. I thought of writing a for loop around this:

for (i in c("name", "age", "hgt"))
{
    for (k in c(1,20,38))
    {
    my.data$i,as.character(k) <- data.frame(get(paste(i,as.character(k),sep="")))
    }
}

But this doesn't work. Is this because I should write "paste()" around some of this code, or is this simply a bad way to approach this problem? What is the correct way to loop through i and k and get a "newdata" dataframe as the final result with all the vectors as columns attached?

Answer 1:

Are you trying to achieve something like the following, perhaps?

name1 <- letters[1:10]
age1 <- 1:10
name20 <- letters[11:20]
age20 <- 11:20
name38 <- LETTERS[1:10]
age38 <- 21:30

paste.pattern <- paste(rep(c("name", "age"), times = 3), 
                       rep(c(1, 20, 38), each = 2), sep = "")

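# mget() returns a named list, so each column keeps its own type
# (sapply(paste.pattern, get) would coerce everything to a character matrix)
newdata <- data.frame(mget(paste.pattern))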
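
For comparison, the nested loop from the question can also be made to work. The sketch below is one possible way, not necessarily what the asker intended: it builds each variable name with paste0(), looks it up with get(), and stores it under that name with [[ ]], since $ cannot take a computed name. The list name cols and the variable var.name are made up for illustration, and only the "name" and "age" stems are used because the example data has no "hgt" vectors.

```r
# Example vectors, same as in the answer above
name1 <- letters[1:10];   age1 <- 1:10
name20 <- letters[11:20]; age20 <- 11:20
name38 <- LETTERS[1:10];  age38 <- 21:30

# Collect the columns in a named list and convert once at the end
cols <- list()
for (k in c(1, 20, 38)) {
  for (i in c("name", "age")) {
    var.name <- paste0(i, k)           # e.g. "name1"
    cols[[var.name]] <- get(var.name)  # [[ ]] accepts a computed name; $ does not
  }
}

# Each list element becomes one column and keeps its own type
my.data <- data.frame(cols, stringsAsFactors = FALSE)
str(my.data)
```

Putting the numbers in the outer loop gives the same column order as the data.frame() call in the question (name1, age1, name20, age20, ...).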


Answer 2:

If all your individual vectors follow the same stem (e.g. hgt) plus number (e.g. 1) naming arrangement, then you could do something like this:

# test data
name1 <- letters[1:10]
age1 <- 1:10
name20 <- letters[11:20]
age20 <- 11:20
name38 <- LETTERS[1:10]
age38 <- 21:30

# group them up in a dataframe, looking for "age" OR (|) "name" as the stem;
# mget() keeps each column's type, whereas sapply(..., get) would coerce all columns to character
data.frame(mget(ls(pattern="^age[0-9]+$|^name[0-9]+$")))

# result:
   age1 age20 age38 name1 name20 name38
1     1    11    21     a      k      A
2     2    12    22     b      l      B
3     3    13    23     c      m      C
4     4    14    24     d      n      D
5     5    15    25     e      o      E
6     6    16    26     f      p      F
7     7    17    27     g      q      G
8     8    18    28     h      r      H
9     9    19    29     i      s      I
10   10    20    30     j      t      J

Anchoring the pattern with ^ and $ limits the included vectors to exactly the stem/number naming pattern, so you don't get any surprise additions to your dataframe.
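
Since the question mentions needing this in several places, the pattern-matching idea could be wrapped in a small function. This is only a sketch: collect_vectors is a hypothetical helper name, and it assumes the stems are plain strings with no regex metacharacters and that the vectors live in the environment the function is called from.

```r
# Hypothetical helper: gather every vector named <stem><digits> into a data frame
collect_vectors <- function(stems, envir = parent.frame()) {
  # Build an anchored pattern such as "^(age|name)[0-9]+$"
  pattern <- paste0("^(", paste(stems, collapse = "|"), ")[0-9]+$")
  var.names <- ls(envir = envir, pattern = pattern)
  if (length(var.names) == 0) stop("no variables match ", pattern)

  vecs <- mget(var.names, envir = envir)

  # data.frame() recycles a shorter vector when its length divides the
  # longest one, so check explicitly that all lengths agree
  if (length(unique(lengths(vecs))) != 1) stop("vectors differ in length")

  data.frame(vecs, stringsAsFactors = FALSE)
}

# Usage with small example vectors
age1 <- 1:3
age20 <- 4:6
name1 <- c("a", "b", "c")
df <- collect_vectors(c("age", "name"))
str(df)
```

Because ls() returns names in sorted order, the columns come out alphabetically (here age1, age20, name1) rather than grouped by number.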