I have this nested data frame:
test <- structure(list(penr = c(13, 27), seq = structure(list(
    `1` = c("1997", "1997", "1997", "2007"),
    `2` = c("2007", "2007", "2007", "2007", "2007", "2007", "2007")),
    .Names = c("1", "2"))), .Names = c("penr", "seq"),
    row.names = c("1", "2"), class = "data.frame")
I want a single vector of all the values in the second column, namely:
result <- c("1997", "1997", "1997", "2007", "2007", "2007", "2007", "2007", "2007", "2007", "2007")
Is there an easy way to achieve this?
This line does the trick:
do.call("c", test[["seq"]])
or, equivalently:
c(test[["seq"]], recursive = TRUE)
or even:
unlist(test[["seq"]])
The output of each of these calls is:
    11     12     13     14     21     22     23     24     25     26     27
"1997" "1997" "1997" "2007" "2007" "2007" "2007" "2007" "2007" "2007" "2007"
To drop the names shown above the values, call as.character
on the resulting object:
> as.character(unlist(test[["seq"]]))
[1] "1997" "1997" "1997" "2007" "2007" "2007" "2007" "2007" "2007" "2007"
[11] "2007"
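As a small supplement to the as.character approach above, here is a sketch of two other base R ways to drop the names. The variable names flat1 and flat2 are only for illustration, and the data frame is rebuilt from the question so the snippet runs on its own. Both unlist(use.names = FALSE) and unname() are standard base R; which one reads better is a matter of taste.

```r
# Rebuild the nested data frame from the question
test <- structure(list(penr = c(13, 27), seq = structure(list(
    `1` = c("1997", "1997", "1997", "2007"),
    `2` = c("2007", "2007", "2007", "2007", "2007", "2007", "2007")),
    .Names = c("1", "2"))), .Names = c("penr", "seq"),
    row.names = c("1", "2"), class = "data.frame")

# Option 1: tell unlist not to build names in the first place
flat1 <- unlist(test[["seq"]], use.names = FALSE)

# Option 2: flatten first, then strip the names afterwards
flat2 <- unname(unlist(test[["seq"]]))

print(flat1)
```

Unlike as.character, neither option converts the type, so they should also work unchanged if the nested vectors were numeric rather than character.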
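This is not an answer but a follow-up/supplement to Paul's answer:
The c method is consistently at or near the top. At 1,000 iterations it is effectively tied with do.call (0.04 s vs. 0.03 s elapsed), and at 100,000 iterations it is the fastest. unlist is the slowest at 1,000 iterations (about 7.7 times the fastest elapsed time), but at 100,000 iterations it comes within about 19% of the c method.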
1,000 iterations
     test replications elapsed relative user.self sys.self user.child sys.child
2       c         1000    0.04 1.333333      0.03        0         NA        NA
1 do.call         1000    0.03 1.000000      0.03        0         NA        NA
3  unlist         1000    0.23 7.666667      0.04        0         NA        NA
100,000 iterations
     test replications elapsed relative user.self sys.self user.child sys.child
2       c       100000    8.39 1.000000      3.62        0         NA        NA
1 do.call       100000   10.47 1.247914      4.04        0         NA        NA
3  unlist       100000    9.97 1.188319      3.81        0         NA        NA
Again, thanks for sharing, Paul!
Benchmarking was performed using rbenchmark on a Windows 7 machine running R 2.14.1.
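For anyone who wants to try the comparison themselves, here is a rough sketch of how such a benchmark could be set up. The exact call used for the tables above is not shown, so this is an assumption: it uses rbenchmark::benchmark with its default columns and 1,000 replications, and the names via_do_call, via_c and via_unlist are only for illustration. Timings will differ by machine and R version.

```r
# Rebuild the nested data frame from the question
test <- structure(list(penr = c(13, 27), seq = structure(list(
    `1` = c("1997", "1997", "1997", "2007"),
    `2` = c("2007", "2007", "2007", "2007", "2007", "2007", "2007")),
    .Names = c("1", "2"))), .Names = c("penr", "seq"),
    row.names = c("1", "2"), class = "data.frame")

# Sanity check first: the three approaches should flatten to the same values
via_do_call <- do.call("c", test[["seq"]])
via_c       <- c(test[["seq"]], recursive = TRUE)
via_unlist  <- unlist(test[["seq"]])

# Run the timing only if rbenchmark is installed (it is not part of base R)
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  res <- rbenchmark::benchmark(
    do.call = do.call("c", test[["seq"]]),
    c       = c(test[["seq"]], recursive = TRUE),
    unlist  = unlist(test[["seq"]]),
    replications = 1000  # raise to 100000 for the second comparison
  )
  print(res)
}
```

With an input this small, each call takes only microseconds, so the short runs are dominated by timer resolution and noise; the longer run is probably the more trustworthy of the two.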