I have a data that looks like the following dataframe, but every combo has about ten fields, starting with name1, adress1, city1, etc
id name1 adress1 name2 adress2 name3 adress3
1 1 John street a Burt street d chris street 1
2 2 Jack street b Ben street e connor street 2
3 3 Joey <NA> Bob street f <NA> <NA>
Now I would like to rearrange this data so it is a bit more useful and it should look like so, but with the information from which entry it came from:
id origin names adresses
1 1 1 John street a
2 2 1 Jack street b
3 3 1 Joey <NA>
4 1 2 Burt street d
5 2 2 Ben street e
6 3 2 Bob street f
7 1 3 chris street 1
8 2 3 connor street 2
Using tidyr I can get a long format, but then I have a key column that contains all the variable names, name1, name2, name3, street1, etc.
I also tried using separate dataframes, one for each combination, e.g. one dataframe for the names, one for the streets, etc. But then joining everything back together results in the wrong records, because you can only join on id and in a long format this ID is replicated. I have also been looking into Reshape2, but that results in the same issue.
All the conversions of wide to long I have seen are when you have one column you want to convert to. I'm looking for the end result in 10 columns, or as in the example 2 columns.
Is there a function that I'm overlooking?
#code to generete the dataframes:
df <- data.frame(id = c(1,2,3),
name1 = c("John", "Jack", "Joey"),
adress1 = c("street a", "street b", NA),
name2 = c("Burt", "Ben", "Bob"),
adress2 = c("street d", "street e", "street f"),
name3 = c("chris", "connor", NA),
adress3 = c("street 1", "street 2", NA),
stringsAsFactors = FALSE)
expecteddf <- data.frame(id = c(1,2,3,1,2,3,1,2),
origin = c(rep(1, 3), rep(2, 3), rep(3, 2)),
names = c("John", "Jack", "Joey", "Burt", "Ben", "Bob", "chris", "connor"),
adresses = c("street a", "street b", NA, "street d", "street e", "street f", "street 1", "street 2"),
stringsAsFactors = FALSE
)