I have a small question that seems easy in concept, but I cannot find a way to do it.
Say I have a data.frame df2 with one column listing car brands and another column holding all the models of each brand, separated by ','. I obtained df2 by aggregating another data.frame, df1, whose primary key is the model.
How should I do the opposite task (i.e., go from df2 back to df1)? My guess is something like melt(df2, id=unlist(strsplit('models',',')))...
Many thanks!
Here is a MWE:
df1 <- data.frame(model=c('a1','a2','a3','b1','b2','c1','d1','d2','d3','d4'),
                  brand=c('a','a','a','b','b','c','d','d','d','d'))
df1
collap <- function(x){
  out <- paste(sort(unique(x)), collapse=",")
  return(out)
}
df2 <- aggregate(df1$model, by=list(df1$brand), collap)
names(df2) <- c('brand','models')
df2 #how can I do the opposite task (ie: from df2 to df1)?
These days I would use tidytext::unnest_tokens for this task:

Playing around, I have found a way to do the trick, even though it may be quite dirty:
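A minimal base-R sketch of one such manual approach (not necessarily the exact code used here; df2 is rebuilt by hand and the name df1_again is just illustrative, so the snippet runs on its own):

```r
# df2 rebuilt by hand so the sketch is self-contained
df2 <- data.frame(brand  = c("a", "b", "c", "d"),
                  models = c("a1,a2,a3", "b1,b2", "c1", "d1,d2,d3,d4"),
                  stringsAsFactors = FALSE)

# split each comma-separated string into its individual models
parts <- strsplit(df2$models, ",", fixed = TRUE)

# repeat each brand once per model and rebuild the columns one by one
df1_again <- data.frame(model = unlist(parts),
                        brand = rep(df2$brand, lengths(parts)),
                        stringsAsFactors = FALSE)
df1_again
```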
It is not the best solution, since the data.frames actually have many more columns, and I would not want to go one by one... If someone knows a prettier way to solve this, I would appreciate it!
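For the tidytext::unnest_tokens suggestion above, a sketch could look like the following. It assumes the tidytext package is installed, rebuilds df2 by hand, and uses the illustrative name df1_again; token = "regex" with pattern = "," splits on commas, and to_lower = FALSE keeps the model names unchanged.

```r
library(tidytext)

# df2 rebuilt by hand so the sketch is self-contained
df2 <- data.frame(brand  = c("a", "b", "c", "d"),
                  models = c("a1,a2,a3", "b1,b2", "c1", "d1,d2,d3,d4"),
                  stringsAsFactors = FALSE)

# one output row per comma-separated token; all other columns are carried along
df1_again <- unnest_tokens(df2, output = model, input = models,
                           token = "regex", pattern = ",",
                           to_lower = FALSE)
df1_again
```

Because every column other than the input is repeated automatically, this should also cope with data.frames that have many more columns.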
Here is how I would do it using the plyr package:

As a point of comparison, this is how to use the same package to go from df1 to df2:
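A sketch of both directions with plyr, assuming the package is installed; df2 is rebuilt by hand, and df1_again and df2_again are illustrative names rather than anything from the original post:

```r
library(plyr)

# df2 rebuilt by hand so the sketch is self-contained
df2 <- data.frame(brand  = c("a", "b", "c", "d"),
                  models = c("a1,a2,a3", "b1,b2", "c1", "d1,d2,d3,d4"),
                  stringsAsFactors = FALSE)

# df2 -> df1: for each brand, split the models string into one row per model
df1_again <- ddply(df2, .(brand), function(d) {
  data.frame(model = unlist(strsplit(d$models, ",", fixed = TRUE)),
             stringsAsFactors = FALSE)
})

# df1 -> df2: for each brand, paste the sorted unique models back together
df2_again <- ddply(df1_again, .(brand), summarise,
                   models = paste(sort(unique(model)), collapse = ","))
```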
Here are two alternatives:

Use data.table and unlist as follows:

Use concat.split.multiple from my "splitstackshape" package. One nice thing with this approach is being able to split multiple columns with one simple command.
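For the first alternative (data.table plus unlist), a minimal sketch might look like this, assuming the data.table package is installed; df2 is rebuilt by hand, and dt and df1_again are illustrative names:

```r
library(data.table)

# df2 rebuilt by hand so the sketch is self-contained
df2 <- data.frame(brand  = c("a", "b", "c", "d"),
                  models = c("a1,a2,a3", "b1,b2", "c1", "d1,d2,d3,d4"),
                  stringsAsFactors = FALSE)

dt <- as.data.table(df2)

# within each brand, split the models string and unlist it into one row per model
df1_again <- dt[, list(model = unlist(strsplit(models, ",", fixed = TRUE))),
                by = brand]
df1_again
```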