I am having trouble converting my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246
Now I would like to transform this data.frame into a long data.frame.
Something like this:
Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246
I have already looked at and tried the melt() and reshape() functions, as some people suggested in similar questions.
However, so far I only get messy results.
If possible, I would like to do it with the reshape() function, since it looks a little nicer to handle.
Three alternative solutions:
1: With reshape2:
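A minimal sketch of the reshape2 approach. The data frame name `wide` and its construction are assumptions: it is rebuilt by hand from the table in the question, with the values kept as character strings because of the commas.

```r
library(reshape2)

# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,       # keep "1950" instead of "X1950"
  stringsAsFactors = FALSE
)

# Code and Country identify each row; every other column is stacked.
long <- melt(wide, id.vars = c("Code", "Country"))
print(long)
```

By default the stacked column names end up in a column called `variable` and the cell contents in `value`.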
Some alternative notations that give the same result:
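The following sketch shows a few equivalent ways to call `melt()` from reshape2, selecting columns by position or by naming the measure columns instead of the id columns. As before, `wide` is an assumed hand-built copy of the question's table.

```r
library(reshape2)

# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# id columns by name
long_a <- melt(wide, id.vars = c("Code", "Country"))
# id columns by position
long_b <- melt(wide, id.vars = 1:2)
# measure columns by position; the remaining columns become id columns
long_c <- melt(wide, measure.vars = 3:7)
# measure columns by name
long_d <- melt(wide, measure.vars = as.character(1950:1954))
```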
2: With data.table

You can use the same melt function as in the reshape2 package (the data.table version is an extended and improved implementation). melt from data.table also has more parameters than the melt function from reshape2. You can, for example, also specify the name of the variable column:

Some alternative notations:
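A sketch of the data.table variant described above, naming the output columns with `variable.name` and `value.name`. The object names `wide`, `wide_dt` and `long`, and the final sort, are illustrative assumptions.

```r
library(data.table)

# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Work on a data.table copy so the original data.frame is left untouched.
wide_dt <- as.data.table(wide)

long <- melt(
  wide_dt,
  id.vars       = c("Code", "Country"),
  variable.name = "Year",   # name for the column holding the old column names
  value.name    = "Value"   # name for the column holding the cell contents
)

# Sort by country and year to match the layout asked for in the question.
setorder(long, Code, Year)
print(long)
```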
3: With tidyr

Some alternative notations:
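A sketch of the tidyr approach with `gather()`, shown with two ways of selecting the year columns. `wide` is again an assumed hand-built copy of the question's table, and the column names `Year` and `Value` are chosen to match the desired output.

```r
library(tidyr)

# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Gather everything except the two id columns.
long_a <- gather(wide, key = "Year", value = "Value", -Code, -Country)

# Or name the range of year columns explicitly (backticks because the
# column names are not syntactic R names).
long_b <- gather(wide, key = "Year", value = "Value", `1950`:`1954`)

# Note: in current tidyr releases gather() is superseded by pivot_longer().
```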
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.

Another problem with the data is that the values will be read by R as character values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:

Or directly with data.table or dplyr:

Data:
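As a sketch of the pieces mentioned above, the sample data could be rebuilt and the comma-formatted values converted with `gsub` and `as.numeric` in base R. The name `wide` and the choice to clean the columns while still in wide form are assumptions made for illustration.

```r
# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,       # keep "1950" instead of "X1950"
  stringsAsFactors = FALSE
)

year_cols <- as.character(1950:1954)

# Remove the thousands separators, then convert each column to numeric.
wide[year_cols] <- lapply(
  wide[year_cols],
  function(x) as.numeric(gsub(",", "", x, fixed = TRUE))
)

str(wide)
```

The same `as.numeric(gsub(",", "", x))` call works on the value column of a table that has already been converted to long form.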
Using the reshape package:

reshape() takes a while to get used to, just as melt/cast do. Here is a solution with reshape(), assuming your data frame is called d:
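A minimal sketch of a base R `reshape()` call for this data. The data frame `d` is an assumed hand-built copy of the question's table, and the final sorting step is an addition to match the row order asked for in the question.

```r
# Assumed reconstruction of the question's wide table (values are strings).
d <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

d_long <- reshape(
  d,
  direction = "long",
  varying   = list(names(d)[3:7]),  # the year columns to stack
  v.names   = "Value",              # name of the stacked value column
  idvar     = c("Code", "Country"), # columns identifying each wide row
  timevar   = "Year",               # name of the new year column
  times     = 1950:1954             # values to put into the Year column
)

# reshape() stacks year by year; sort by country, then year, and drop the
# generated row names.
d_long <- d_long[order(d_long$Code, d_long$Year), ]
rownames(d_long) <- NULL
print(d_long)
```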
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.

Note, however, that stack does not work with factors. It only works if is.vector is TRUE, and from the documentation for is.vector, we find that:

"is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise."

I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.

Here's the stack approach:

Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.

Note that, to handle the commas (and the X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse).

Returns:
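A minimal sketch of the `gather()` plus `parse_number()` approach described above. The data frame `wide` is an assumed hand-built copy of the question's table, and the packages are loaded individually here rather than through `library(tidyverse)`.

```r
library(tidyr)
library(dplyr)
library(readr)

# Assumed reconstruction of the question's wide table (values are strings).
wide <- data.frame(
  Code    = c("AFG", "ALB"),
  Country = c("Afghanistan", "Albania"),
  `1950`  = c("20,249", "8,097"),
  `1951`  = c("21,352", "8,986"),
  `1952`  = c("22,532", "10,058"),
  `1953`  = c("23,557", "11,123"),
  `1954`  = c("24,555", "12,246"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

long <- wide %>%
  # Gather every column except the two id columns.
  gather(Year, Value, -Code, -Country) %>%
  # parse_number() drops the grouping commas (and any leading "X" on the
  # year labels) and returns numeric values.
  mutate(
    Year  = parse_number(Year),
    Value = parse_number(Value)
  )

print(long)
```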