I have a dataframe in a wide format, with repeated measurements taken within different date ranges. In my example there are three different periods, all with their corresponding values. E.g. the first measurement (Value1) was measured in the period from DateRange1Start to DateRange1End:
ID DateRange1Start DateRange1End Value1 DateRange2Start DateRange2End Value2 DateRange3Start DateRange3End Value3
1 1/1/90 3/1/90 4.4 4/5/91 6/7/91 6.2 5/5/95 6/6/96 3.3
I'm looking to reshape the data to a long format such that the DateRangeXStart and DateRangeXEnd columns are grouped. Thus, what was 1 row in the original table becomes 3 rows in the new table:
ID DateRangeStart DateRangeEnd Value
1 1/1/90 3/1/90 4.4
1 4/5/91 6/7/91 6.2
1 5/5/95 6/6/96 3.3
I know there must be a way to do this with reshape2/melt/recast/tidyr, but I can't seem to figure out how to map the multiple sets of measure variables into single sets of value columns in this particular way.
Using recycling:
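A minimal sketch of the recycling idea, assuming the question's single row is stored in a data frame called d (a name chosen here for illustration). A logical index of length three is recycled across the nine measurement columns, so it picks every third column:

```r
# Assumed reconstruction of the question's one-row example; the name `d`
# is hypothetical.
d <- data.frame(
  ID = 1,
  DateRange1Start = "1/1/90", DateRange1End = "3/1/90", Value1 = 4.4,
  DateRange2Start = "4/5/91", DateRange2End = "6/7/91", Value2 = 6.2,
  DateRange3Start = "5/5/95", DateRange3End = "6/6/96", Value3 = 3.3,
  stringsAsFactors = FALSE
)

m <- d[, -1]  # measurement columns only (drop ID)

# c(TRUE, FALSE, FALSE) is recycled over the 9 columns of m, selecting
# columns 1, 4, 7; the other two patterns select 2, 5, 8 and 3, 6, 9.
long <- data.frame(
  ID             = d$ID,  # recycled to the length of the other columns
  DateRangeStart = unlist(m[, c(TRUE, FALSE, FALSE)], use.names = FALSE),
  DateRangeEnd   = unlist(m[, c(FALSE, TRUE, FALSE)], use.names = FALSE),
  Value          = unlist(m[, c(FALSE, FALSE, TRUE)], use.names = FALSE),
  stringsAsFactors = FALSE
)

print(long)
```

This relies on the columns always appearing in the same Start, End, Value order; if the layout differs, the logical patterns would need to change.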
You don't need anything fancy; base R functions will do.
Two additional options (with an example dataframe with more than one row to better show how the code works):
1) with base R:
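A sketch of what the base R option could look like with reshape(). The data frame df below is an assumption: its first row is taken from the question and its second row is made up purely for illustration.

```r
# Hypothetical example data: row 1 is from the question, row 2 is invented.
df <- data.frame(
  ID = c(1, 2),
  DateRange1Start = c("1/1/90", "2/1/90"),
  DateRange1End   = c("3/1/90", "3/2/90"),
  Value1          = c(4.4, 5.1),
  DateRange2Start = c("4/5/91", "5/5/91"),
  DateRange2End   = c("6/7/91", "7/7/91"),
  Value2          = c(6.2, 7.0),
  DateRange3Start = c("5/5/95", "6/5/95"),
  DateRange3End   = c("6/6/96", "7/6/96"),
  Value3          = c(3.3, 2.9),
  stringsAsFactors = FALSE
)

# Each list element names one set of wide columns that collapses into a
# single long column; v.names supplies the names of those long columns.
long <- reshape(
  df,
  direction = "long",
  idvar     = "ID",
  varying   = list(
    grep("Start$", names(df), value = TRUE),
    grep("End$",   names(df), value = TRUE),
    grep("^Value", names(df), value = TRUE)
  ),
  v.names   = c("DateRangeStart", "DateRangeEnd", "Value")
)

# Sort by ID and period, then drop the automatically added `time` column.
long <- long[order(long$ID, long$time),
             c("ID", "DateRangeStart", "DateRangeEnd", "Value")]
rownames(long) <- NULL

print(long)
```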
which gives:
2) with the tidyverse:
3) with the sjmisc-package:
If you also want a group/time column, you can adapt the approaches above to:
1) with base R:
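One way the base R variant might keep the period as its own column is through the timevar argument of reshape(). This is a sketch under assumptions: the data frame df is hypothetical (row 1 from the question, row 2 invented), and the column name group is chosen for illustration.

```r
# Hypothetical example data: row 1 is from the question, row 2 is invented.
df <- data.frame(
  ID = c(1, 2),
  DateRange1Start = c("1/1/90", "2/1/90"),
  DateRange1End   = c("3/1/90", "3/2/90"),
  Value1          = c(4.4, 5.1),
  DateRange2Start = c("4/5/91", "5/5/91"),
  DateRange2End   = c("6/7/91", "7/7/91"),
  Value2          = c(6.2, 7.0),
  DateRange3Start = c("5/5/95", "6/5/95"),
  DateRange3End   = c("6/6/96", "7/6/96"),
  Value3          = c(3.3, 2.9),
  stringsAsFactors = FALSE
)

long <- reshape(
  df,
  direction = "long",
  idvar     = "ID",
  timevar   = "group",  # name of the column that records the period
  times     = 1:3,      # one value per period
  varying   = list(
    grep("Start$", names(df), value = TRUE),
    grep("End$",   names(df), value = TRUE),
    grep("^Value", names(df), value = TRUE)
  ),
  v.names   = c("DateRangeStart", "DateRangeEnd", "Value")
)

# Fix the column order explicitly and sort by ID, then period.
long <- long[order(long$ID, long$group),
             c("ID", "group", "DateRangeStart", "DateRangeEnd", "Value")]
rownames(long) <- NULL

print(long)
```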
which gives:
2) with the tidyverse:
3) with the sjmisc-package:
Used data:
data.table's melt function can melt into multiple columns. Using that, we can simply do:
Alternatively, you can also reference the three sets of measure columns by the column position:
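Both forms can be sketched as follows, assuming the data.table package is installed and the question's row is held in a data.table named dt (a hypothetical name). patterns() selects each set of measure columns by a regular expression on the column names, while a list of integer vectors selects them by position:

```r
library(data.table)

# Assumed reconstruction of the question's one-row example.
dt <- data.table(
  ID = 1,
  DateRange1Start = "1/1/90", DateRange1End = "3/1/90", Value1 = 4.4,
  DateRange2Start = "4/5/91", DateRange2End = "6/7/91", Value2 = 6.2,
  DateRange3Start = "5/5/95", DateRange3End = "6/6/96", Value3 = 3.3
)

value_names <- c("DateRangeStart", "DateRangeEnd", "Value")

# Select the three sets of measure columns by name pattern.
by_pattern <- melt(
  dt,
  id.vars      = "ID",
  measure.vars = patterns("Start$", "End$", "^Value"),
  value.name   = value_names
)

# Select the same three sets by column position.
by_position <- melt(
  dt,
  id.vars      = "ID",
  measure.vars = list(c(2L, 5L, 8L), c(3L, 6L, 9L), c(4L, 7L, 10L)),
  value.name   = value_names
)

print(by_pattern)
```

In both results the extra variable column records which of the three periods each row came from.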
Here is an approach to the problem using tidyr. This is an interesting use case for its function extract_numeric(), which I used to pull out the group from the column names.
(Added the v.names per Josh's suggestion.)