R: How can I make a new Variable with numbers of o

Posted 2019-01-28 07:34

Question:

I'm new to R and I have to deal with a large data set. I googled a lot, but I just can't find a way to do what I need (although it sounds like an easy thing to do).

What I want to do is reshape my data into wide form. To do it the way I want, I need a new variable that numbers the rows in date order within each level of a factor (restarting at one for each new level).

Now, this is a small example of what I have:

ID<-c("A","A","A","B","B","C","D","D","D","D")

Date<-c("01-01-2014", "05-01-2014", "06-01-2014",
        "01-01-2014", "12-01-2014", "25-01-2014", 
        "06-01-2014", "12-01-2014", "25-01-2014", 
        "26-01-2014")

Value<-c(2.5, 3.4, 2.5, 305.66, 300.00, 55.01,
        205.32, 99.99, 210.25, 105.125)

mydata<-data.frame(ID, Date, Value)
mydata

   ID       Date   Value
1   A 01-01-2014   2.500
2   A 05-01-2014   3.400
3   A 06-01-2014   2.500
4   B 01-01-2014 305.660
5   B 12-01-2014 300.000
6   C 25-01-2014  55.010
7   D 06-01-2014 205.320
8   D 12-01-2014  99.990
9   D 25-01-2014 210.250
10  D 26-01-2014 105.125

(The data set is sorted first by the ID factor, then by date within each ID.)

And this is what I need: a new variable called "Order".

   ID       Date   Value Order
1   A 01-01-2014   2.500     1
2   A 05-01-2014   3.400     2
3   A 06-01-2014   2.500     3
4   B 01-01-2014 305.660     1
5   B 12-01-2014 300.000     2
6   C 25-01-2014  55.010     1
7   D 06-01-2014 205.320     1
8   D 12-01-2014  99.990     2
9   D 25-01-2014 210.250     3
10  D 26-01-2014 105.125     4

The end goal is to reshape data based on the variable "Order" like this:

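# reshape() is a base R (stats) function, so no extra package needs to be loaded.
# mydata2 is mydata with the "Order" column added.
goal<-reshape(mydata2, 
              idvar="ID",
              timevar="Order",
              direction="wide")
goal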

   ID     Date.1  Value.1     Date.2  Value.2     Date.3  Value.3     Date.4  Value.4
1  A  01-01-2014    2.50  05-01-2014    3.40  06-01-2014    2.50        <NA>      NA
4  B  01-01-2014  305.66  12-01-2014  300.00        <NA>       NA       <NA>      NA
6  C  25-01-2014   55.01        <NA>      NA        <NA>       NA       <NA>      NA
7  D  06-01-2014  205.32  12-01-2014   99.99  25-01-2014   210.25   26-01-2014 105.125

Or is there another way to reshape data like this without the "Order" Variable?

Answer 1:

This is precisely what the getanID function in my "splitstackshape" package is for:

> library(splitstackshape)
> getanID(mydata, "ID")
    ID       Date   Value .id
 1:  A 01-01-2014   2.500   1
 2:  A 05-01-2014   3.400   2
 3:  A 06-01-2014   2.500   3
 4:  B 01-01-2014 305.660   1
 5:  B 12-01-2014 300.000   2
 6:  C 25-01-2014  55.010   1
 7:  D 06-01-2014 205.320   1
 8:  D 12-01-2014  99.990   2
 9:  D 25-01-2014 210.250   3
10:  D 26-01-2014 105.125   4
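
For comparison, the same per-ID counter can be sketched in base R without any package. This is only an illustration, not part of "splitstackshape": it assumes the rows should be numbered in true date order within each ID, and it names the new column "Order" as in the question.

```r
# Rebuild the example data from the question.
mydata <- data.frame(
  ID    = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "D"),
  Date  = c("01-01-2014", "05-01-2014", "06-01-2014",
            "01-01-2014", "12-01-2014", "25-01-2014",
            "06-01-2014", "12-01-2014", "25-01-2014",
            "26-01-2014"),
  Value = c(2.5, 3.4, 2.5, 305.66, 300.00, 55.01,
            205.32, 99.99, 210.25, 105.125),
  stringsAsFactors = FALSE
)

# Sort by ID, then by the parsed date (sorting the raw "dd-mm-yyyy" strings
# would not give calendar order in general).
mydata <- mydata[order(mydata$ID, as.Date(mydata$Date, format = "%d-%m-%Y")), ]

# ave() applies seq_along() within each ID group, so the counter
# restarts at 1 for every new ID.
mydata$Order <- ave(seq_along(mydata$ID), mydata$ID, FUN = seq_along)

# Base R reshape() can then spread Date and Value across the Order values.
goal <- reshape(mydata, idvar = "ID", timevar = "Order", direction = "wide")
```

With the "Order" column in place, this reshape() call has the same form as the one in the question.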

Alternatively, you can explore the development version of "data.table" which reimplements dcast in a very flexible way that will allow you to do this transformation without needing to generate a "time" variable.
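
A minimal sketch of that data.table route, assuming an installed data.table version in which dcast accepts several value.var columns and the rowid() helper is available (both have been part of released versions for some time). The object names DT and wide are made up for this illustration.

```r
library(data.table)

# Rebuild the example data from the question as a data.table.
DT <- data.table(
  ID    = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "D"),
  Date  = c("01-01-2014", "05-01-2014", "06-01-2014",
            "01-01-2014", "12-01-2014", "25-01-2014",
            "06-01-2014", "12-01-2014", "25-01-2014",
            "26-01-2014"),
  Value = c(2.5, 3.4, 2.5, 305.66, 300.00, 55.01,
            205.32, 99.99, 210.25, 105.125)
)

# rowid(ID) numbers the rows within each ID on the fly, so no separate
# "Order" column has to be created before casting to wide form.
# The rows are assumed to be already sorted by date within each ID.
wide <- dcast(DT, ID ~ rowid(ID), value.var = c("Date", "Value"))
```

The result has one row per ID, with one Date column and one Value column for each position within the ID; missing positions are filled with NA.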



Tags: r reshape