Reshape Long to Wide Data in R [duplicate]

I am trying to reshape some user data in R. I have a data.frame of sessions; each session has a userID and a date. I would like to use userID as the key, but only for users that have an observation with userType "New Visitor", so that there is a single row for each new visitor. Each subsequent session ID, together with its date, should then become a separate variable. For instance, if a user has 3 session IDs in total, there would be 6 variables in addition to userID.

For instance, if this is the data frame for a user:

    date <- c('2015-01-01','2015-01-02','2015-01-02','2015-01-10')
    userID <- c('100105276','100105276','100105276','100105276')
    sessionID <- c('1452632119','1452634303','1452637067','1453600979')
    userType <- c('New Visitor','Returning Visitor','Returning Visitor','Returning Visitor')
    df <- data.frame(date, userID, sessionID, userType, stringsAsFactors = FALSE)

Instead, I would like to return this:

    userID      sessionID1  date1       sessionID2  date2       sessionID3  date3
    100105276   1452632119  2015-01-01  1452634303  2015-01-02  1452637067  2015-01-02

If a userID has no subsequent sessionIDs, the missing variables should be filled with NA. I've read up on using tidyr or reshape2 to do this, but I haven't been able to get them to do exactly what I am looking for.

Tags: r reshape2 tidyr

1 Answer

Anthone

#2 · 2020-05-02 07:51

Given your data is ordered by userID and sessionID, and each row is a unique session, you could do:

    library(data.table)

    # Convert the data to a data.table
    df <- data.table(df)
    df[, id := sequence(.N), by = c("userID")] # session sequence number per user

    # Spread columns
    reshape(df, timevar = "id", idvar = "userID", direction = "wide")
    #     userID     date.1 sessionID.1  userType.1     date.2 sessionID.2        userType.2     date.3 sessionID.3        userType.3
    #1 100105276 2015-01-01  1452632119 New Visitor 2015-01-02  1452634303 Returning Visitor 2015-01-02  1452637067 Returning Visitor

In this output userType is also included as a variable, but you can always drop them afterwards.
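If you would rather not carry userType into the wide table at all, one possible alternative is data.table's `dcast()` with several `value.var` columns. The following is only a sketch under a few assumptions: the second user (`200000001`) and its session are hypothetical rows added just to show the NA fill, the `new_users` helper name is mine, and sessions are assumed to sort correctly by date and then sessionID.

```r
library(data.table)

# Sample rows from the question, plus one hypothetical single-session user
# (userID "200000001") added only to show how missing sessions become NA.
dt <- data.table(
  date      = c("2015-01-01", "2015-01-02", "2015-01-02", "2015-01-10", "2015-01-05"),
  userID    = c(rep("100105276", 4), "200000001"),
  sessionID = c("1452632119", "1452634303", "1452637067", "1453600979", "1452900000"),
  userType  = c("New Visitor", rep("Returning Visitor", 3), "New Visitor")
)

# Keep only users that have at least one "New Visitor" row
new_users <- dt[userType == "New Visitor", unique(userID)]
dt <- dt[userID %in% new_users]

# Order sessions within each user, then number them 1, 2, 3, ...
setorder(dt, userID, date, sessionID)
dt[, id := seq_len(.N), by = userID]

# Cast to wide using only sessionID and date, so userType never appears
wide <- dcast(dt, userID ~ id, value.var = c("sessionID", "date"))
print(wide)
```

The resulting columns are named `sessionID_1`, `sessionID_2`, ..., `date_1`, `date_2`, ..., and users with fewer sessions get NA in the unused columns.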
