This is what I have:
id<-c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2)
measure<-c("speed","weight","time","speed","weight","time","speed","weight","time",
"speed","weight","time","speed","weight","time","speed","weight","time")
value<-c(1.23,10.3,33,1.44,10.4,31,1.21,10.1,33,4.25,12.5,38,1.74,10.8,31,3.21,10.3,33)
testdf<-data.frame(id,measure,value)
This is what I want:
id<-c(1,1,1,2,2,2)
speed<-c(1.23,1.44,1.21,4.25,1.74,3.21)
weight<-c(10.3,10.4,10.1,12.5,10.8,10.3)
time<-c(33,31,33,37,31,33)
res<-data.frame(id,speed,weight,time)
The issue lies in that my variables speed weight and time are repeated. I can get it done with a for loop with if statements but its a major headache and not very efficient. This is my first post on stackoverflow ... long time user first time question ... thanks yall!
Using rowid
from data.table (very similar to @Kelli-Jean's answer):
library(reshape2)
testdf$r <- data.table::rowid(testdf$measure);
dcast(testdf, id + r ~ measure)
id r speed time weight
1 1 1 1.23 33 10.3
2 1 2 1.44 31 10.4
3 1 3 1.21 33 10.1
4 2 4 4.25 38 12.5
5 2 5 1.74 31 10.8
6 2 6 3.21 33 10.3
Or in one line dcast(testdf, id + data.table::rowid(measure) ~ measure)
.
Or without data.table, add like testdf$r <- ave(testdf$id, testdf$meas, FUN = seq_along)
.
Or if you're up for learning the data.table package:
library(data.table)
setDT(testdf)
testdf[, r := rowid(measure)]
dcast(testdf, id + r ~ measure)
If you want to go the tidyverse route:
library(tidyr)
library(dplyr)
testdf %>%
# add unique id for rows to be able to use spread
group_by(measure) %>% mutate(unique_id = row_number()) %>%
spread(measure, value) %>% select(-unique_id )
The R Cookbook is a great resource for these types of questions: http://www.cookbook-r.com/Manipulating_data/Converting_data_between_wide_and_long_format/
Here is my solution
library(plyr)
a=daply(testdf, .(id, measure), function(x) x$value)
listdf=apply(a, c(3), function(x) rbind(data.frame(x,id=row.names(x))))
df <- ldply(listdf, data.frame)
df$.id=NULL
df <- df[order(df$id),]
df
speed time weight id
1 1.23 33 10.3 1
3 1.44 31 10.4 1
5 1.21 33 10.1 1
2 4.25 38 12.5 2
4 1.74 31 10.8 2
6 3.21 33 10.3 2
Install reshape2 to help reformat data first
Then create another identifier to help organize data by three sequential rows at a time as you have in your desired dataset.
A<-c("A","A","A")
B<-c("B","B","B")
C<-c("C","C","C")
D<-c("D","D","D")
E<-c("E","E","E")
F<-c("F","F","F")
A <- as.data.frame(A)
colnames(A) <- "id2"
B <- as.data.frame(B)
colnames(B) <- "id2"
C <- as.data.frame(C)
colnames(C) <- "id2"
D <- as.data.frame(D)
colnames(D) <- "id2"
E <- as.data.frame(E)
colnames(E) <- "id2"
F <- as.data.frame(F)
colnames(F) <- "id2"
Bind separate datasets together row wise
x<-rbind(A,B,C,D,E,F)
Bind this new identifier to testdf column wise
testdf <- cbind(testdf, x)
Shape data from long to wide format
x2<-dcast(testdf, id + id2 ~ measure, value.var="value")
This is the resulting dataset:
id id2 speed time weight
1 1 A 1.23 33 10.3
2 1 B 1.44 31 10.4
3 1 C 1.21 33 10.1
4 2 D 4.25 38 12.5
5 2 E 1.74 31 10.8
6 2 F 3.21 33 10.3
You can remove the id2 variable if it is necessary using
testdf$id2 <- NULL
UPDATE: Original answer as posted by @Frank:
> library(reshape2)
> testdf$r <- data.table::rowid(testdf$measure);
> dcast(testdf, id + r ~ measure)
id speed weight time
1 1 1.23 10.3 33
2 1 1.44 10.4 31
3 1 1.21 10.1 33
4 2 4.25 12.5 37
5 2 1.74 10.8 31
6 2 3.21 10.3 33