Converting columns into rows in r [duplicate]

2019-02-28 01:46发布

I have the below data formed using code

test <- data.frame(dis = c(10,20,30,40),dur=c(30,40,60,90),method=c("car","car","Bicycle","Bicycle"),to_lon=c(-1.980,-1.5678,-1.324,-1.456),to_lat=c(55.3009,55.3416,55.1123,55.2234),from_lon=c(-1.4565,-1.3424,-1.4566,-1.1111),from_lat=c(76.8888,65.8999,76.9088,25.3344))

 dis dur  method  to_lon  to_lat from_lon from_lat
1  10  30     car -1.9800 55.3009  -1.4565  76.8888
2  20  40     car -1.5678 55.3416  -1.3424  65.8999
3  30  60 Bicycle -1.3240 55.1123  -1.4566  76.9088
4  40  90 Bicycle -1.4560 55.2234  -1.1111  25.3344

I want to convert this data frame such that it has one row for to_lat and to_lon and in the next row it has from_lat and from_lon. The rest of the details do not need to change and can be replicated. The desired result should be as below

    dis dur method  longitude   latitude
from    10  30  car -1.98   55.3009
to  10  30  car -1.4565 76.8888
from    20  40  car -1.5678 55.3416
to  20  40  car -1.3424 65.8999
from    30  60  Bicycle -1.324  55.1123
to  30  60  Bicycle -1.4566 76.9088
from    40  90  Bicycle -1.456  55.2234
to  40  90  Bicycle -1.1111 25.3344

Any help will be much appreciated.

Thanks.

3条回答
相关推荐>>
2楼-- · 2019-02-28 02:02

We can use melt from data.table which can take multiple measure columns.

library(data.table)
dM <- melt(setDT(test), measure=patterns('lon', 'lat'), 
          value.name=c('longitude', 'latitude'))
#change the 'variable' column from numeric index to 'from/to'
dM[, variable:= c('from', 'to')[variable]]
#create a sequence column grouped by 'variable'
dM[,i1:= 1:.N ,variable]
#order based on the 'i1'
res <- dM[order(i1)][,i1:=NULL]
res
#    dis dur  method variable longitude latitude
#1:  10  30     car     from   -1.9800  55.3009
#2:  10  30     car       to   -1.4565  76.8888
#3:  20  40     car     from   -1.5678  55.3416
#4:  20  40     car       to   -1.3424  65.8999
#5:  30  60 Bicycle     from   -1.3240  55.1123
#6:  30  60 Bicycle       to   -1.4566  76.9088
#7:  40  90 Bicycle     from   -1.4560  55.2234
#8:  40  90 Bicycle       to   -1.1111  25.3344
查看更多
太酷不给撩
3楼-- · 2019-02-28 02:07

This may not be the most elegant solution, but it should work and is hopefully understandable:

We split the data into two dataframes: one with the 'from' longitude and latitude data (call it testF) and the other with the 'to' data (call it test). We then use rbind to insert the rows of 'testF' in the appropriate places in 'test'.

test <- data.frame(dis = c(10,20,30,40),dur=c(30,40,60,90),method=c("car","car","Bicycle","Bicycle"),to_lon=c(-1.980,-1.5678,-1.324,-1.456),to_lat=c(55.3009,55.3416,55.1123,55.2234),from_lon=c(-1.4565,-1.3424,-1.4566,-1.1111),from_lat=c(76.8888,65.8999,76.9088,25.3344))

testF <- test[,c(1:3,6,7)]
names(testF)[4:5] <- c("lonitude", "latitude")
test <- test[,1:5]
names(test)[4:5] <- c("lonitude", "latitude")

for(i in dim(test)[1]:1) {
  test <- rbind(test[1:i,], testF[i,], test[-(1:i),])
}
查看更多
时光不老,我们不散
4楼-- · 2019-02-28 02:26

Here is an alternative approach using package tidyr (a popular package for data munging), which avoids for loop.

library(tidyr)

test <- data.frame(dis = c(10,20,30,40),dur=c(30,40,60,90),method=c("car","car","Bicycle","Bicycle"),to_lon=c(-1.980,-1.5678,-1.324,-1.456),to_lat=c(55.3009,55.3416,55.1123,55.2234),from_lon=c(-1.4565,-1.3424,-1.4566,-1.1111),from_lat=c(76.8888,65.8999,76.9088,25.3344))
test$id <- 1:dim(test)[1]

# gather latitude columns
d1 <- gather(data = test, 
             key = direction, 
             value = latitude, 
             to_lat, from_lat)

# gather longitude columns
d2 <- gather(data = test, 
             key = direction, 
             value = longitude, 
             to_lon, from_lon)

d3 <- cbind(d1[,c("direction","dis","dur","method","latitude")],d2[,c("longitude","id"),drop=FALSE])

# Create names
dir <- unlist(strsplit(d3$direction,"_"))
dir <- dir[seq(from = 1, to = length(dir), by = 2)]

# Factor and sort
d3$direction <- factor(dir)
d3[order(d3$id),]
查看更多
登录 后发表回答