Finding Growth in Dataframe in R

Posted 2020-02-29 06:26

Suppose I have the following data frame:

Website <- rep(paste("Website",1:3),2)
Year <- c(rep(2013,3),rep(2014,3))
V1 <- c(10,20,50,20,30,70)
V2 <- c(5,15,30,15,30,45)
df <- data.frame(Website,Year,V1,V2)
df


    Website Year V1 V2
1 Website 1 2013 10  5
2 Website 2 2013 20 15
3 Website 3 2013 50 30
4 Website 1 2014 20 15
5 Website 2 2014 30 30
6 Website 3 2014 70 45

I want to find the growth of each website from 2013 to 2014, i.e. (x1 - x0)/x0, for both variables. For example, V1 for Website 1 goes from 10 to 20, so its growth is (20 - 10)/10 = 1.0. The result should be a data frame like this:

    Website  V1  V2
1 Website 1 1.0 2.0
2 Website 2 0.5 1.0
3 Website 3 0.4 0.5

These are just the growth rates for each website for both variables, V1 and V2.

Tags: r
2 Answers
聊天终结者
#2 · 2020-02-29 06:51

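If you have more than two years of data, dplyr handles this cleanly: within each website, divide each value by the previous row's value and subtract 1. The first year of each website has no previous value, so its growth is NA.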

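library(dplyr)
# Growth relative to the previous row; dplyr::lag() avoids a clash with stats::lag()
growth <- function(x) x / dplyr::lag(x) - 1
df %>%
  group_by(Website) %>%
  # across() replaces the deprecated mutate_each(funs(...))
  mutate(across(c(V1, V2), growth))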
#    Website Year  V1  V2
#1 Website 1 2013  NA  NA
#2 Website 2 2013  NA  NA
#3 Website 3 2013  NA  NA
#4 Website 1 2014 1.0 2.0
#5 Website 2 2014 0.5 1.0
#6 Website 3 2014 0.4 0.5
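
To get from the output above to the three-row table asked for in the question, one option is to sort by year before lagging and then drop each website's first-year rows. This is only a sketch, assuming dplyr 1.0.0 or later (for across()); the name result is illustrative, and the NA filter assumes the input itself contains no missing values.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  Website = rep(paste("Website", 1:3), 2),
  Year    = c(rep(2013, 3), rep(2014, 3)),
  V1      = c(10, 20, 50, 20, 30, 70),
  V2      = c(5, 15, 30, 15, 30, 45)
)

# Growth relative to the previous row within a group
growth <- function(x) x / dplyr::lag(x) - 1

result <- df %>%
  arrange(Website, Year) %>%          # make sure lag() sees the previous year
  group_by(Website) %>%
  mutate(across(c(V1, V2), growth)) %>%
  ungroup() %>%
  filter(!is.na(V1)) %>%              # drop each website's first year (no previous value)
  select(Website, V1, V2)

result
```

With more than two years you would probably want to keep the Year column in the final select() so that each growth rate stays attached to its year.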
家丑人穷心不美
#3 · 2020-02-29 06:55

A data.table option (I am using data.table_1.9.5, the devel version that introduced the function shift). Assuming that the Year column is ordered within each website, convert the data.frame to a data.table with setDT, loop over the columns V1 and V2 with lapply (specified in .SDcols), and compute x/shift(x) - 1 for each column, grouped by Website. The defaults for shift are type='lag' and n=1L, so shift(x) returns the previous row's value. To remove the NA rows, wrap the result in na.omit, which is also fast in the devel version.

library(data.table)
# setDT() converts df by reference; shift(x) lags x by one row within each Website
na.omit(setDT(df)[, lapply(.SD, function(x) x / shift(x) - 1),
                  by = Website, .SDcols = c("V1", "V2")])
#     Website  V1  V2
#1: Website 1 1.0 2.0
#2: Website 2 0.5 1.0
#3: Website 3 0.4 0.5
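
The answer above relies on the rows already being in year order. If that is not guaranteed, a minimal sketch is to sort first with setorder. Here the input rows are deliberately shuffled so that 2014 comes before 2013, and the names dt and res are illustrative. as.data.table is used instead of setDT so that the original df is left unchanged.

```r
library(data.table)

# Same values as in the question, but with the 2014 rows listed first
df <- data.frame(
  Website = rep(paste("Website", 1:3), 2),
  Year    = c(rep(2014, 3), rep(2013, 3)),
  V1      = c(20, 30, 70, 10, 20, 50),
  V2      = c(15, 30, 45, 5, 15, 30)
)

dt <- as.data.table(df)       # a copy; df itself is not modified
setorder(dt, Website, Year)   # sort by reference so shift() sees the previous year

# Growth per website; the first year of each website yields NA and is dropped
res <- na.omit(dt[, lapply(.SD, function(x) x / shift(x) - 1),
                  by = Website, .SDcols = c("V1", "V2")])
res
```

Without the setorder call, shift would divide the 2013 values by the 2014 values here and give the wrong growth rates.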