Two Time series data with different time stamps an

2019-08-26 22:01发布

I have 2-time series data with different time stamps and a different number of data points.

First data frame is:

      Time         Power_kW
10/9/2017 1:14:12   0.185
10/9/2017 1:14:53   0.182
10/9/2017 1:15:13   0.184
10/9/2017 1:15:53   0.175
. . 
. . 
10/9/2017 1:44:37   0.175
. . 
. . 
10/9/2017 2:13:38   0.181
. . 
. . 
10/9/2017 2:24:40   0.179

Second dataframe is:

    Local Time    Value
10/9/2017 1:13:01   0
10/9/2017 1:42:10   1
10/9/2017 2:11:58   0
10/9/2017 2:23:30   1

The second data frame values indicate the value in between 2 dates mentioned. It means the value of '0' should be propagated from 10/9/2017 1:13:01 0 to 10/9/2017 1:42:10. and the value 1 from the date of 10/9/2017 1:42:10 up to 10/9/2017 2:23:30 and so on.

I wish to have the same number of data points as the first data frame by merging these two data frames columns values. I have many similar data frames as the Second dataframe which all will help me making correlations with the respect to the time.

I am trying to get the following result after merging:

   Time            Power_kW Value
10/9/2017 1:14:12   0.185   0
10/9/2017 1:14:53   0.182   0
10/9/2017 1:15:13   0.184   0
10/9/2017 1:15:53   0.175   0
. .     
. .     
10/9/2017 1:44:37   0.175   1
10/9/2017 1:45:47   0.176   1
. .     
10/9/2017 2:13:38   0.181   0
. .     
. .     
10/9/2017 2:24:40   0.179   1

(I have added some points in the result to represent how the whole data should look like.)

I did refer to these threads and their purpose are not propagating through.

R: merge two irregular time series

merge two time series with different time granularities

Can anyone please throw me a light?

2条回答
萌系小妹纸
2楼-- · 2019-08-26 22:15

Try a rolling join using R's data.table package, which is explained nicely in this blog:

library(data.table)
df1 <- fread("Time,Power_kW
10/9/2017 1:14:12,0.185
10/9/2017 1:14:53,0.182
10/9/2017 1:15:13,0.184
10/9/2017 1:15:53,0.175
10/9/2017 1:44:37,0.175
10/9/2017 1:45:47,0.176 
10/9/2017 2:13:38,0.181
10/9/2017 2:24:40,0.179")
df2 <- fread("LocalTime,Value
10/9/2017 1:13:01,0
10/9/2017 1:42:10,1
10/9/2017 2:11:58,0
10/9/2017 2:23:30,1")
df1$Time <- as.POSIXct(df1$Time, format="%m/%d/%Y %T")
df2$LocalTime <- as.POSIXct(df2$LocalTime, format="%m/%d/%Y %T")
setkey(df1, Time)
setkey(df2, LocalTime)
df2[df1, roll=Inf]
# LocalTime Value Power_kW
# 1: 2017-10-09 01:14:12     0    0.185
# 2: 2017-10-09 01:14:53     0    0.182
# 3: 2017-10-09 01:15:13     0    0.184
# 4: 2017-10-09 01:15:53     0    0.175
# 5: 2017-10-09 01:44:37     1    0.175
# 6: 2017-10-09 01:45:47     1    0.176
# 7: 2017-10-09 02:13:38     0    0.181
# 8: 2017-10-09 02:24:40     1    0.179
查看更多
看我几分像从前
3楼-- · 2019-08-26 22:26

Assume the inputs given in the Note at the end. Convert the data frames to zoo and run na.locf to fill in the NA values. Finally select the elements corresponding to the first table and optionally convert back to data frame:

library(zoo)

# convert to zoo
Power_kW <- read.zoo(DF1, tz = "")
Value <- read.zoo(DF2, tz = "")

z <- na.locf(merge(Power_kW, Value))[time(Power_kW)]
fortify.zoo(z) # optional

giving:

                Index Power_kW Value
1 2017-10-09 01:14:12    0.185     0
2 2017-10-09 01:14:53    0.182     0
3 2017-10-09 01:15:13    0.184     0
4 2017-10-09 01:15:53    0.175     0
5 2017-10-09 01:44:37    0.175     1
6 2017-10-09 01:45:47    0.176     1
7 2017-10-09 02:13:38    0.181     0
8 2017-10-09 02:24:40    0.179     1

Note: We assume the following inputs. Note that we assume POSIXct date times.

Lines1 <- "
Time,Power_kW
10/9/2017 1:14:12,0.185
10/9/2017 1:14:53,0.182
10/9/2017 1:15:13,0.184
10/9/2017 1:15:53,0.175
10/9/2017 1:44:37,0.175
10/9/2017 1:45:47,0.176 
10/9/2017 2:13:38,0.181
10/9/2017 2:24:40,0.179"

Lines2 <- "
LocalTime,Value
10/9/2017 1:13:01,0
10/9/2017 1:42:10,1
10/9/2017 2:11:58,0
10/9/2017 2:23:30,1"

fmt <- "%m/%d/%Y %H:%M:%S"
DF1 <- read.csv(text = Lines1)   
DF2 <- read.csv(text = Lines2)

# convert date/time to POSIXct
fmt <- "%m/%d/%Y %H:%M:%S"
DF1$Time <- as.POSIXct(DF1$Time, format = fmt)
DF2$LocalTime <- as.POSIXct(DF2$LocalTime, format = fmt)
查看更多
登录 后发表回答