I have multiple somewhat irregular time series (each in a CSV file) like so:
X.csv
date,time,value
01/01/04,00:15:00,4.98
01/01/04,00:25:00,4.981
01/01/04,00:35:00,4.983
01/01/04,00:55:00,4.986
and so:
Y.csv
date,time,value
01/01/04,00:05:00,9.023
01/01/04,00:15:00,9.022
01/01/04,00:35:00,9.02
01/01/04,00:45:00,9.02
01/01/04,00:55:00,9.019
Notice how there's basically a granularity of 10 mins in both files, but each has some missing entries.
I would now like to merge these two time series to achieve the following:
date,time,X,Y
01/01/04,00:05:00,NA,9.023
01/01/04,00:15:00,4.98,9.022
01/01/04,00:25:00,4.981,NA
01/01/04,00:35:00,4.983,9.02
01/01/04,00:45:00,NA,9.02
01/01/04,00:55:00,4.986,9.019
Is there an easy way of achieving this? Since I have multiple files (not just two), is there a way of doing this for a batch of files?
You can use dplyr to do this. First read in all the files from group X and group Y in a loop, so that you end up with a single data frame for each group. Then full_join the results on the date and time columns.
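A minimal sketch of the full_join approach, under a few assumptions: the two CSVs are inlined as text so the example runs on its own, `read_series` is a hypothetical helper (not part of dplyr), and the dates are taken to be day/month/year, which the sample data does not confirm.

```r
library(dplyr)

# Inline stand-ins for X.csv and Y.csv so the sketch is self-contained.
x_txt <- "date,time,value
01/01/04,00:15:00,4.98
01/01/04,00:25:00,4.981
01/01/04,00:35:00,4.983
01/01/04,00:55:00,4.986"

y_txt <- "date,time,value
01/01/04,00:05:00,9.023
01/01/04,00:15:00,9.022
01/01/04,00:35:00,9.02
01/01/04,00:45:00,9.02
01/01/04,00:55:00,9.019"

# Hypothetical helper: read one series and rename its value column
# to the series name, so columns do not clash after joining.
read_series <- function(txt, name) {
  df <- read.csv(text = txt, stringsAsFactors = FALSE)
  names(df)[names(df) == "value"] <- name
  df
}

series <- list(read_series(x_txt, "X"), read_series(y_txt, "Y"))

# Reduce() applies full_join pairwise, so the same line handles
# two series or twenty.
merged <- Reduce(function(a, b) full_join(a, b, by = c("date", "time")), series)

# Sort chronologically via a parsed timestamp.
# Assumption: dates are day/month/year.
merged <- merged %>%
  mutate(ts = as.POSIXct(paste(date, time),
                         format = "%d/%m/%y %H:%M:%S", tz = "UTC")) %>%
  arrange(ts) %>%
  select(-ts)

print(merged)
```

For a batch of real files, the list could be built from `list.files(pattern = "\\.csv$")`, reading each file with `read.csv()` and naming the value column after the file.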
Getting your data:
gets us
Same with Y:
Now convert X and Y to xts objects and merge the two objects with an outer join to get all the data points. The last step is to sum the values by rows:
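A minimal sketch of these steps, with several assumptions: the data is inlined as text rather than read from X.csv and Y.csv, `to_xts` and `total` are hypothetical names, the dates are taken to be day/month/year, and missing values are treated as zero in the row sum.

```r
library(xts)  # also attaches zoo, which provides coredata() and index()

X <- read.csv(text = "date,time,value
01/01/04,00:15:00,4.98
01/01/04,00:25:00,4.981
01/01/04,00:35:00,4.983
01/01/04,00:55:00,4.986", stringsAsFactors = FALSE)

Y <- read.csv(text = "date,time,value
01/01/04,00:05:00,9.023
01/01/04,00:15:00,9.022
01/01/04,00:35:00,9.02
01/01/04,00:45:00,9.02
01/01/04,00:55:00,9.019", stringsAsFactors = FALSE)

# Hypothetical helper: build an xts object indexed by the combined
# date and time. Assumption: dates are day/month/year.
to_xts <- function(df) {
  ts <- as.POSIXct(paste(df$date, df$time),
                   format = "%d/%m/%y %H:%M:%S", tz = "UTC")
  xts(df$value, order.by = ts)
}

# Outer join keeps every timestamp from either series and
# fills the gaps with NA.
merged <- merge(to_xts(X), to_xts(Y), join = "outer")
colnames(merged) <- c("X", "Y")

# Row-wise sum; na.rm = TRUE treats a missing value as 0 (an assumption).
total <- xts(rowSums(coredata(merged), na.rm = TRUE),
             order.by = index(merged))

print(merged)
print(total)
```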
If you don’t need the X and Y columns anymore:
and you get: