My data frame contains date-time values in the format YYYY-MM-DD HH:MM:SS
across 125000+ rows, broken down by the minute (each row represents a single minute).
1 2018-01-01 00:04:00
2 2018-01-01 00:05:00
3 2018-01-01 00:06:00
4 2018-01-01 00:07:00
5 2018-01-01 00:08:00
6 2018-01-01 00:09:00
...
124998 2018-03-29 05:07:00
124999 2018-03-29 05:08:00
125000 2018-03-29 05:09:00
I want to subset the data by extracting all of the minute values within any given hour and saving the results into individual data frames.
I have used subset() combined with grepl() to no avail. I have also tried setting the start = and stop = parameters, but again to no avail.
What I want is, for every HH value, to extract all rows with that HH value and then create a new data frame for each respective HH value.
For example, I would like to have a data frame that corresponds to every minute's values (the full hour's worth of data), resulting in data frames such as:
2018-01-01 00:00:00 (contains data from 2018-01-01 00:00:00 to 2018-01-01 00:59:00 (inclusive))
2018-01-01 01:00:00 (contains data from 2018-01-01 01:00:00 to 2018-01-01 01:59:00 (inclusive))
and so on.
Is there a quick way to achieve this or is it a more laborious task?
Note: I am aware that my desired result will produce a lot of data frames, and that is fine for my particular project as I will only be working on a single one-hour block at any one time.
If you want to access each individual date value, lubridate has default functions for that. You can therefore get the same splits, though in a more cumbersome manner, by splitting on those components.
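A minimal sketch of such a lubridate-based split, assuming the column V1 is already POSIXct and using a small hypothetical sample in place of the real data. The accessors date() and hour() come from lubridate; everything else here is illustrative.

```r
library(lubridate)

# Hypothetical sample: 180 one-minute rows starting at the first
# timestamp shown in the question (UTC avoids daylight-saving jumps)
start <- as.POSIXct("2018-01-01 00:04:00", tz = "UTC")
data <- data.frame(V1 = seq(start, by = "1 min", length.out = 180))

# Extract the date and hour components, then split on the pair.
# drop = TRUE discards date/hour combinations that have no rows.
by_hour <- split(data, list(date(data$V1), hour(data$V1)), drop = TRUE)

# One list element per date and hour
names(by_hour)
```

Each element of by_hour is an ordinary data frame, so a single hour can be pulled out by name or position and worked on in isolation.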
The dummy data
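Dummy data of this shape could be generated along the following lines. This is only a sketch: the column name V1, the row count, and the UTC time zone are assumptions chosen for illustration, not details of the real data set.

```r
# One row per minute, starting at the first timestamp in the question.
# tz = "UTC" is an assumption that keeps the sequence free of DST jumps.
start <- as.POSIXct("2018-01-01 00:04:00", tz = "UTC")
data <- data.frame(V1 = seq(start, by = "1 min", length.out = 180))

head(data, 3)
str(data)
```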
This will produce a list of data frames grouped by each hour, assuming your data frame is called data and your first column is V1.
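One way to build such a list in base R is to derive an hour label from each timestamp and pass it to split(). The sketch below assumes V1 is POSIXct; if it is stored as text, it would first need converting, for example with as.POSIXct(data$V1, tz = "UTC"). The sample data and the names hour_key and by_hour are hypothetical.

```r
# Hypothetical sample: 180 one-minute rows in place of the real data
start <- as.POSIXct("2018-01-01 00:04:00", tz = "UTC")
data <- data.frame(V1 = seq(start, by = "1 min", length.out = 180))

# Label every row with the start of its hour, e.g. "2018-01-01 00:00:00"
hour_key <- format(data$V1, "%Y-%m-%d %H:00:00")

# split() returns a named list with one data frame per distinct label
by_hour <- split(data, hour_key)

names(by_hour)
one_hour <- by_hour[["2018-01-01 01:00:00"]]
nrow(one_hour)  # 60 rows: 01:00 through 01:59
```

Keeping the pieces in a list, rather than creating thousands of separate variables, means a single hour can still be retrieved by its label whenever it is needed.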
I have come up with a solution which extracts every minute (MM) value/row from the main data frame. To separate it for each hour, I will simply change the first 00 depending on which hour I want to focus on, and I can then perform a similar function to extract each individual date value.
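A pattern-based extraction of this kind might look like the sketch below. It is a guess at the general approach, not the exact code used: the sample data, the regular expression, and the names stamp and hour_00 are all assumptions.

```r
# Hypothetical sample: 180 one-minute rows in place of the real data
start <- as.POSIXct("2018-01-01 00:04:00", tz = "UTC")
data <- data.frame(V1 = seq(start, by = "1 min", length.out = 180))

# Match against the full text form so the time part is always present
stamp <- format(data$V1, "%Y-%m-%d %H:%M:%S")

# Hour 00, any minute, zero seconds. Change the first 00 to pick another hour.
hour_00 <- subset(data, grepl(" 00:[0-9]{2}:00$", stamp))
nrow(hour_00)  # 56 rows here, because the sample starts at 00:04
```

Note that this pattern matches hour 00 on every date in the data. To isolate a single one-hour block, the date would need to be part of the pattern as well, for example "^2018-01-01 00:".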