I have a CSV file with 2 columns:
col1: Timestamp data (yyyy-mm-dd hh:mm:ss.ms, 8 months of data)
col2: Heat data (continuous variable).
Since there are almost 50k records, I would like to partition col1 (the timestamp column) into months or weeks and then draw a box plot of the heat data for each period.
I tried this in R, but it takes a long time. I need help doing it in Python. I think I need to use seaborn.boxplot.
Please guide.
Group by Frequency then plot groups
First, read your CSV data into a pandas DataFrame.
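A minimal sketch of that step, assuming the columns are named `Timestamp` and `Heat` (hypothetical names, since the question does not give the real headers). An in-memory buffer stands in for the file so the snippet runs on its own; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Stand-in for the real file (assumed header names: Timestamp, Heat).
# With a real file, pass its path instead of this buffer.
csv_file = io.StringIO(
    "Timestamp,Heat\n"
    "2019-01-01 00:00:00.000,21.5\n"
    "2019-01-01 01:00:00.000,22.1\n"
    "2019-01-01 02:00:00.000,20.9\n"
)

# parse_dates converts the timestamp column from strings to datetime64.
df = pd.read_csv(csv_file, parse_dates=["Timestamp"])
print(df.dtypes)
```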
I will use some fake data, 30 days of hourly samples.
Set the timestamps as the DataFrame's index
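Something along these lines would cover the fake data and the index step. The column names `Timestamp` and `Heat`, the start date, and the normal distribution are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Fake data: 30 days of hourly samples (column names are assumptions).
rng = np.random.default_rng(0)
timestamps = pd.date_range("2019-01-01", periods=30 * 24, freq="60min")
df = pd.DataFrame({"Timestamp": timestamps,
                   "Heat": rng.normal(50, 10, len(timestamps))})

# Time-based grouping needs a DatetimeIndex, so move the timestamps to the index.
df = df.set_index("Timestamp")
print(df.head())
```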
Now group by the period you want, seven days for this example.
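A sketch of the grouping with `pandas.Grouper`, rebuilding the same kind of fake data so it runs on its own (the `Heat` column name is an assumption).

```python
import numpy as np
import pandas as pd

# Fake data: 30 days of hourly samples on a DatetimeIndex.
index = pd.date_range("2019-01-01", periods=30 * 24, freq="60min", name="Timestamp")
df = pd.DataFrame({"Heat": np.random.default_rng(0).normal(50, 10, len(index))},
                  index=index)

# pd.Grouper(freq=...) bins the DatetimeIndex; "7D" gives seven-day bins.
gb = df.groupby(pd.Grouper(freq="7D"))

# Iterating yields (bin start, DataFrame holding the rows in that bin).
for start, week in gb:
    print(start.date(), len(week))
```

With 30 days of data the fifth bin only covers the last two days, so it holds fewer rows than the others.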
Now you can plot each group separately
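One way to do that, as a sketch: one figure per group, using `DataFrame.boxplot`. The `Heat` column name and the `figures` dict are illustrative assumptions. The `Agg` backend line only keeps the snippet runnable without a display; remove it to get interactive windows.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for interactive windows
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fake data: 30 days of hourly samples on a DatetimeIndex.
index = pd.date_range("2019-01-01", periods=30 * 24, freq="60min", name="Timestamp")
df = pd.DataFrame({"Heat": np.random.default_rng(0).normal(50, 10, len(index))},
                  index=index)

figures = {}  # bin start -> Figure, kept so they can be saved or shown later
for start, week in df.groupby(pd.Grouper(freq="7D")):
    fig, ax = plt.subplots()
    week.boxplot(column="Heat", ax=ax)  # one box for this week's heat values
    ax.set_title(f"Week starting {start.date()}")
    figures[start] = fig
```

Each figure can then be written out with `fig.savefig(...)`, or displayed with `plt.show()` when an interactive backend is in use.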
And... I didn't realize you could do this but it is pretty cool
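The snippet this sentence referred to is not shown here. A plausible reading, and only a guess, is calling `boxplot` directly on the groupby object, which pandas supports. With `subplots=False` all groups are drawn side by side on one axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for interactive windows
import numpy as np
import pandas as pd

# Fake data: 30 days of hourly samples (the Heat column name is an assumption).
index = pd.date_range("2019-01-01", periods=30 * 24, freq="60min", name="Timestamp")
df = pd.DataFrame({"Heat": np.random.default_rng(0).normal(50, 10, len(index))},
                  index=index)

gb = df.groupby(pd.Grouper(freq="7D"))

# subplots=False puts every weekly box on one shared axes;
# the default (subplots=True) draws one subplot per group instead.
ax = gb.boxplot(subplots=False, rot=45)
ax.set_ylabel("Heat")
```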
To partition the data in five time periods then get weekly boxplots of each:
Determine the total timespan; divide by five; create a frequency alias; then groupby
Each group is a DataFrame so iterate over the groups; create weekly groups from each and boxplot them.
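A sketch of those two steps, using 20 weeks of fake hourly data so that each partition spans several weeks. The `Heat` column name and the round-up-plus-one-second trick are my assumptions. Without the rounding, the final sample can land in a sixth, nearly empty bin.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for interactive windows
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fake data: 20 weeks of hourly samples on a DatetimeIndex.
index = pd.date_range("2019-01-01", periods=20 * 7 * 24, freq="60min", name="Timestamp")
df = pd.DataFrame({"Heat": np.random.default_rng(0).normal(50, 10, len(index))},
                  index=index)

# Total timespan divided by five, expressed as a frequency alias in seconds.
# Rounding up and adding one second keeps the final sample inside the fifth bin.
span = df.index[-1] - df.index[0]
alias = f"{math.ceil(span.total_seconds() / 5) + 1}s"

figures = []
for start, part in df.groupby(pd.Grouper(freq=alias)):
    fig, ax = plt.subplots(figsize=(8, 4))
    # Each partition is itself a DataFrame, so it can be regrouped by week.
    part.groupby(pd.Grouper(freq="7D")).boxplot(subplots=False, rot=45, ax=ax)
    ax.set_title(f"Partition starting {start}")
    figures.append(fig)
```

The weekly bins inside each partition are anchored to that partition's first day, so the first and last week of a partition may be partial.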
There might be a better way to do this; if so, I'll post it, or maybe someone will feel free to edit this. It looks like this could leave the last group without a full set of data. ...
If you know that your data is periodic you can just use slices to split it up.
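For example, with strictly hourly samples and no gaps (an assumption that has to hold for this to be valid), a week is always 7 * 24 rows, so positional slices are enough:

```python
import numpy as np
import pandas as pd

# Fake data: 30 days of hourly samples (the Heat column name is an assumption).
index = pd.date_range("2019-01-01", periods=30 * 24, freq="60min", name="Timestamp")
df = pd.DataFrame({"Heat": np.random.default_rng(0).normal(50, 10, len(index))},
                  index=index)

# Regular sampling means a fixed number of rows per week,
# so the data can be split by position without any time-based grouping.
rows_per_week = 7 * 24
weeks = [df.iloc[i:i + rows_per_week] for i in range(0, len(df), rows_per_week)]
print([len(w) for w in weeks])
```

If the file has missing or duplicated timestamps, the slices drift away from the calendar weeks, and grouping by frequency is the safer option.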
Frequency aliases
pandas.read_csv()
pandas.Grouper()