Bokeh year-on-year line graph procedure

Posted 2019-04-13 16:54

Question:

What is the best way to build year-over-year line charts from daily data in Bokeh?

Currently I'm adding a dateline column (arbitrarily pinned to 2016) and a year column to the initial dataframe of daily values. I then pivot to wide format by year, fill in the NAs (missing data varies across years), and build the Bokeh graph line by line across the year columns.

Say I have a table with several years of data and two columns, Date and value:

import pandas as pd

df = df.set_index('Date')  # 'Date' must already be a datetime column

# Map every date onto the same reference year (2016, a leap year, so 29-Feb is valid)
df['dateline'] = df.index.strftime('%d-%b-2016')
df['year'] = df.index.strftime('%Y')

# Wide format: one row per calendar day, one column per year
pv = pd.pivot_table(df, index=df['dateline'], columns=df.index.year,
                    values='value', aggfunc='sum')

pv.index = pd.to_datetime(pv.index, format='%d-%b-%Y')
pv.sort_index(inplace=True)
# Forward-fill gaps of up to 4 days in each year column
pv = pv.ffill(limit=4)


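from bokeh.plotting import figure

p = figure(x_axis_type='datetime')  # assumed setup; the figure was not shown in the original post
# legend_label replaces the legend keyword, which is deprecated in current Bokeh releases
p.line(x=pv.index, y=pv[2017], line_width=1.5, line_color="red", legend_label='2017')
p.line(x=pv.index, y=pv[2016], line_width=1.5, line_color="blue", legend_label='2016')
p.line(x=pv.index, y=pv[2015], line_width=1.5, line_color="green", legend_label='2015')
p.line(x=pv.index, y=pv[2014], line_width=1.5, line_color="orange", legend_label='2014')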

My question is whether this can be optimized further. I would like to use hover tooltips in the future, so what would be the best setup? The next step would be to loop over the year columns, but do I need to go that route?

Coming from R, I would like to keep the data in long format and do something like:

p.line(df, x='dateline' , y = 'value' , color = 'year')

Thanks for the tips.

Answer 1:

One solution is to take your dates and create a year column and a day-of-year column using the .dt accessors.

Be sure that df['date'] is a datetime column.
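If the column was read in as strings, a conversion along these lines should work. This is only a sketch: the sample rows and the column names `date` and `value` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names 'date' and 'value' are assumptions.
df = pd.DataFrame({
    "date": ["2014-01-31", "2014-02-28", "2015-01-31"],
    "value": [1.0, 2.0, 3.0],
})

# Parse the strings into datetime64 values so the .dt accessor is available.
df["date"] = pd.to_datetime(df["date"])

# Check the dtype before relying on .dt.year / .dt.dayofyear.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])
```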

df['year'] = df['date'].dt.year
df['dayofyear'] = df['date'].dt.dayofyear

df.head()

            year     value  dayofyear
date                                 
2014-01-31  2014  1.964372         31
2014-02-28  2014  2.386228         59
2014-03-31  2014  2.695743         90
2014-04-30  2014  2.712133        120
2014-05-31  2014  2.033271        150


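# Note: bokeh.charts was later split out of Bokeh into the separate, unmaintained
# bkcharts package, so this import only works on old Bokeh releases.
from bokeh.charts import Line
from bokeh.io import show

p = Line(df, x='dayofyear', y='value', color='year')
show(p)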
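For a Bokeh version without bokeh.charts, the same long-format idea can be sketched with bokeh.plotting by drawing one line per year in a loop. This is an illustrative sketch, not the answerer's code: the synthetic data, the palette choice, and the tooltip layout are all assumptions. Giving each line its own ColumnDataSource keeps the original date available for hover tooltips.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.palettes import Category10
from bokeh.plotting import figure, show

# Hypothetical long-format daily data (one row per day); values are synthetic.
dates = pd.date_range("2014-01-01", "2016-12-31", freq="D")
df = pd.DataFrame({"date": dates, "value": range(len(dates))})
df["year"] = df["date"].dt.year
df["dayofyear"] = df["date"].dt.dayofyear

p = figure(x_axis_label="day of year", y_axis_label="value")

years = sorted(int(y) for y in df["year"].unique())
palette = Category10[10]

# One line per year, all sharing the day-of-year x axis.
for i, year in enumerate(years):
    source = ColumnDataSource(df[df["year"] == year])
    p.line(x="dayofyear", y="value", source=source, line_width=1.5,
           color=palette[i % len(palette)], legend_label=str(year))

# Hover shows the real calendar date and the value for the point under the cursor.
p.add_tools(HoverTool(
    tooltips=[("date", "@date{%F}"), ("value", "@value")],
    formatters={"@date": "datetime"},
))

# show(p)  # uncomment to render the plot in a browser
```

Because the data stays in long format, adding another year needs no code change; the loop picks it up from the `year` column.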