My DataFrame object looks like
            amount
date
2014-01-06       1
2014-01-07       1
2014-01-08       4
2014-01-09       1
2014-01-14       1
I would like a scatter plot with time along the x-axis and amount on the y-axis, with a line through the data to guide the viewer's eye. If I use the pandas plot df.plot(style="o"), it's not quite right, because the line is not there. I would like something like the examples here.
Note: this has a lot in common with Ian Thompson's answer, but the approach is different enough to warrant a separate answer. I use the DataFrame format provided in the question and avoid changing the index.
Seaborn and other libraries don't deal as well with datetime axes as you might like them to. Here's how I'd work around it:
Start by adding a column of date ordinals
Seaborn will deal better with these than with dates. This is a handy trick for doing all kinds of mathy things with dates in libraries that don't love dates.
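A minimal sketch of the ordinal column, assuming the question's data with date as a DatetimeIndex. The column name date_ordinal is only an illustrative choice:

```python
import pandas as pd

# Sample data copied from the question; "date" is assumed to be the index.
df = pd.DataFrame(
    {"amount": [1, 1, 4, 1, 1]},
    index=pd.DatetimeIndex(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"],
        name="date",
    ),
)

# toordinal() gives an integer day count, so consecutive days differ by 1.
# "date_ordinal" is a hypothetical column name.
df["date_ordinal"] = [d.toordinal() for d in df.index]
print(df)
```

Because the ordinals are plain integers, the gap between 2014-01-09 and 2014-01-14 is kept as a difference of 5, which is what a regression over time needs.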
Make a plot with the ordinals on the date axis
Replace the ordinal X-axis labels with nice, readable dates
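The plotting and relabelling steps could look roughly like this. It is a sketch rather than a definitive recipe: it assumes seaborn's regplot for the guide line, puts one tick on each data point (a simplification chosen here), and reuses the hypothetical date_ordinal column name:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Sample data copied from the question; "date" is assumed to be the index.
df = pd.DataFrame(
    {"amount": [1, 1, 4, 1, 1]},
    index=pd.DatetimeIndex(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"],
        name="date",
    ),
)
df["date_ordinal"] = [d.toordinal() for d in df.index]

# Scatter plot plus fitted regression line, with the integer ordinals on x.
fig, ax = plt.subplots()
sns.regplot(data=df, x="date_ordinal", y="amount", ax=ax)

# Put a tick on each observation and label it with the readable date.
ax.set_xticks(df["date_ordinal"].to_numpy())
ax.set_xticklabels(df.index.strftime("%Y-%m-%d").tolist())
ax.tick_params(axis="x", rotation=45)
ax.set_xlim(df["date_ordinal"].min() - 1, df["date_ordinal"].max() + 1)
ax.set_xlabel("date")
fig.tight_layout()
fig.savefig("regplot_ordinal.png")
```

Setting the tick positions before the labels keeps each label attached to the date it describes, even if the axis limits change later.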
ta-daa!
Since Seaborn has trouble with dates, I'm going to create a work-around. First, I'll make the date column my index:
Second, convert the index to pd.DatetimeIndex:
And replace the original with it:
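A sketch of these first two steps, assuming the dates start out as strings in an ordinary column named date (the question does not show how the frame was built):

```python
import pandas as pd

# Assumption: dates arrive as strings in a regular "date" column.
df = pd.DataFrame(
    {
        "date": ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"],
        "amount": [1, 1, 4, 1, 1],
    }
)

# Step 1: make the date column the index.
df = df.set_index("date")

# Step 2: parse the string index into a DatetimeIndex and replace the original.
df.index = pd.DatetimeIndex(df.index)
print(df.index)
```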
Third, reindex with the new index (idx):
This will produce a new dataframe with NaN values for the dates you don't have data for:
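The reindexing step might be sketched as follows, assuming daily frequency between the first and last observed dates. The name idx follows the text; everything else is the question's sample data:

```python
import pandas as pd

# Sample data from the question with a DatetimeIndex already in place.
df = pd.DataFrame(
    {"amount": [1, 1, 4, 1, 1]},
    index=pd.DatetimeIndex(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"]
    ),
)

# Build a complete daily range and reindex onto it; missing days become NaN.
idx = pd.date_range(df.index.min(), df.index.max())
df = df.reindex(idx)
print(df)
```

The amount column becomes floating point after this step, because NaN cannot be stored in an integer column.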
Fourth, since Seaborn doesn't play nice with dates and regression lines, I'll create a row count column that we can use as our x-axis:
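One simple way to build such a column is a running integer per row. This sketch assumes the frame has already been reindexed to one row per day, and starting the count at 0 is an arbitrary choice:

```python
import numpy as np
import pandas as pd

# Sample data from the question, reindexed to one row per calendar day.
df = pd.DataFrame(
    {"amount": [1, 1, 4, 1, 1]},
    index=pd.DatetimeIndex(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"]
    ),
)
df = df.reindex(pd.date_range(df.index.min(), df.index.max()))

# One integer per day, so row_count is a numeric stand-in for the date.
df["row_count"] = np.arange(len(df))
print(df)
```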
Fifth, we should now be able to plot a regression line using 'row_count' as our x variable and 'amount' as our y variable:
Sixth, if you would like the dates to be along the x-axis instead of the row_count, you can set the x-tick labels to the index:
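The fifth and sixth steps could be sketched together like this. It assumes seaborn's regplot, which skips the NaN rows by default, and it pins one tick to every row so the date labels line up with the data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Sample data from the question, reindexed to daily rows with a row counter.
df = pd.DataFrame(
    {"amount": [1, 1, 4, 1, 1]},
    index=pd.DatetimeIndex(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-14"]
    ),
)
df = df.reindex(pd.date_range(df.index.min(), df.index.max()))
df["row_count"] = np.arange(len(df))

# Step 5: regression plot against the numeric row counter.
fig, ax = plt.subplots()
sns.regplot(data=df, x="row_count", y="amount", ax=ax)

# Step 6: one tick per row, labelled with that row's date from the index.
ax.set_xticks(df["row_count"].to_numpy())
ax.set_xticklabels(df.index.strftime("%Y-%m-%d").tolist())
ax.tick_params(axis="x", rotation=45)
ax.set_xlabel("date")
fig.tight_layout()
fig.savefig("regplot_row_count.png")
```

Fixing the tick positions first matters: relabelling whatever ticks matplotlib picked automatically can attach dates to the wrong positions.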
Hope this helps!