Calculate a trendline when the x-axis uses dates

Posted 2019-04-13 03:17

Question:

The post on calculating trend lines on a scatter plot (How do I calculate a trendline for a graph?) is quite helpful, but I'm curious how one could go about finding a trend line on a graph where the x-axis is a DateTime field, rather than an integer. For example, consider the case of charting the number of subscribers to a mailing list over time:

Jan 1: 100 subscribers
Jan 2: 105 subscribers
Jan 5: 120 subscribers
Jan 10: 117 subscribers
etc...

The problem I'm running into is figuring out the 'run' (delta x) portion of this... since the intervals are not going to be evenly spaced, we can't just assume a single unit of time passing between each measurement. I've got a hunch that I'll have to work out some sort of scale, but I'm stuck there.

Can anyone explain how to calculate a trendline when the x-axis is a DateTime field? (If you post a code sample, C#, VB.NET, or Java would be most appreciated!)

Answer 1:

You need to convert the dates and times to a linear numeric scale before you can fit a line. The good news is that you get to pick this scale: count how many seconds, minutes, hours, or days have passed since the start of your plot, and use that number as the "run" (x) value.

In your example, we can go off of days:

Jan 1: 0 days, 100 subscribers
Jan 2: 1 day, 105 subscribers
Jan 5: 4 days, 120 subscribers
Jan 10: 9 days, 117 subscribers
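
As a rough illustration of the idea above, here is a minimal Java sketch that turns each date into "days since the first sample" and then fits an ordinary least-squares line. The class name `DateTrendline`, the method `fit`, and the year 2019 are assumptions made for the example (the question gives no year), and the first array element is assumed to be the earliest date.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DateTrendline {
    /** Returns {slope, intercept} of the least-squares line y = slope * x + intercept,
     *  where x is the number of days elapsed since dates[0]. */
    static double[] fit(LocalDate[] dates, double[] values) {
        LocalDate start = dates[0]; // assumption: the first sample is the earliest
        int n = dates.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            double x = ChronoUnit.DAYS.between(start, dates[i]); // the "run" value
            double y = values[i];
            sumX += x;
            sumY += y;
            sumXY += x * y;
            sumXX += x * x;
        }
        double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double intercept = (sumY - slope * sumX) / n;
        return new double[] {slope, intercept};
    }

    public static void main(String[] args) {
        LocalDate[] dates = {
            LocalDate.of(2019, 1, 1), LocalDate.of(2019, 1, 2),
            LocalDate.of(2019, 1, 5), LocalDate.of(2019, 1, 10)
        };
        double[] subscribers = {100, 105, 120, 117};
        double[] line = fit(dates, subscribers);
        System.out.println("slope per day = " + line[0] + ", intercept = " + line[1]);
    }
}
```

For the four samples in the question this works out to a slope of roughly 1.86 subscribers per day and an intercept of about 104. The uneven spacing needs no special handling, because each point carries its own x value.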



Answer 2:

You can always convert the date to an integer, for example the number of seconds since the Unix epoch.

Here is an example found on the web (C#):

// Seconds elapsed between the Unix epoch (1970-01-01 00:00:00) and the given date
DateTime given = new DateTime(2008, 7, 31, 10, 0, 0);
TimeSpan t = given.Subtract(new DateTime(1970, 1, 1, 0, 0, 0, 0));
long unixTime = (long) t.TotalSeconds;
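
Since the question also mentions Java, a rough equivalent might look like the sketch below. It uses `java.time` and assumes the timestamp should be interpreted as UTC; the class and method names are made up for the example.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class EpochSeconds {
    /** Seconds since 1970-01-01T00:00:00, treating the value as UTC (an assumption). */
    static long toUnixSeconds(LocalDateTime t) {
        return t.toEpochSecond(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime given = LocalDateTime.of(2008, 7, 31, 10, 0, 0);
        System.out.println(toUnixSeconds(given)); // prints 1217498400
    }
}
```

Epoch seconds are large numbers, so when feeding them into a line fit it may help to subtract the first sample's timestamp from every value. That keeps the x values small and leaves the slope unchanged.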


Answer 3:

Even though your samples are not evenly spaced, your trendline can still have a constant "run".

For example, you could choose 1-week intervals. In that case, take the average number of subscribers for Jan 1, 2, and 5 and plot a point on Jan 7. Next, take the average for Jan 10 and 13 and plot a point on Jan 14, and so on.

If you have a lot of historical data, use 1-month intervals; quarterly intervals might suit your data even better.
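
One way to sketch this bucketing in Java is shown below. The names `WeeklyAverage` and `average` are hypothetical, the year 2019 is an arbitrary assumption, and every sample is assumed to fall on or after the start date. Only the four samples from the question are used.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Map;
import java.util.TreeMap;

class WeeklyAverage {
    /** Maps a bucket index (0 = the first bucketDays days from start) to the
     *  mean of all samples that fall into that bucket. */
    static Map<Long, Double> average(LocalDate start, LocalDate[] dates,
                                     double[] values, int bucketDays) {
        Map<Long, double[]> acc = new TreeMap<>(); // bucket -> {sum, count}
        for (int i = 0; i < dates.length; i++) {
            long bucket = ChronoUnit.DAYS.between(start, dates[i]) / bucketDays;
            double[] sumCount = acc.computeIfAbsent(bucket, k -> new double[2]);
            sumCount[0] += values[i];
            sumCount[1] += 1;
        }
        Map<Long, Double> means = new TreeMap<>();
        acc.forEach((bucket, sumCount) -> means.put(bucket, sumCount[0] / sumCount[1]));
        return means;
    }

    public static void main(String[] args) {
        LocalDate[] dates = {
            LocalDate.of(2019, 1, 1), LocalDate.of(2019, 1, 2),
            LocalDate.of(2019, 1, 5), LocalDate.of(2019, 1, 10)
        };
        double[] subscribers = {100, 105, 120, 117};
        // Bucket 0 covers Jan 1 to Jan 7, bucket 1 covers Jan 8 to Jan 14.
        System.out.println(average(dates[0], dates, subscribers, 7));
    }
}
```

The bucket index then serves as an evenly spaced x value. Buckets with no samples simply do not appear in the map, so they would need to be skipped or filled in before plotting.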



Answer 4:

Specify an origin date and treat it as x = 1. Choose hours or 8-hour periods as the unit if you want finer resolution.

The first day:
Jan 1, 2006 -> x = 1
45 days later:
Feb 15, 2006 -> x = 46
2 years later:
Jan 1, 2008 -> x = 731
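
If a finer unit than whole days is wanted, the same mapping could be sketched in Java as follows. The `OriginScale` class and its `hoursPerUnit` parameter are hypothetical names for the example, and partial units are truncated rather than rounded.

```java
import java.time.Duration;
import java.time.LocalDateTime;

class OriginScale {
    /** Maps a timestamp to an x value, where the origin itself is x = 1 and
     *  each further step of hoursPerUnit hours adds 1 (partial units are dropped). */
    static long toX(LocalDateTime origin, LocalDateTime t, int hoursPerUnit) {
        return 1 + Duration.between(origin, t).toHours() / hoursPerUnit;
    }

    public static void main(String[] args) {
        LocalDateTime origin = LocalDateTime.of(2006, 1, 1, 0, 0);
        // 36 hours after the origin, counted in 8-hour periods: 1 + 36 / 8 = 5
        System.out.println(toX(origin, LocalDateTime.of(2006, 1, 2, 12, 0), 8));
    }
}
```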


Tags: trending