MySQL slope (trend) of single field (line of best

Posted 2019-07-21 21:31

Question:

I have a simple table called LOGENTRY with fields called "DATE" and "COST". Example:

+--------------+-------+
| DATE         | COST  |
+--------------+-------+
| MAY 1 2013   | 0.8   |
| SEP 1 2013   | 0.4   |
| NOV 1 2013   | 0.6   |
| DEC 1 2013   | 0.2   |
+--------------+-------+

I would like to find the slope of the COST field over time, for a selected range of rows. For the rows above the result should be SLOPE = -0.00216 (with DATE measured in days). This is equivalent to Excel's SLOPE function, i.e. simple linear regression.

Is there a simple way to SELECT the slope of COST? If I do the math in the calling language (PHP), I can find the slope as:

SLOPE =  (N * Sum_XY - Sum_X * Sum_Y)/(N * Sum_X2 - Sum_X * Sum_X);
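As a sanity check on the formula above, here is a minimal Python sketch (not the PHP from the question) that applies it to the four sample rows. It assumes X is the date expressed as a day count, which is how Excel treats dates; the `slope` function and `rows` list are names made up for this illustration.

```python
from datetime import date

# Sample rows copied from the LOGENTRY table in the question.
rows = [
    (date(2013, 5, 1), 0.8),
    (date(2013, 9, 1), 0.4),
    (date(2013, 11, 1), 0.6),
    (date(2013, 12, 1), 0.2),
]


def slope(points):
    """Least-squares slope of y over x, using the sums formula above."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x)


# Assumption: X is a day number (here the proleptic Gregorian ordinal).
# Any day count works, because the slope does not depend on the origin.
points = [(d.toordinal(), cost) for d, cost in rows]
print(round(slope(points), 5))
```

With days as the unit of X this should agree with the -0.00216 quoted above. A different time unit (seconds, for instance) scales the slope accordingly.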

I saw some similar questions posted, but they are more complex. I'm trying to strip this example down to the simplest situation so I can understand the answer :) Here's as close as I got, but MySQL complains about the syntax near: 'float)) AS Sum_X, SUM(CAST(LOGENTRY.DATE as float) * CAST(LOGENTRY.DATE'

SELECT 
  COUNT( * ) AS N, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) * CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X2, 
  SUM( LOGENTRY.COST ) AS Sum_Y, SUM( LOGENTRY.COST * LOGENTRY.COST ) AS Sum_Y2, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) * LOGENTRY.COST ) AS Sum_XY
FROM LOGENTRY

Answer 1:

It seems that MySQL cannot cast a date to FLOAT the way the other examples on Stack Overflow do. Perhaps those examples were written for another database. By converting the dates with UNIX_TIMESTAMP instead, I am able to get an answer, with the final calculation done in PHP. If this is WRONG, please post and I will remove this answer.

SELECT
  COUNT(*) AS N,
  SUM(UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X,
  SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X2,
  SUM(LOGENTRY.COST) AS Sum_Y,
  SUM(LOGENTRY.COST * LOGENTRY.COST) AS Sum_Y2,
  SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * LOGENTRY.COST) AS Sum_XY
FROM LOGENTRY
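One thing to keep in mind with this query: UNIX_TIMESTAMP returns seconds, so the slope computed from these sums is COST per second, not per day. The sketch below (Python rather than PHP, purely for illustration) shows the final step under a few assumptions: the sums are rebuilt locally from the sample rows instead of being read from MySQL, the dates are taken as midnight UTC, and `slope_from_sums` and `to_unix` are hypothetical helper names. Exact fractions are used because Sum_X2 is on the order of 1e18, where ordinary floating point can lose digits in the subtraction.

```python
from datetime import datetime, timezone
from fractions import Fraction

SECONDS_PER_DAY = 86400


def slope_from_sums(n, sum_x, sum_x2, sum_y, sum_xy):
    """Slope from the aggregate columns N, Sum_X, Sum_X2, Sum_Y, Sum_XY."""
    # Convert everything to exact fractions before subtracting large values.
    n, sum_x, sum_x2, sum_y, sum_xy = (
        Fraction(v) for v in (n, sum_x, sum_x2, sum_y, sum_xy)
    )
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x)


def to_unix(day):
    """Midnight UTC of an ISO date as Unix seconds (assumes a UTC session)."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# Stand-in for the single row the query would return, built from the sample data.
rows = [("2013-05-01", "0.8"), ("2013-09-01", "0.4"),
        ("2013-11-01", "0.6"), ("2013-12-01", "0.2")]
xs = [to_unix(day) for day, _ in rows]
ys = [Fraction(cost) for _, cost in rows]
sums = {
    "n": len(rows),
    "sum_x": sum(xs),
    "sum_x2": sum(x * x for x in xs),
    "sum_y": sum(ys),
    "sum_xy": sum(x * y for x, y in zip(xs, ys)),
}

per_second = slope_from_sums(**sums)
per_day = float(per_second * SECONDS_PER_DAY)  # rescale to COST per day
print(round(per_day, 5))
```

After rescaling by 86400 the value should match the per-day slope from the question. If the MySQL session time zone observes daylight saving time, the timestamps can shift by an hour, so the result may differ slightly from a pure day-count calculation.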