Linear Regression calculation several times in one

I am using R to evaluate climate data and I have a data set that looks like the following miniaturized version... please forgive my crude posting etiquette, I hope this post is understandable.

[0][STA.NAME] [YEAR] [SUM.CDD]  
1 NAME1 1967    760  
2 NAME1 1985    800  
3 NAME1 1996    740  
4 NAME1 2003    810  
5 NAME1 2011    790  
6 NAME2 1967    700  
7 NAME2 1985    690  
8 NAME2 1996    850  
9 NAME2 2003    790  
10 NAME3    1967    760  
11 NAME3    1985    800  
12 NAME3    1990    740  
13 NAME3    1996    810  
14 NAME3    2003    790  
15 NAME3    2011    800

I am trying to return a new DF with this

[STA.NAME] [Eq'n of trend]  
NAME1  (y = mx + b)  
NAME2  (y = mx + b)

etc...

Eventually I will need to calculate variance of the trends, as well as total variance of data and would like to eventually append those to this resulting data set for something like...

[STA.NAME] [TREND] [VAR.TREND] [VAR.DATA]   
with values in rows, 1 for each STA.NAME...

Any help is greatly appreciated, If there is a better way than lm(), with which I am currently stumped, I would be interested as well.

Thank you very much,

Jesse

标签： r statistics linear-regression

1条回答

smile是对你的礼貌

2楼-- · 2019-08-09 16:20

Here is a simple solution using ddply() from plyr to return the coefficients for each group:

First replicate the data:

x <- read.table(text="
STA.NAME YEAR SUM.CDD  
1 NAME1 1967    760  
2 NAME1 1985    800  
3 NAME1 1996    740  
4 NAME1 2003    810  
5 NAME1 2011    790  
6 NAME2 1967    700  
7 NAME2 1985    690  
8 NAME2 1996    850  
9 NAME2 2003    790  
10 NAME3    1967    760  
11 NAME3    1985    800  
12 NAME3    1990    740  
13 NAME3    1996    810  
14 NAME3    2003    790  
15 NAME3    2011    800  ", header=TRUE)

Now do the modelling:

library(plyr)
ddply(x, .(STA.NAME), function(z)coef(lm(SUM.CDD ~ YEAR, data=z)))

  STA.NAME (Intercept)      YEAR
1    NAME1   -444.8361 0.6147541
2    NAME2  -6339.2047 3.5702200
3    NAME3   -995.2381 0.8928571

Now, depending on what you want to do, it may be simpler (and perhaps more meaningful) to create a single model of your data:

fit <- lm(SUM.CDD ~ YEAR + STA.NAME, data=x)

Get a summary:

summary(fit)

Call:
lm(formula = SUM.CDD ~ YEAR + STA.NAME, data = x)

Residuals:
   Min     1Q Median     3Q    Max 
-63.57 -22.21  10.72  18.62  80.72 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)  
(Intercept)   -2065.6401  1463.5353  -1.411   0.1858  
YEAR              1.4282     0.7345   1.945   0.0778 .
STA.NAMENAME2   -15.8586    27.5835  -0.575   0.5769  
STA.NAMENAME3     3.9046    24.7089   0.158   0.8773  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 40.8 on 11 degrees of freedom
Multiple R-squared: 0.3056, Adjusted R-squared: 0.1162 
F-statistic: 1.614 on 3 and 11 DF,  p-value: 0.2424

Extract only the coefficients:

coef(fit)
  (Intercept)          YEAR STA.NAMENAME2 STA.NAMENAME3 
 -2065.640078      1.428247    -15.858650      3.904632

Finally, you perhaps wanted to fit a model with interaction terms. This model gives you effectively the same results as the original plyr solution. Depending on your data and your objectives, this might be the way to do it:

fit <- lm(SUM.CDD ~ YEAR * STA.NAME, data=x)
summary(fit)

Call:
lm(formula = SUM.CDD ~ YEAR * STA.NAME, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-57.682 -13.166  -1.012  23.006  63.046 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -444.8361  2280.7464  -0.195    0.850
YEAR                   0.6148     1.1447   0.537    0.604
STA.NAMENAME2      -5894.3687  3661.9795  -1.610    0.142
STA.NAMENAME3       -550.4020  3221.8390  -0.171    0.868
YEAR:STA.NAMENAME2     2.9555     1.8406   1.606    0.143
YEAR:STA.NAMENAME3     0.2781     1.6172   0.172    0.867

Residual standard error: 39.17 on 9 degrees of freedom
Multiple R-squared: 0.4763, Adjusted R-squared: 0.1854 
F-statistic: 1.637 on 5 and 9 DF,  p-value: 0.2451

0人赞添加讨论(0) 举报

Linear Regression calculation several times in one

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间