Regression on subset of data set

Posted 2019-07-18 13:51

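I would like to do the following and need some help:

Calculate the slope and intercept of "Height" over "Age" [lm(Height ~ Age)] separately for

(a) each individual

(b) each gender

and create a table containing the results (slope and intercept). Can I use "apply" for this?

In a next step, I would like to run a statistical test to determine whether the slopes and intercepts differ significantly between genders. I know how to do the test in R, but maybe there is a way to combine the slope/intercept calculation with the t-test.

Example data: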

example <- data.frame(Age = c(1, 3, 6, 9, 12,
                              1, 3, 6, 9, 12,
                              1, 3, 6, 9, 12,
                              1, 3, 6, 9, 12),
                      Individual = c("Jack", "Jack", "Jack", "Jack", "Jack",
                                     "Jill", "Jill", "Jill", "Jill", "Jill",
                                     "Tony", "Tony", "Tony", "Tony", "Tony",
                                     "Jen", "Jen", "Jen", "Jen", "Jen"),
                      Gender = c("M", "M", "M", "M", "M",
                                 "F", "F", "F", "F", "F",
                                 "M", "M", "M", "M", "M",
                                 "F", "F", "F", "F", "F"),
                      Height = c(38, 62, 92, 119, 165,
                                 31, 59, 87, 118, 170,
                                 45, 72, 93, 155, 171,
                                 33, 61, 92, 115, 168))

Answer 1:

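One way to fit a separate regression for each level and combine the slopes and intercepts in a single data frame is to use the function ddply() from the plyr library: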

library(plyr)

ddply(example,"Individual",function(x) coefficients(lm(Height~Age,x)))
  Individual (Intercept)      Age
1       Jack    26.29188 11.11421
2        Jen    22.10660 11.56345
3       Jill    18.33249 12.04315
4       Tony    33.02030 11.96447
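
Since the question asks about "apply": the same table can probably be built in base R without plyr, by splitting the data frame and fitting one model per piece. This is only a sketch. The helper name coef_by_group and the column names Group, Intercept and Slope are made up for illustration and are not part of any package.

```r
# Example data from the question, written compactly with rep()
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = rep(c("M", "F", "M", "F"), each = 5),
  Height     = c(38, 62, 92, 119, 165,
                 31, 59, 87, 118, 170,
                 45, 72, 93, 155, 171,
                 33, 61, 92, 115, 168)
)

# Hypothetical helper: fit Height ~ Age within each level of `group`
# and return one row of coefficients per level.
coef_by_group <- function(data, group) {
  pieces <- split(data, data[[group]])            # one data frame per level
  fits   <- lapply(pieces, function(d) coef(lm(Height ~ Age, data = d)))
  out    <- do.call(rbind, fits)                  # matrix, one row per level
  data.frame(Group     = rownames(out),
             Intercept = unname(out[, "(Intercept)"]),
             Slope     = unname(out[, "Age"]),
             row.names = NULL)
}

by_individual <- coef_by_group(example, "Individual")
print(by_individual)
```

Calling coef_by_group(example, "Gender") should give the per-gender table in the same way.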

ddply(example,"Gender",function(x) coefficients(lm(Height~Age,x)))
  Gender (Intercept)      Age
1      F    20.21954 11.80330
2      M    29.65609 11.53934
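
For the second part of the question (testing whether slope and intercept differ between genders), one common approach is to fit a single model with an Age by Gender interaction instead of two separate models. The sketch below assumes that all 20 rows can be treated as independent observations, which may not hold here because each individual is measured several times. The coefficient names GenderM and Age:GenderM come from R's default treatment contrasts with "F" as the reference level.

```r
# Example data from the question, written compactly with rep()
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = factor(rep(c("M", "F", "M", "F"), each = 5)),
  Height     = c(38, 62, 92, 119, 165,
                 31, 59, 87, 118, 170,
                 45, 72, 93, 155, 171,
                 33, 61, 92, 115, 168)
)

# Age * Gender expands to Age + Gender + Age:Gender
fit <- lm(Height ~ Age * Gender, data = example)

# Coefficient table with estimates, standard errors, t values and p values:
#   (Intercept)  intercept for the reference level "F"
#   Age          slope for "F"
#   GenderM      difference in intercept, M minus F
#   Age:GenderM  difference in slope, M minus F
coef_table <- summary(fit)$coefficients
print(coef_table)
```

The t-tests on the GenderM and Age:GenderM rows are then the tests for a difference in intercept and in slope. The estimates in those rows should equal the differences between the per-gender coefficients from the ddply() call above.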


Source: Regression on subset of data set