I have tried to apply this Q&A, "efficient looping logistic regression in R", to my own problem, but I cannot quite make it work. I haven't tried to use apply, but a few people told me that a for loop is the best choice here (if someone believes otherwise, please feel free to explain!). I think this problem is fairly generalizable and not too esoteric for the forum.
This is what I want to achieve: I have a dataset with 3 predictor variables (gender, age, race) and a dependent variable (a proportion) at 86 genetic positions for several people. I want to run a bivariate linear regression for each position against each predictor (so 86 regressions for each of the 3 predictors, 258 in total). Then I want to output the results in some easily legible format; my idea is a matrix with rows = gender, age, and race, and columns = the 86 positions. There would be one p-value for each row-by-column combination. Then I could pull out the p-values < 0.1 (or whatever threshold I want) to easily see which predictors are significantly associated with the proportion at each position.
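To make the target layout concrete, here is a small sketch with placeholder values. The names `pvals` and `pos1` ... `pos86` are made up for illustration, and the numbers are random draws, not results from my data.

```r
# Placeholder p-value matrix: rows are predictors, columns are positions.
# The values are random numbers standing in for real p-values.
set.seed(42)
pvals <- matrix(runif(3 * 86), nrow = 3, ncol = 86,
                dimnames = list(c("gender", "age", "race"),
                                paste0("pos", 1:86)))

# Row/column indices of every cell below the chosen threshold
hits <- which(pvals < 0.1, arr.ind = TRUE)

# The same hits as a readable table of predictor, position, and p-value
hit_table <- data.frame(predictor = rownames(pvals)[hits[, "row"]],
                        position  = colnames(pvals)[hits[, "col"]],
                        p         = pvals[hits])
head(hit_table)
```

With `arr.ind = TRUE`, `which()` returns one row per matching cell, which is what makes the lookup by threshold a one-liner.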
This is the code I have so far.
BB <- seq.csv[,6:91] #the data frame containing the 86 positions
AA <- seq.csv[,2:4] #the data frame containing the 3 predictor variables
linreg <- matrix(NA,3,86) #make a results matrix (3 rows, 86 columns) and fill it with NA
for (i in 1:86) #loop over each position variable
{
  for (j in 1:3) #for each position variable, loop over each predictor
  {
    linreg[i,j] <- lm(BB[,i]~AA[,j]) #bivariate linear regression
  }
}
No matter how I change this (for example, simplifying it to loop over the positions for only one predictor), I still get an error suggesting that my matrices are not the same length ("number of items to replace is not a multiple of replacement length"). In fact, length(linreg) = 258 (3*86), length(BB) = 86, and length(AA) = 3. I know the latter two are data frames, not matrices, but if I convert them to matrices I get an invalid type error ("invalid type (list) for variable 'BB[, i]'"). I do not know how to resolve this error because I just don't understand R well enough. I've consulted the books Applied Statistical Genetics with R and The Art of R Programming to no avail, and I've been searching Google all day. And I haven't even gotten to the code for outputting the results...
I'd appreciate any debugging tips or some suggestions on a better way to code this! Thank you all in advance.
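One possible direction, offered as a sketch rather than a definitive fix: lm() returns a whole fitted-model object (a list with many components), so it cannot be stored in a single matrix cell, which is a likely source of the "number of items to replace" error. The loop also indexes the 3 x 86 matrix as [i, j] with i running to 86, so the indices appear to be swapped. The version below stores one p-value per fit instead. It runs on simulated stand-in data because seq.csv is not available here; the name `pvals`, the column names, and the choice of the overall F-test p-value (so that a predictor like race with more than two levels still yields a single number) are all assumptions.

```r
# Simulated stand-in for seq.csv (hypothetical data, NOT the real dataset)
set.seed(1)
n  <- 50
AA <- data.frame(gender = factor(sample(c("F", "M"), n, replace = TRUE)),
                 age    = rnorm(n, mean = 40, sd = 10),
                 race   = factor(sample(c("A", "B", "C"), n, replace = TRUE)))
BB <- as.data.frame(matrix(runif(n * 86), nrow = n, ncol = 86))
names(BB) <- paste0("pos", 1:86)

# Results matrix: one row per predictor, one column per position
pvals <- matrix(NA_real_, nrow = ncol(AA), ncol = ncol(BB),
                dimnames = list(names(AA), names(BB)))

for (i in seq_along(BB)) {      # loop over positions
  for (j in seq_along(AA)) {    # loop over predictors
    fit <- lm(BB[[i]] ~ AA[[j]])
    # Overall F test of the model: a single p-value per fit,
    # even when the predictor is a factor with several levels
    f <- summary(fit)$fstatistic
    pvals[j, i] <- pf(f[["value"]], f[["numdf"]], f[["dendf"]],
                      lower.tail = FALSE)
  }
}

# Predictor/position pairs below the threshold
which(pvals < 0.1, arr.ind = TRUE)
```

Using `BB[[i]]` and `AA[[j]]` pulls each column out as a plain vector, which works the same whether the columns live in a data frame or a list. If a per-coefficient p-value is preferred over the overall F test, `coef(summary(fit))` holds the coefficient table, though a multi-level factor then contributes more than one row.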