I am trying to use the lme function from the nlme package inside a for loop. I have tried (almost) everything now, but without any luck. Without the loop my lme call works fine. I have 681 different lipids to analyse, so I need the loop.
Bonus info:
- I have checked with str(), and all my variables have the same length before the loop.
A simplified version of my data looks like this:
>dput(head("ex.lme(loop)"))
structure(list(Lacal.Patient.ID = c(12L, 12L, 12L, 13L, 13L,
13L), Time = c(0L, 1L, 3L, 0L, 1L, 3L), Remission = c(0L, 0L, 1L, 0L, 0L, 1L), Age = c(46L, 43L, 36L, 47L, 34L, 45L), SEX = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("f", "m"), class = "factor"), BMI = c(25L, 26L, 23L, 27L, 26L, 27L), Sph = c(0.412, 1.713,
1.48, 0.735, 1.025, 1.275), S1P = c(2.412, 3.713, 3.48, 2.735,
3.025, 3.275), Cer..C16. = c(1.4472, 2.2278, 2.088, 1.641,
1.815, 1.965)), .Names = c("Lacal.Patient.ID", "Time", "Remission", "Age",
"SEX", "BMI", "Sph", "S1P", "Cer..C16."), row.names = c(NA, 6L
), class = "data.frame")
Here is what I do in R:
library(nlme)
attach(cer_data)
Remission <- factor(Remission)
Time <- factor(Time)
SEX <- factor(SEX)
How I think the loop should look:
lipid <-as.matrix(cer_data[,c(7:9)]) # my lipids are at row 7-9 in my data
beg <- 1
end <- nrow(lipid)
dim(lipid)
for (i in beg:end) {
  print(paste("Running entity: ", colnames(lipid)[i], " which is ",i, " out of", end))
  variable <- as.numeric(lipid[i])
  lme_cer <- lme(variable ~ Remission + Time + Age + BMI + SEX, random = ~1|Lacal.Patient.ID, method = "REML", data = cer_data)
}
Error: Error in model.frame.default(formula = ~variable + Remission + Time + : variable lengths differ (found for 'Remission')
Without the loop my analysis works fine (lipid(x) is just one of the lipids):
lme_cer <- lme(lipid(x) ~ Remission + Time + Age + BMI + SEX , random = ~1 | Lacal.Patient.ID, method = "REML", data = cer_data)
summary(lme_cer)
Can anyone see the problem with my loop? I am not used to programming or using R, so there are probably some stupid mistakes.
A blind answer, assuming that your dependent variables are organized in columns and not in rows (as I think they are).
The main difference between my approach and your approach is that I loop over the names of the lipids rather than their position in the data set. This allows me (a) to construct a temporary data set in a less error-prone way, and (b) to construct a temporary formula for the fixed-effects part of your model.
The lme function is then applied to the temporary data set with the temporary formula, and the result is saved in a list for easier access.
# names of lipids (columns 7-9 in the sample data; widen the range to cover all lipid columns)
lipid.names <- colnames(cer_data)[7:9]
no.lipids <- length(lipid.names)
# create a named list to hold the fitted models
fitlist <- as.list(1:no.lipids)
names(fitlist) <- lipid.names
# loop over lipid names
for(i in lipid.names){
  # print status
  print(paste("Running entity:", i, "which is", which(lipid.names==i), "out of", no.lipids))
  # create temporary data set and model formula
  # (the ID column is spelled "Lacal.Patient.ID" in the data, so that spelling is used throughout)
  tmp <- cer_data[, c(i,"Remission","Time","Age","BMI","SEX","Lacal.Patient.ID")]
  fml <- as.formula( paste( i, "~", paste(c("Remission","Time","Age","BMI","SEX"), collapse="+") ) )
  # assign fit to list by name
  fitlist[[i]] <- lme(fml, random=~1|Lacal.Patient.ID, method="REML", data=tmp)
}
In my opinion it's easiest to work with temporary objects that exactly contain what is needed at that iteration of the loop.
Note that I cannot check this solution for errors because you haven't supplied a reproducible example.
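One way to try the name-based loop without the real data is to run it on a simulated stand-in. The sketch below is only that: the data frame sim, its values, and the helper names fixed.terms, ttables and time.p are made up for illustration, and only the column names follow the question. It builds the formula with reformulate() instead of paste(), which is an equivalent base R route, and then pulls the fixed-effects table out of every fit in the list.

```r
library(nlme)

# Simulated stand-in for cer_data: 20 patients x 3 time points.
# All values are hypothetical; only the column names follow the question.
set.seed(1)
n.pat <- 20
sim <- data.frame(
  Lacal.Patient.ID = rep(seq_len(n.pat), each = 3),
  Time             = rep(c(0, 1, 3), times = n.pat),
  Age              = rep(sample(30:60, n.pat, replace = TRUE), each = 3),
  BMI              = rep(round(rnorm(n.pat, 25, 2)), each = 3),
  SEX              = factor(rep(rep(c("f", "m"), length.out = n.pat), each = 3))
)
sim$Remission <- factor(rbinom(nrow(sim), 1, 0.4))

# Three fake lipids sharing a patient-level random intercept
lipid.names <- c("Sph", "S1P", "Cer..C16.")
pat.eff <- rep(rnorm(n.pat, 0, 0.5), each = 3)
for (nm in lipid.names) {
  sim[[nm]] <- 2 + 0.1 * sim$Time + pat.eff + rnorm(nrow(sim), 0, 0.3)
}

# Fit one model per lipid, looping over column names
fixed.terms <- c("Remission", "Time", "Age", "BMI", "SEX")
fitlist <- setNames(vector("list", length(lipid.names)), lipid.names)
for (nm in lipid.names) {
  fml <- reformulate(fixed.terms, response = nm)  # e.g. Sph ~ Remission + Time + ...
  fitlist[[nm]] <- lme(fml, random = ~ 1 | Lacal.Patient.ID,
                       method = "REML", data = sim)
}

# Collect the fixed-effects tables, then one p-value per lipid
ttables <- lapply(fitlist, function(fit) summary(fit)$tTable)
time.p  <- sapply(ttables, function(tt) tt["Time", "p-value"])
print(time.p)
```

Keeping the fits in a named list means a single model can be inspected later with, for example, summary(fitlist[["S1P"]]), and with 681 lipids the sapply() step gives a compact vector of p-values that can be sorted or adjusted for multiple testing.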
Solution: My loop is working now with this simple code:
lipid <- as.data.frame(cer_data[, c(7:9)])
dim(lipid)
for (i in 1:length(lipid)) {
  variable <- lipid[, i]
  lme_cer <- lme(variable ~ factor(Remission) + Time + Age + BMI + SEX, random = ~1 | Lacal.Patient.ID, method = "REML", data = cer_data)
  print(summary(lme_cer)$tTable)
}
Thank you all for the amazing help!
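For later readers, a likely reason the first loop failed (judging only from the code in the question) is the indexing: on a matrix, lipid[i] picks a single element, so variable had length 1 while Remission had one value per row, hence "variable lengths differ". The first loop also ran from 1 to nrow(lipid), which counts observations rather than lipids. The small check below uses the three lipid columns from the sample data; the names lipid.mat and lipid.df are made up for the illustration.

```r
# The three lipid columns from the question's sample data
cer_lipids <- data.frame(
  Sph       = c(0.412, 1.713, 1.48, 0.735, 1.025, 1.275),
  S1P       = c(2.412, 3.713, 3.48, 2.735, 3.025, 3.275),
  Cer..C16. = c(1.4472, 2.2278, 2.088, 1.641, 1.815, 1.965)
)
lipid.mat <- as.matrix(cer_lipids)  # what the first loop used
lipid.df  <- cer_lipids             # what the working loop used

len.elem   <- length(lipid.mat[1])    # one element of the matrix
len.col    <- length(lipid.mat[, 1])  # a whole column of the matrix
len.col.df <- length(lipid.df[, 1])   # a whole column of the data frame

n.obs      <- nrow(lipid.mat)   # number of observations (rows)
n.lipids   <- ncol(lipid.mat)   # number of lipids (columns)
n.cols.df  <- length(lipid.df)  # length() of a data frame counts its columns

print(c(len.elem, len.col, len.col.df))  # 1 6 6
print(c(n.obs, n.lipids, n.cols.df))     # 6 3 3
```

So lipid[, i] together with a loop over the columns (1:length(lipid) for a data frame, or 1:ncol(lipid) for a matrix) gives a response of the same length as the other variables.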
Without knowing your data, conceptually it should be something like this:
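df <- data.frame(lipid = rep(c(LETTERS[1:4]), each = 4), x1 = c(rnorm(16, 10, 1)), x2 = c(rnorm(16, 20, 5) ))
df
# unique() works whether lipid is a character or a factor column
# (levels() returns NULL for character columns, the default since R 4.0)
for (i in unique(df$lipid)){
  print(paste("MODEL", i, sep = ""))
  df1 = subset(df, lipid == i)
  model <- lm(x1~x2, data = df1 )
  print(summary(model)$coef)
}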