I would like to know how I can come up with an lm formula syntax that would enable me to use paste together with cbind for multiple multivariate regression.
Example
In my model I have a set of variables, which corresponds to the primitive example below:
data(mtcars)
depVars <- paste("mpg", "disp")
indepVars <- paste("qsec", "wt", "drat")
Problem
I would like to create a model with my depVars and indepVars. The model, typed by hand, would look like this:
modExample <- lm(formula = cbind(mpg, disp) ~ qsec + wt + drat, data = mtcars)
I'm interested in generating the same formula without referring to variable names and only using the depVars and indepVars vectors defined above.
Attempt 1
For example, what I had in mind would correspond to:
mod1 <- lm(formula = formula(paste(cbind(paste(depVars, collapse = ",")), " ~ ",
indepVars)), data = mtcars)
Attempt 2
I tried this as well:
mod2 <- lm(formula = formula(cbind(depVars), paste(" ~ ",
paste(indepVars,
collapse = " + "))),
data = mtcars)
Side notes
- I found a number of good examples on how to use paste with formula, but I would like to know how I can combine it with cbind.
- This is mostly a syntax question; in my real data I have a number of variables I would like to introduce to the model, and making use of the previously generated vectors is more parsimonious and makes the code more presentable. In effect, I'm only interested in creating a formula object that would contain cbind with variable names corresponding to one vector and the remaining variables corresponding to another vector.
- In a word, I want to arrive at the formula in modExample without having to type variable names.
All the solutions below use these definitions:
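A minimal sketch of such definitions, assuming the names are meant to be stored one per element with c() rather than joined with paste(). The name pasted is hypothetical and only illustrates the difference.

```r
# Response and predictor names as character vectors, one name per element.
depVars <- c("mpg", "disp")
indepVars <- c("qsec", "wt", "drat")

# paste() with several arguments glues them into a single string,
# so it does not produce a vector of variable names.
pasted <- paste("mpg", "disp")
length(pasted)
```

With one name per element, each vector can later be collapsed with whatever separator the formula needs.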
1) character string formula. Create a character string representing the formula and then run lm using do.call. Note that the formula shown in the Call line of the output is written out in full rather than appearing as a variable name.
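A minimal sketch of this approach, assuming depVars and indepVars are character vectors with one variable name per element. The use of sprintf and toString to assemble the string, and the name fit1, are choices made for this illustration.

```r
data(mtcars)
depVars <- c("mpg", "disp")
indepVars <- c("qsec", "wt", "drat")

# Assemble the formula as text: the responses go inside cbind(...),
# the predictors are joined with " + ".
fo <- sprintf("cbind(%s) ~ %s",
              toString(depVars),
              paste(indepVars, collapse = " + "))

# do.call() evaluates fo before building the call, so the stored call
# holds the formula text itself; quote(mtcars) keeps the data argument
# as the name mtcars instead of the whole data frame.
fit1 <- do.call("lm", list(fo, quote(mtcars)))
fit1$call
```

Calling lm(fo, data = mtcars) directly should fit the same model, but the stored call would then show fo instead of the formula text.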
1a) This would also work:
giving:
2) reformulate. @akrun and @Konrad, in comments below the question, suggest using reformulate. This approach produces a "formula" object, whereas the ones above produce a character string as the formula. (If this were desired for the prior solutions above, it would be possible using fo <- formula(fo).) Note that it is important that the response argument to reformulate be a call object and not a character string, or else reformulate will interpret the character string as the name of a single variable.
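A minimal sketch of the reformulate approach, again assuming depVars and indepVars are character vectors of names. Building the response with as.call is one possible way to obtain a call object and is an assumption of this example; resp and fit2 are hypothetical names.

```r
data(mtcars)
depVars <- c("mpg", "disp")
indepVars <- c("qsec", "wt", "drat")

# Build the unevaluated call cbind(mpg, disp) from the names:
# the first element is the function name, the rest are its arguments.
resp <- as.call(lapply(c("cbind", depVars), as.name))

# reformulate() joins the term labels with "+" and attaches the response,
# returning an object of class "formula".
fo <- reformulate(indepVars, response = resp)

fit2 <- do.call("lm", list(fo, quote(mtcars)))
coef(fit2)
```

Because resp is already a call, reformulate does not have to interpret any text on the response side.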
3) lm.fit. Another way that does not use a formula at all is:
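A minimal sketch of the formula-free route, assuming depVars and indepVars are character vectors of column names. The names m, X and fit3 are hypothetical, and labelling the intercept column "(Intercept)" is only a convenience so the coefficient rows are named the way lm names them.

```r
data(mtcars)
depVars <- c("mpg", "disp")
indepVars <- c("qsec", "wt", "drat")

m <- as.matrix(mtcars)

# lm.fit() takes the design matrix and the response matrix directly,
# so the intercept column has to be added by hand.
X <- cbind("(Intercept)" = 1, m[, indepVars])
fit3 <- lm.fit(X, m[, depVars])

# One column of coefficients per response variable.
fit3$coefficients
```

The result is a plain list rather than an "lm" object, so methods such as summary() are not available for it.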
The output is a list with these components: coefficients, residuals, effects, rank, fitted.values, assign, qr and df.residual.
Think it works.