To make clear what I'm asking I've created an easy example. Step one is to create some data:
gender <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),labels = c("male", "female"))
numberofdrugs <- rpois(84, 50) + 1
geneticvalue <- rpois(84,75)
death <- rpois(42,50) + 15
y <- data.frame(death, numberofdrugs, geneticvalue, gender)
So this is some random data merged into one data.frame. From this data I'd like to plot a cloud in which I can distinguish between the males and females, and to which I add two simple regressions (one for females and one for males). I've made a start, but I couldn't get to where I want to be. Here is what I've done so far:
require(lattice)
cloud(y$death~y$numberofdrugs*geneticvalue)
xmale <- subset(y, gender=="male")
xfemale <- subset(y, gender=="female")
death.lm.male <- lm(death~numberofdrugs+geneticvalue, data=xmale)
death.lm.female <- lm(death~numberofdrugs+geneticvalue, data=xfemale)
How can I make different points for males or females when using the cloud command (for example blue and pink points instead of just blue crosses) and how can I add the two estimated models to the cloud graph?
Any thought is appreciated! Thanks for your ideas!
Answer to the first half of your question, "How can I make different points for males or females when using the cloud command (for example blue and pink points instead of just blue crosses)?"
cloud( death ~ numberofdrugs*geneticvalue , groups=gender, data=y )
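To get the blue and pink points mentioned in the question, one option is to override lattice's group symbols through par.settings, so that the legend drawn by auto.key uses the same symbols as the points. This is only a minimal sketch: the data are rebuilt so the block runs on its own, and the seed, the use of 84 values for death, and the color "deeppink" are my assumptions rather than anything from the question.

```r
library(lattice)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
gender <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                 labels = c("male", "female"))
y <- data.frame(death         = rpois(84, 50) + 15,  # assumption: 84 draws, not 42
                numberofdrugs = rpois(84, 50) + 1,
                geneticvalue  = rpois(84, 75),
                gender        = gender)

# superpose.symbol controls how each level of `groups` is drawn;
# setting it via par.settings keeps the points and the legend in sync
p <- cloud(death ~ numberofdrugs * geneticvalue, data = y, groups = gender,
           par.settings = list(superpose.symbol =
                                 list(col = c("blue", "deeppink"), pch = 16)),
           auto.key = list(columns = 2))
print(p)
```

The colors are matched to the factor levels in order, so "male" gets the first entry and "female" the second.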
The meta-answer to this may involve some non-3d visualization. Perhaps you can use lattice or ggplot2 to split the data into small multiples? It will likely be more comprehensible and likely easier to add the regression results.
splom( ~ data.frame( death, numberofdrugs, geneticvalue ), groups=gender, data=y )
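splom draws the matrix with the superpanel function panel.pairs, which calls a panel function (panel.splom by default) for each pair of variables, and you could likely supply your own panel function to add a regression line without an enormous amount of trouble.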
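A rough sketch of that idea, assuming the question's data layout: panel.superpose splits each sub-panel by group, and a custom panel.groups function draws the points and then a least-squares line for that group with panel.lmline. The names dat and num_cols are mine, and the data are rebuilt here only so the block is self-contained.

```r
library(lattice)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
dat <- data.frame(
  death         = rpois(84, 50) + 15,
  numberofdrugs = rpois(84, 50) + 1,
  geneticvalue  = rpois(84, 75),
  gender        = factor(rep(c("male", "female"), c(43, 41)),
                         levels = c("male", "female"))
)
num_cols <- c("death", "numberofdrugs", "geneticvalue")  # hypothetical helper

p_splom <- splom(~ dat[num_cols], groups = gender, data = dat,
                 panel = panel.superpose,
                 panel.groups = function(x, y, ...) {
                   panel.xyplot(x, y, ...)  # points for one gender
                   panel.lmline(x, y, ...)  # simple regression line for that gender
                 },
                 auto.key = list(columns = 2))
print(p_splom)
```

Each off-diagonal cell then shows two lines, one per gender. These are pairwise simple regressions, not the two-predictor models from the question.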
ggplot2 does regressions within the plot matrix easily, but I can't get the colors to work.
library(ggplot2)
# note: plotmatrix() was deprecated in later ggplot2 releases
pm <- plotmatrix( y[ , 1:3], mapping = aes(color=death) )
pm + geom_smooth(method="lm")
And finally, if you really want a cloud plot with a regression plane, here's a way to do it using the scatterplot3d package. Note that I changed the data to give it a slightly more interesting structure:
numberofdrugs <- rpois( 84, 50 ) + 1
geneticvalue <- numberofdrugs + rpois( 84, 75 )
death <- geneticvalue + rpois( 42, 50 ) + 15
y <- data.frame( death, numberofdrugs, geneticvalue, gender )
library(scatterplot3d)
pts <- as.numeric( as.factor(y$gender) ) + 4
# the response (death) goes on the z axis, because plane3d(fit) draws z ~ x + y
s <- scatterplot3d( y$numberofdrugs, y$geneticvalue, y$death, pch=pts, type="p", highlight.3d=TRUE )
fit <- lm( death ~ numberofdrugs + geneticvalue, data=y )
s$plane3d(fit)
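The call above draws one plane for all the data. To get one plane per gender, as the question asks, a possible sketch is to fit the model separately on each subset and call plane3d once per fit. The names dat, fits, and cols are mine, the colors are arbitrary, and I am assuming that plane3d passes col through to the line drawing. The plotting part is skipped if scatterplot3d is not installed.

```r
set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
gender        <- factor(rep(c("male", "female"), c(43, 41)),
                        levels = c("male", "female"))
numberofdrugs <- rpois(84, 50) + 1
geneticvalue  <- numberofdrugs + rpois(84, 75)
death         <- geneticvalue + rpois(84, 50) + 15
dat <- data.frame(death, numberofdrugs, geneticvalue, gender)

# one model per gender, the same formula as in the question
fits <- lapply(split(dat, dat$gender),
               function(d) lm(death ~ numberofdrugs + geneticvalue, data = d))
cols <- c(male = "blue", female = "deeppink")

if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  # death sits on the vertical (z) axis so each plane matches death ~ x + y
  s <- scatterplot3d::scatterplot3d(
    dat$numberofdrugs, dat$geneticvalue, dat$death,
    color = unname(cols[as.character(dat$gender)]), pch = 16,
    xlab = "numberofdrugs", ylab = "geneticvalue", zlab = "death")
  for (g in names(fits)) {
    s$plane3d(fits[[g]], col = cols[[g]])  # plane for this gender's model
  }
}
```

With simulated data like this, both genders come from the same distribution, so the two planes will nearly coincide.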
The car package offers a nice visualization of the fit, built on the rgl package (an OpenGL implementation):
require(car)
require(rgl)
scatter3d(death~numberofdrugs+geneticvalue, groups=y$gender, data=y, parallel=FALSE)