I have a dataset in which a value (mean) may or may not fall within an interval given by lower.bound and upper.bound. I would like to plot this using lattice and have achieved something really nice, but there are still three things missing that I don't know how to tackle (I am relatively new to lattice).
df <- read.table("http://pastebin.com/raw.php?i=FQh6F12t")
require(lattice)
lattice.options(default.theme = standard.theme(color = FALSE))
head(df)
## code topic problem mean lower.bound upper.bound consistent
## 7 A04C coke MP 99.5 36.45 95.95 0
## 8 A04C coke MT 47.5 22.78 100.00 1
## 11 A04C girl MP 50.0 4.75 9.75 0
## 12 A04C girl MT 99.5 20.00 100.00 1
## 23 A14G coke MP 88.5 21.25 66.75 0
## 24 A14G coke MT 82.5 48.36 100.00 1
dotplot(lower.bound + mean + upper.bound ~ code | problem * topic,
data = df, pch = c(6, 3, 2), scales = list(x = list(draw = FALSE)),
as.table = TRUE)
This produces:
The down-arrows/triangles indicate the lower bound, the up-arrows/triangles indicate the upper bound, and the + marks the mean. I would like to add the following things to the plot but have no idea how (besides, obviously, customizing the panel function):
1. Conditional pch based on whether or not a mean value is inside the interval. The variable consistent indicates this (0 = outside the interval). pch should be 1 for values inside and 3 for values outside the interval. (pch for the lower and upper bounds should remain unchanged.)
2. Marking the interval. I would like to draw a thicker line between the lower.bound and upper.bound at each x-axis tick.
3. Add the proportion of values outside the interval to the panel headers (e.g., MP; 58.6% to the panel in the upper left corner).
For 1 and 2, my problem is obviously that I don't know how to write a custom panel function when there are multiple y variables (i.e., how to make the panel function conditional on them), and I couldn't find anything on it.
For 3, the proportion of values outside the interval is given by something like:
1 - with(df, tapply(consistent, list(topic, problem), mean))
## MP MT
## coke 0.5862 0.1724
## girl 0.8276 0.1724
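As a rough sketch of how those proportions might be turned into header text such as MP; 58.6%, here is the same tapply() call applied to a small made-up data frame. The names toy, prop_out and labels are hypothetical and the numbers are invented; getting the resulting strings into the lattice strips would still need a custom strip function, which is not shown.

```r
# Made-up stand-in for the real data (values invented for illustration).
toy <- data.frame(
  topic      = c("coke", "coke", "coke", "coke", "girl", "girl", "girl", "girl"),
  problem    = c("MP", "MP", "MT", "MT", "MP", "MP", "MT", "MT"),
  consistent = c(0, 1, 1, 1, 0, 0, 1, 0)
)

# Proportion of values outside the interval for each topic x problem cell.
prop_out <- 1 - with(toy, tapply(consistent, list(topic, problem), mean))

# One label per cell: "<problem>; <percent outside>%".
# col(prop_out) gives the column index of every cell, so each cell
# picks up the name of its own problem level.
labels <- matrix(
  sprintf("%s; %.1f%%", colnames(prop_out)[col(prop_out)], 100 * prop_out),
  nrow = nrow(prop_out),
  dimnames = dimnames(prop_out)
)
print(labels)
```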
If the answer also included a nice ordering of levels on the x-axis, that would definitely be a plus. The order can change in every panel (i.e., even in panels above each other the same x-axis tick can correspond to a different level of code). But this is not important.
Well, this isn't real pretty, but it should get the real job (showing you how to get this kind of plot working) done.
The basic idea is to rewrite the formula so that it doesn't have a bunch of names on its LHS (i.e. lower.bound + mean + upper.bound). That syntax is equivalent to specifying a groups= term, which ends up triggering panel.superpose(), which is kind of a pain to customize in the way you want.
Instead, I just include mean on the LHS, and then use subscripts inside of the custom panel function to pick out in each case the matching elements of upper.bound and lower.bound.
I'm hoping the rest is pretty self explanatory:
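To make the subscripts mechanism concrete, here is a minimal sketch on a made-up data frame (toy and its values are invented, and it only conditions on problem, so this is an illustration of the idea rather than the plot for the real data). Because the panel function has a subscripts argument, lattice hands it the row indices belonging to each panel, and those are used to look up the matching bounds and the consistent flag.

```r
library(lattice)

# Made-up data, not the poster's; column names mirror the question.
toy <- data.frame(
  code        = factor(c("a", "b", "c", "a", "b", "c")),
  problem     = factor(rep(c("MP", "MT"), each = 3)),
  mean        = c(99.5, 50, 70, 47.5, 30, 82.5),
  lower.bound = c(36, 40, 20, 22, 50, 48),
  upper.bound = c(96, 60, 65, 100, 90, 100)
)
# 1 = mean lies inside [lower.bound, upper.bound], 0 = outside.
toy$consistent <- with(toy, as.integer(mean >= lower.bound & mean <= upper.bound))

p <- dotplot(mean ~ code | problem, data = toy,
  ylim = c(0, 105),  # fixed so the bounds are not clipped
  panel = function(x, y, subscripts, ...) {
    xx <- as.numeric(x)                 # x positions of the code levels
    lo <- toy$lower.bound[subscripts]   # rows shown in this panel
    up <- toy$upper.bound[subscripts]
    inside <- toy$consistent[subscripts] == 1
    # Thick line marking the interval at each x position.
    panel.segments(xx, lo, xx, up, lwd = 3, col = "grey50")
    # Bounds keep their original symbols.
    panel.points(xx, lo, pch = 6, col = "black")
    panel.points(xx, up, pch = 2, col = "black")
    # Mean: pch 1 if inside the interval, pch 3 if outside.
    panel.points(xx, y, pch = ifelse(inside, 1, 3), col = "black")
  })
print(p)
```

The panel function reads the extra columns straight from toy rather than receiving them as arguments; passing them through the dotplot() call would work as well, this just keeps the sketch short.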