Fine tuning a dotplot in R's lattice package

Posted 2019-03-31 04:22

I am trying to plot a bunch of ROC areas for different datasets and different algorithms. I have three variables: "Scheme", which specifies the algorithm used; "Dataset", the dataset that the algorithm is being tested on; and "Area_under_ROC".

I am using the lattice library in R, with the following command:

dotplot(Scheme ~ Area_under_ROC | Dataset, data = simulationSummary, layout = c(4,6))

and this is what I get:

dotplot of Scheme vs. Area_under_ROC conditioned on Dataset

What I'd like to know is

  • How can I make the labels on the y-axis readable? Right now, they're all squeezed together.
  • How can I re-arrange the panel in such a way that the datasets marked with "100" form the last column, but the other columns stay the same?

I'd very much appreciate any comments or pointers. Many Thanks!

Tags: r lattice
2 answers
做个烂人
#2 · 2019-03-31 04:48

Some ideas:

  1. Use a smaller font size for the y-axis labels, e.g. scales=list(y=list(cex=.6)). An alternative would be to keep a uniform font size but split the output across several pages (this can be controlled with layout=), or, probably better, to show all data from the same dataset (A to F, hence 4 points for each algorithm) or from the same sample size (10 to 100, hence 6 points for each algorithm) in a single panel with a groups= option. I would personally create two factors, sample.size and dataset.type, for that.
  2. Relevel your factor Dataset so that the datasets you are interested in appear where layout will put them, or (better!) use index.cond to specify a particular arrangement for your 24 panels. E.g.,

    dfrm <- data.frame(algo=gl(11, 1, 11*24, labels=paste("algo", 1:11, sep="")), 
                       type=gl(24, 11, 11*24, labels=paste("type", 1:24, sep="")),
                       roc=runif(11*24))
    p <- dotplot(algo ~ roc | type, dfrm, layout=c(4,6), scales=list(y=list(cex=.4)))
    

    will arrange panels in sequential order, from bottom left to top right (type1 in bottom left panel, type24 in top right panel), while

    update(p, index.cond=list(24:1))
    

    will arrange panels in the reverse order. Just pass a list containing the desired ordering of the panels.
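To put the "100" datasets in the last column specifically, the permutation passed to index.cond can be computed from the factor levels. The following is only a sketch under assumptions: the dataset names (A_10, A_100, ...) and the data frame are made up for illustration, and it relies on the levels being grouped by dataset type in blocks of four, so that each block fills one row of the layout.

```r
library(lattice)

# Hypothetical data: six dataset types (A-F) crossed with four sample sizes.
dfrm <- expand.grid(algo = paste0("algo", 1:11),
                    size = c(10, 25, 50, 100),
                    type = LETTERS[1:6])
dfrm$Dataset <- factor(paste(dfrm$type, dfrm$size, sep = "_"))
set.seed(1)
dfrm$roc <- runif(nrow(dfrm))

# Default (alphabetical) levels: A_10, A_100, A_25, A_50, B_10, ...
lev <- levels(dfrm$Dataset)

# With layout = c(4, 6) each row holds four consecutive levels.
# Within each row, move the "_100" level to the end and keep the
# remaining three in their original relative order.
is100  <- grepl("_100$", lev)
row_id <- rep(1:6, each = 4)
perm   <- order(row_id, is100)

p  <- dotplot(algo ~ roc | Dataset, data = dfrm, layout = c(4, 6),
              scales = list(y = list(cex = .4)))
p2 <- update(p, index.cond = list(perm))
```

Printing p2 should then show the "_100" panels in the rightmost column. If your levels are not grouped by type like this, the row_id vector has to be built differently.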


Here is an example of what I have in mind with Point 1 and the use of two factors instead of one. Let us generate another artificial dataset:

dfrm <- data.frame(algo=gl(11, 1, 11*24, labels=paste("algo", 1:11, sep="")),
                   dataset=gl(6, 11, 11*24, labels=LETTERS[1:6]),
                   ssize=gl(4, 11*6, 11*24, labels=c(10,25,50,100)), 
                   roc=runif(11*24))
xtabs(~ dataset + ssize, dfrm)  # to check allocation of factor levels 
dotplot(algo ~ roc | dataset, data=dfrm, groups=ssize, type="l", 
        auto.key=list(space="top", columns=4, cex=.8, title="Sample size", 
                      cex.title=1, lines=TRUE, points=FALSE))

放荡不羁爱自由
#3 · 2019-03-31 04:49

In addition to chl's answer: after splitting Dataset into a type and a size, you could use the useOuterStrips function from the latticeExtra package.

To get more space for the labels, you could "transpose" the plot.

# prepare data:
simulationSummary$Dataset_type <- substr(simulationSummary$Dataset, 1, 5)
simulationSummary$Dataset_size <- substr(simulationSummary$Dataset, 6, 10)

# force the factor levels to get the proper order:
simulationSummary$Dataset_size <- factor(simulationSummary$Dataset_size,
    levels = c("10", "25", "50", "100"))

library(latticeExtra)
useOuterStrips(dotplot(
     Scheme ~ Area_under_ROC | Dataset_type*Dataset_size,
     data = simulationSummary,
     layout = c(4,6)
))

Dotplot

Or use a vertical dotplot:

useOuterStrips(dotplot(
     Area_under_ROC ~ Scheme | Dataset_size*Dataset_type,
     data = simulationSummary, horizontal=FALSE,
     layout = c(4,6), scales=list(x=list(rot=90))
))
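The substr() calls above only work if every dataset name has a type prefix of exactly five characters. If that is not the case, a pattern-based split may be more robust. This is a minimal sketch with made-up names; it assumes each name is a non-numeric type prefix followed directly by the sample size, and it would not work if the type part itself ended in a digit.

```r
# Hypothetical dataset names: a type prefix followed by the sample size.
dataset <- c("gauss10", "gauss25", "gauss50", "gauss100", "unif10", "unif100")

# Drop the trailing digits to get the type; keep only them to get the size.
dataset_type <- sub("[0-9]+$", "", dataset)
dataset_size <- sub("^.*[^0-9]([0-9]+)$", "\\1", dataset)

# Order the size levels numerically rather than alphabetically.
size_levels  <- as.character(sort(unique(as.numeric(dataset_size))))
dataset_size <- factor(dataset_size, levels = size_levels)
```

The resulting dataset_type and dataset_size vectors can be assigned to simulationSummary$Dataset_type and simulationSummary$Dataset_size in place of the substr() results.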
