Align barplot with boxplot in R

2019-02-23 20:11发布

问题:

I would like to plot a distribution of counts using the barplot function in R, and underlay it with a boxplot to include information on median, quartiles, and outliers. A not-too-elegant solution for this has been found for histogram and boxplots: http://rgraphgallery.blogspot.com/2013/04/rg-plotting-boxplot-and-histogram.html.

There are many places online where one can find the argument being made that numerical data should be plotted with histograms while categorical data should be plotted with bar plots. My data are numerical, and in fact on a ratio scale (as they are counts), but because they are discrete, I want columns with gaps, not columns that touch, which seems to be the only option for histogram().

I currently have the following, but bar- and boxplot do not align quite perfectly:

set.seed(476372)
counts1 <- rpois(10000,3)
nf <- layout(mat = matrix(c(1,2),2,1, byrow=TRUE),  height = c(3,1))
par(mar=c(3.1, 3.1, 1.1, 2.1))
barplot(prop.table(table(counts1)))
boxplot(counts1, horizontal=TRUE,  outline=TRUE,ylim=c(0,12), frame=F, width = 10)

Here my question: How can I make them align?

回答1:

Another option that's similar but a little more work. This preserves the option for gaps between the bars:

tbl <- prop.table(table(counts1))
left <- -0.4 + do.call('seq', as.list(range(counts1)))
right <- left + (2 * 0.4)
bottom <- rep(0, length(left))
top <- tbl
xlim <- c(-0.5, 0.5) + range(counts1)

nf <- layout(mat = matrix(c(1,2),2,1, byrow=TRUE),  height = c(3,1))
par(mar=c(3.1, 3.1, 1.1, 2.1))
plot(NA, xlim=xlim, ylim=c(0, max(tbl)))
rect(left, bottom, right, top, col='gray')
boxplot(counts1, horizontal=TRUE,  outline=TRUE, ylim=xlim, frame=F, width = 10)



回答2:

Maybe using a "fake" histogram at the end

ht=hist(counts1,breaks=12,plot = F)
ht$counts=as.numeric(table(counts1))
ht$density=as.numeric(prop.table(table(counts1)))
ht$breaks=as.numeric(names(table(counts1)))
ht$mids=sapply(1:(length(ht$breaks)-1),function(z)mean(ht$breaks[z:(z+1)]))

plot(ht,freq=F,col=3,main="")
boxplot(counts1, horizontal=TRUE,outline=TRUE,ylim=range(ht$breaks), frame=F, col="green1", width = 10)