likert plot showing percentage values

2020-02-15 03:50发布

问题:

Below, I have R code that plots a likert plot.

set.seed(1234)
library(e1071)
probs <- cbind(c(.4,.2/3,.2/3,.2/3,.4),c(.1/4,.1/4,.9,.1/4,.1/4),c(.2,.2,.2,.2,.2))
my.n <- 100
my.len <- ncol(probs)*my.n
raw <- matrix(NA,nrow=my.len,ncol=2)
raw <- NULL
for(i in 1:ncol(probs)){
  raw <- rbind(raw, cbind(i,rdiscrete(my.n,probs=probs[,i],values=1:5)))
}

r <- data.frame( cbind(
  as.numeric( row.names( tapply(raw[,2], raw[,1], mean) ) ),
  tapply(raw[,2], raw[,1], mean),
  tapply(raw[,2], raw[,1], mean) + sqrt( tapply(raw[,2], raw[,1], var)/tapply(raw[,2], raw[,1], length) ) * qnorm(1-.05/2,0,1),
  tapply(raw[,2], raw[,1], mean) - sqrt( tapply(raw[,2], raw[,1], var)/tapply(raw[,2], raw[,1], length) ) * qnorm(1-.05/2,0,1)
))
names(r) <- c("group","mean","ll","ul")

gbar <- tapply(raw[,2], list(raw[,2], raw[,1]), length)

sgbar <- data.frame( cbind(c(1:max(unique(raw[,1]))),t(gbar)) )

sgbar.likert<- sgbar[,2:6]



require(grid)
require(lattice)
require(latticeExtra)
require(HH)
sgbar.likert<- sgbar[,2:6]


yLabels = c(expression(a[1*x]),expression(b[2*x]),expression(c[3*x]))

likert(sgbar.likert,
       scales = list(y = list(labels = yLabels)),
       xlab="Percentage",
       main="Example Diverging Stacked Bar Chart for Likert Scale",
       BrewerPaletteName="Blues",
       sub="Likert Scale")

That looks like below. However, I want to show percentage value of each category as shown in the below picture.

I have tried this:

likert(sgbar.likert,
       scales = list(y = list(labels = yLabels)),
       xlab="Percentage",
       main="Example Diverging Stacked Bar Chart for Likert Scale",
       BrewerPaletteName="Blues",
       plot.percent.low=TRUE, # added this one
       plot.percent.high=TRUE, # added this one, too
       sub="Likert Scale")

But, it does not have any affect and does not make any differences.

So, how to show percentage values for each category?

回答1:

AFAIK there isn't any parameter to achieve that, so you need to define a custom panel function, in this way :

### just reproduce your input
library(HH)
sgbar.likert <- data.frame(X1 = c(34L, 7L, 13L),X2 = c(1L, 4L, 13L),
                           X3 = c(7L, 84L, 24L), X4 = c(7L, 2L, 27L), X5 = c(51L, 3L, 23L))
yLabels = c(expression(a[1*x]),expression(b[2*x]),expression(c[3*x]))
### 

# store the original col names used in custom panel function
origNames <- colnames(sgbar.likert)

# define a custom panel function
myPanelFunc <- function(...){
  panel.likert(...)
  vals <- list(...)
  DF <- data.frame(x=vals$x, y=vals$y, groups=vals$groups)

  ### some convoluted calculations here...
  grps <- as.character(DF$groups)
  for(i in 1:length(origNames)){
    grps <- sub(paste0('^',origNames[i]),i,grps)
  }

  DF <- DF[order(DF$y,grps),]

  DF$correctX <- ave(DF$x,DF$y,FUN=function(x){
    x[x < 0] <- rev(cumsum(rev(x[x < 0]))) - x[x < 0]/2
    x[x > 0] <- cumsum(x[x > 0]) - x[x > 0]/2
    return(x)
  })

  subs <- sub(' Positive$','',DF$groups)
  collapse <- subs[-1] == subs[-length(subs)] & DF$y[-1] == DF$y[-length(DF$y)]
  DF$abs <- abs(DF$x)
  DF$abs[c(collapse,FALSE)] <- DF$abs[c(collapse,FALSE)] + DF$abs[c(FALSE,collapse)]
  DF$correctX[c(collapse,FALSE)] <- 0
  DF <- DF[c(TRUE,!collapse),]

  DF$perc <- ave(DF$abs,DF$y,FUN=function(x){x/sum(x) * 100})
  ###

  panel.text(x=DF$correctX, y=DF$y, label=paste0(DF$perc,'%'), cex=0.7)
}

# plot passing our custom panel function
likert(sgbar.likert,
       scales = list(y = list(labels = yLabels)),
       xlab="Percentage",
       main="Example Diverging Stacked Bar Chart for Likert Scale",
       BrewerPaletteName="Blues",
       panel=myPanelFunc,
       sub="Likert Scale")

The code is pretty convoluted, but the key is that panel function receives, (in the ellipsis ... parameter), an x,y pair of coordinates for each bar and group factor for each of them (groups are the columns of the original likert input). The default panel function is panel.likert; so, after calling that, we can add our changes to the plotted panel (in this case the labels according to the bars coordinates).

Seems easy, but there are two problems :

  1. groups are redefined when they are even, so the central column, in this case "X3" it's split in two groups: "X3" and "X3 Positive".
  2. plotted bars are "stacked", so to correctly compute the centers of them (in order to put a label) you need to calculate the cumulated sum of the coordinates, using the original column names ordering.

The above code does all those calculations, hopefully in a quite generic manner (read: you can change the input and it should work...).