reorder x-axis variables by sorting a subset of th

Posted 2019-07-13 18:07

I am trying to make a stacked bar chart, and I would like to reorder the variables on the x-axis based on the data from a single category. In the example below there are three x values, each of which has values for three categories. How can I plot the graph with the name values sorted by increasing abundance of "bb"?

Although this question is similar to other questions about reordering categorical variables, the difference here is that the ordering is based on a subset of one column's data. Any suggestions appreciated.

#create the dataframe
name = c('a', 'a', 'a', 'b', 'b', 'b','c','c','c') 
cat = c("aa", "bb", "cc", "aa", "bb", "cc","aa", "bb", "cc") 
percent = c( 5 , 5, 90, 40, 40 , 20, 90,5,5) 
df = data.frame(name, cat, percent)

#stacked barchart with default ordering
library(ggplot2)
ggplot(df, aes(x=name,y=percent, fill=cat)) + geom_bar(stat="identity", position="fill")

#I'm looking to reorder the x-axis by the `percent` values for category "bb"
vals = df[ df$cat == 'bb', ]                     #subset
xvals = vals[with(vals, order(percent)), ]$name  #get values
ggplot(df, aes(x = reorder(name, xvals), y = percent, fill = cat)) + geom_bar(stat="identity", position="fill") #attempt to order with the new values

Tags: r ggplot2
3 answers
虎瘦雄心在
Reply #2 · 2019-07-13 18:54
df$name2 <- factor(df$name, levels = xvals)
ggplot(df, aes(x = name2, y = percent, fill = cat)) +
  geom_bar(stat = "identity", position = "fill")
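
For reference, here is a self-contained sketch of this approach, assuming the data frame from the question. The names `bb_rows` and `name_order` are illustrative, and the plotting step is skipped when ggplot2 is not installed.

```r
# Data from the question
df <- data.frame(
  name    = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  cat     = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
  percent = c(5, 5, 90, 40, 40, 20, 90, 5, 5)
)

# Keep only the "bb" rows and sort their names by percent (illustrative names)
bb_rows    <- df[df$cat == "bb", ]
name_order <- as.character(bb_rows$name[order(bb_rows$percent)])

# The level order of the factor is what ggplot2 uses for a discrete x-axis
df$name2 <- factor(df$name, levels = name_order)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = name2, y = percent, fill = cat)) +
    geom_bar(stat = "identity", position = "fill")
  print(p)
}
```

Note that "a" and "c" tie at 5 percent for "bb". `order()` leaves tied entries in their original order, so the levels come out as a, c, b.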

疯言疯语
Reply #3 · 2019-07-13 19:00

like this...?

df$name2 = factor(df$name, levels = as.character(xvals))  # set the level order; `labels = xvals` would only rename the levels
ggplot(df, aes(x = name2, y = percent, fill=cat)) + geom_bar(stat="identity", position="fill")
Ridiculous、
Reply #4 · 2019-07-13 19:02

There are two problems here. The first is that you want to re-sort based on the percent in bb. The second is that ggplot orders a categorical x-axis by factor level, which is alphabetical by default, so you need to get around that.
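
A small illustration of that default, using made-up values rather than the question's data: the levels of a factor are sorted unless they are given explicitly, and a discrete axis follows the level order.

```r
x <- c("c", "a", "b")

# Default: the levels are the sorted unique values
default_levels <- levels(factor(x))
default_levels    # "a" "b" "c"

# Explicit levels keep whatever order is supplied
explicit_levels <- levels(factor(x, levels = c("c", "a", "b")))
explicit_levels   # "c" "a" "b"
```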

First, to re-sort your data, ironically you need to transform to wide format, sort, and then transform back to long format:

library(reshape2)                              # provides dcast() and melt()
zz <- dcast(df,name~cat,value.var="percent")   # columns for aa, bb, cc
yy <- zz[order(zz$bb),]          # order by bb
yy <- cbind(id=1:nrow(yy),yy)    # add an id column; will need later
gg <- melt(yy,id.vars=c("id","name"),variable.name="cat",value.name="percent")

Then:

ggplot(gg, aes(x=factor(id),y=percent, fill=cat))+
  geom_bar(stat="identity")+                       # bar heights come from percent
  scale_x_discrete(labels=as.character(yy$name))+  # one label per id, in sorted order
  labs(x="name")

This produces a stacked bar chart with the x-axis ordered by increasing percent of bb.
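
As a possible shortcut that avoids reshaping, `reorder()` can compute the ordering score for each name directly. This is only a sketch, assuming every name has exactly one "bb" row as in the question's data. The `bb_score` column is a hypothetical helper and not part of the answer above.

```r
# Data from the question
df <- data.frame(
  name    = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  cat     = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
  percent = c(5, 5, 90, 40, 40, 20, 90, 5, 5)
)

# Score each row: its percent if it is a "bb" row, otherwise 0 (hypothetical helper)
df$bb_score <- ifelse(df$cat == "bb", df$percent, 0)

# reorder() sums the scores within each name and sorts the levels by that sum
df$name2 <- reorder(df$name, df$bb_score, FUN = sum)
levels(df$name2)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = name2, y = percent, fill = cat)) +
    geom_bar(stat = "identity", position = "fill")
  print(p)
}
```

Here "b" has the largest "bb" share (40), so it ends up as the last level and is drawn rightmost.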
