Transform y axis in bar plot using scale_y_log10()

Posted 2020-02-15 23:06

Using the data.frame below, I want to make a bar plot with a log-transformed y axis.

I got this plot

[image: bar plot with a linear y axis]

using this code

ggplot(df, aes(x=id, y=ymean , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

To log-transform the y axis so that the "low" level in B and D, which is close to zero, becomes visible, I used

+scale_y_log10()

which resulted in

[image: bar plot after adding scale_y_log10()]

Any suggestions on how to log-transform the y axis of the first plot?

By the way, some values in my data are close to zero, but none of them is exactly zero.

UPDATE

Trying this suggested answer by @computermacgyver

ggplot(df, aes(x=id, y=ymean , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

I got

[image: resulting plot]

DATA

dput(df)
structure(list(id = structure(c(7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 
2L, 6L, 6L, 6L, 5L, 5L, 5L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("A", 
"B", "C", "D", "E", "F", "G"), class = "factor"), var = structure(c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L), .Label = c("high", "medium", "low"), class = "factor"), 
    ymin = c(0.189863418, 0.19131948, 0.117720496, 0.255852069, 
    0.139624146, 0.048182771, 0.056593774, 0.037262727, 0.001156667, 
    0.024461299, 0.026203592, 0.031913077, 0.040168571, 0.035235902, 
    0.019156667, 0.04172913, 0.03591233, 0.026405094, 0.019256055, 
    0.011310755, 0.000412414), ymax = c(0.268973856, 0.219709677, 
    0.158936508, 0.343307692, 0.205225352, 0.068857143, 0.06059596, 
    0.047296296, 0.002559633, 0.032446541, 0.029476821, 0.0394, 
    0.048959184, 0.046833333, 0.047666667, 0.044269231, 0.051, 
    0.029181818, 0.03052381, 0.026892857, 0.001511628), ymean = c(0.231733739333333, 
    0.204891473333333, 0.140787890333333, 0.295301559666667, 
    0.173604191666667, 0.057967681, 0.058076578, 0.043017856, 
    0.00141152033333333, 0.0274970166666667, 0.0273799226666667, 
    0.0357511486666667, 0.0442377366666667, 0.0409452846666667, 
    0.0298284603333333, 0.042549019, 0.0407020586666667, 0.0272998796666667, 
    0.023900407, 0.016336106, 0.000488014)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -21L), .Names = c("id", 
"var", "ymin", "ymax", "ymean"))

Tags: r ggplot2
3 answers
劫难
#2 · 2020-02-15 23:37

As @Miff has written, bars are generally not useful on a log scale. With bar plots, we compare the heights of the bars to one another. To do this, we need a fixed point from which to compare, usually 0, but log(0) is negative infinity.

So, I would strongly suggest that you consider using geom_point() instead of geom_bar(). I.e.,

library(ggplot2)
library(scales)  # trans_breaks(), trans_format(), math_format()

ggplot(df, aes(x=id, y=ymean , color=var)) +
  geom_point(position=position_dodge(.7))+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw()

dot plots are better than bars with log scale

If you really, really want bars, then you should use geom_rect instead of geom_bar and set your own baseline. That is, the baseline for geom_bar is zero, but on a log scale you have to invent a new one. Your Plot 1 seems to use 10^-7.

This can be accomplished with the following, but again, I consider this a really bad idea.

# ymin is the invented baseline, 10^-7 (note that 10E-7 would be 10^-6)
ggplot(df, aes(xmin=as.numeric(id)-.4,xmax=as.numeric(id)+.4, x=id, ymin=1e-7, ymax=ymean, fill=var)) +
  geom_rect(position=position_dodge(.8))+
  scale_y_log10("y",
                breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))+
  geom_errorbar(aes(ymin=ymin,ymax=ymax),
                size=.25,   
                width=.07,
                position=position_dodge(.8))+
  theme_bw()

Really bad idea of how to have a barplot with a log scale
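
To see why the invented baseline matters, here is a minimal base-R sketch. The value is the smallest ymean in the question's data; the three baselines are arbitrary choices made up for illustration, not anything ggplot2 picks for you.

```r
# On a linear axis bars start at 0, but 0 has no finite position on a log axis.
zero_position <- log10(0)   # -Inf

# Smallest ymean in the question's data (id D, var "low").
value <- 0.000488014

# Hypothetical baselines, chosen only for illustration.
baselines <- c(1e-7, 1e-5, 1e-4)

# Height of the drawn bar in log10 units: distance from baseline to value.
bar_length <- log10(value) - log10(baselines)

print(zero_position)
print(data.frame(baseline = baselines, bar_length = bar_length))
```

The same data value gets a different bar height for every baseline, which is why the comparison between bars is not well defined on a log axis.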

一纸荒年 Trace。
#3 · 2020-02-15 23:41

If you need the bars flipped, you could compute -log10(y) yourself and plot the result on an ordinary linear axis. For example:

library(ggplot2)
library(dplyr)

# make your own log10
dfPlot <- df %>% 
  mutate(ymin = -log10(ymin),
         ymax = -log10(ymax),
         ymean = -log10(ymean))

# then plot
ggplot(dfPlot, aes(x = id, y = ymean, fill = var, group = var)) +
  geom_bar(position = "dodge", stat = "identity",
           width = 0.7,
           size = 0.9)+
  geom_errorbar(aes(ymin = ymin, ymax = ymax),
                size = 0.25,   
                width = 0.07,
                position = position_dodge(0.7)) +
  scale_y_continuous(name = expression(-log[10](italic(ymean)))) + 
  theme_bw() 

[image: resulting plot]
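
One thing to keep in mind with this approach, sketched below in base R using the D / "low" row of the question's data: -log10() is a decreasing function, so the smallest original value gets the tallest bar, and the transformed ymin ends up above the transformed ymax. The variable names here are made up for the illustration.

```r
# D / "low" row from the question's data.
d_low <- c(ymin = 0.000412414, ymean = 0.000488014, ymax = 0.001511628)

# Same transform as in the answer above.
neg_log <- -log10(d_low)
print(round(neg_log, 3))

# -log10() is decreasing, so the order of the three numbers is reversed.
bounds_reversed <- neg_log[["ymin"]] > neg_log[["ymean"]] &&
  neg_log[["ymean"]] > neg_log[["ymax"]]
print(bounds_reversed)
```

The error bar still spans the same interval, but taller bars now mean smaller values, so the axis label should make the transform explicit, as the scale_y_continuous() call above does.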

The star\"
#4 · 2020-02-15 23:50

Firstly, don't do it! The help file from ?geom_bar says:

A bar chart uses height to represent a value, and so the base of the bar must always be shown to produce a valid visual comparison. Naomi Robbins has a nice article on this topic. This is why it doesn't make sense to use a log-scaled y axis with a bar chart.

To give a concrete example, the following produces the graph you want, but the scaling factor k is arbitrary: a larger k is equally correct yet produces a visually different plot.

k<- 10000  

ggplot(df, aes(x=id, y=ymean*k , fill=var, group=var)) +
  geom_bar(position="dodge", stat="identity",
           width = 0.7,
           size=.9)+
  geom_errorbar(aes(ymin=ymin*k,ymax=ymax*k),
                size=.25,   
                width=.07,
                position=position_dodge(.7))+
  theme_bw() + scale_y_log10(labels=function(x)x/k)

k=1e4

Plot when k=1e4

k=1e6

Plot when k=1e6
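
The effect of k can be checked numerically. This is a rough base-R sketch, assuming the bars are drawn from 0 in log10 space (that is, from 1 in scaled units), which is my understanding of how stat="identity" bars behave under scale_y_log10(). The two y values are the largest and smallest ymean in the question's data.

```r
# Largest and smallest ymean in the question's data.
y <- c(A_high = 0.295301559666667, D_low = 0.000488014)

# The two k values used for the plots above.
ks <- c(1e4, 1e6)

# Bar height on the log axis is log10(y * k) = log10(y) + log10(k),
# so k shifts every bar by the same amount and changes the height ratio.
heights <- sapply(ks, function(k) log10(y * k))
colnames(heights) <- paste0("k=", format(ks, scientific = TRUE))
height_ratio <- heights["A_high", ] / heights["D_low", ]

print(round(heights, 2))
print(round(height_ratio, 2))
```

The ratio between the tallest and the shortest bar depends on k, even though the underlying data are identical.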
