I'm having some trouble formatting the y-axis labels. When I use scale_y_log10()
in my plot, it labels the axis as 0.1, 10, 1000. I really need it to display 1e-1, 1e1, 1e3 instead. The math_format
help page doesn't help, because it doesn't show the format I need.
I will answer any questions I can.
You can use the breaks and labels parameters of scale_y_log10, as in:
library(ggplot2)
library(ggplot2movies)  # recent ggplot2 versions ship the movies data in this separate package
ggplot(data=subset(movies, votes > 1000)) +
  aes(x = rating, y = votes / 10000) +
  scale_y_log10(breaks = c(0.1, 1, 10), labels = expression(10^-1, 10^0, 10^1)) +
  geom_point()
This might not be an elegant solution, but it works if you only have a limited number of plots.
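If typing the labels by hand gets tedious, the same breaks and labels can be built from a vector of exponents. This is only a sketch in base R; the variable names exponents, breaks and labels are made up for illustration.

```r
# Exponents to show on the axis (illustrative values, adjust to your data)
exponents <- -1:1

# Break positions in data units: 0.1, 1, 10
breaks <- 10^exponents

# One plotmath expression per break: 10^-1, 10^0, 10^1
labels <- parse(text = paste0("10^", exponents))
```

These can then be passed on as scale_y_log10(breaks = breaks, labels = labels).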
The problem is that R uses a not well-understood penalty mechanism for deciding whether to print in fixed or scientific notation. This is controlled by options( scipen ).
The value represents the penalty R applies to the number of characters it would take to print in scientific notation vs. fixed point, so options( scipen = 3 ) would mean that R adds 3 to the number of characters it takes to print, say, 1e2, compares that to the number of characters it needs to print the fixed-point equivalent, and prints whichever is shorter (so in this case 1e2 = 3 characters, + 3 penalty = 6, whereas 100 equals 3 characters, so 100 gets printed). To fix your example, just set options( scipen = -10 ) to always favour printing scientific notation over fixed point. So (using @PeterB's example) you can use scipen, which should allow you to not worry about setting breaks manually:
options( scipen = -10 )
ggplot(data=subset(movies, votes > 1000)) +
aes(x = rating, y = votes / 10000) +
geom_point()
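To see the effect of scipen outside of a plot, here is a minimal base R sketch that formats the same number under two settings and then restores the previous options. The variable names old, fixed and sci are just for illustration.

```r
# Remember the current settings so they can be restored afterwards
old <- options(scipen = 0)      # R's default penalty
fixed <- format(100)            # "100"

options(scipen = -10)           # strongly favour scientific notation
sci <- format(100)              # "1e+02"

options(old)                    # undo the global change
```

Note that R itself writes the exponent with a sign and at least two digits (1e+02 rather than 1e2), and that options() changes apply to the whole session until they are reset.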
The easiest way to achieve what you ask, with automatic limits and breaks, and without side-effects is this:
library(ggplot2)
library(MASS)
library(scales)
ggplot(data=subset(movies, votes > 1000)) +
aes(x = rating, y = votes / 10000) +
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x, n=3),
labels = scientific_format()) +  # trans_format("log10") alone would format the exponents, not the values
geom_point()
I prefer to use superscripts for the powers of ten, hide the minor grid,
and add tick marks spaced on a log scale. This is also rather easy to achieve:
ggplot(data=subset(movies, votes > 1000)) +
aes(x = rating, y = votes / 10000) +
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x, n=3),
labels = trans_format("log10", math_format(10^.x))) +
theme(panel.grid.minor = element_blank()) +
annotation_logticks(sides="l") +
geom_point()
The code above is adapted from the examples in the annotation_logticks help page. There is a lot of flexibility for adjusting the exact format.
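To check what a label function produces without drawing a plot, it can be called directly on a few values. A small sketch, assuming the scales package is installed; the name pow10_labels and the input values are made up for illustration.

```r
library(scales)

# The same kind of label function that is passed to scale_y_log10():
# take log10 of each break, then wrap the exponent as 10^exponent
pow10_labels <- trans_format("log10", math_format(10^.x))

# Apply it to a few example break values
labs <- pow10_labels(c(0.1, 10, 1000))
labs  # an expression vector with one plotmath label per break
```

The same trick works for any function given to labels, which makes it easier to experiment with the format before putting it in a plot.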