Using geom_abline() and ggplot

Posted 2020-08-14 07:35

Question:

I am a beginner in ggplot2; I started experimenting with it only 4 days ago, so I apologize if this question sounds too basic. I'd appreciate any guidance, as I've been struggling with this issue for about an hour.

I was experimenting with using geom_abline() as below:

 p <- ggplot(mpg, aes(cty, hwy)) + geom_point()
 p + geom_abline() + facet_wrap(~cyl)

This works: I can see a reference line in all four faceted graphs.

Later, I tried a related dataset, mtcars, to see what happens with geom_abline():

 p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
 p + geom_abline() + facet_wrap(~cyl)

However, when I ran this command, I couldn't see the line drawn by geom_abline(). Quite surprisingly, I found a similar version of the above command in the help file, which says that geom_abline() is "out of the range".

I know what "out of range" means, but how do I know whether in a particular dataset, abline() will be out of range? I can override it by forcing a specific slope and intercept, but I'd consider that a bit of a hack, i.e. modifying the code after seeing the output. Is there any way to know what happens behind the scenes in geom_abline()?

Here's the graph I got without any abline()s

I'd appreciate any thoughts. I am really confused.

Answer 1:

If you don't give any arguments, geom_abline() uses its default values: intercept = 0 and slope = 1. Widening the axis limits so that they include the origin makes the line visible in both plots:

ggplot(mpg, aes(cty, hwy)) + geom_point() + geom_abline() + lims(x = c(0,35), y = c(0,50))
ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_abline() + lims(x = c(0,6), y = c(0,35))



Answer 2:

The difference between the first plot and the 2nd plot, in terms of the presence of the abline, is not determined by differences in datasets. You asked:

how do I know whether in a particular dataset, abline() will be out of range?

It actually depends on the scales of the two variables you map in your aesthetics. If you look carefully at the code that plots the mpg data, you're comparing hwy with cty. Let's examine each of these more closely:

range(mpg$hwy)
[1] 12 44
range(mpg$cty)
[1]  9 35

As you can see, the ranges of the two variables overlap quite a bit. Now let's look at the two variables mapped in the second plot, which uses the mtcars data: mpg and wt.

range(mtcars$mpg)
[1] 10.4 33.9
range(mtcars$wt)
[1] 1.513 5.424

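Here you can see there is zero overlap. The default line is y = x, so it can only pass through the plotted region where the x values and the y values coincide, and that's why you don't see it when you plot these two variables.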

Now let's test this notion of range using variables whose ranges do overlap but that are drawn from the mtcars dataset, to show that what matters is not the dataset but the overlapping ranges of the variables chosen to map. In mtcars, let's stick with mpg and pick another variable within its range. How about qsec? Let's check their ranges:

range(mtcars$mpg)
[1] 10.4 33.9
range(mtcars$qsec)
[1] 14.5 22.9

OK, there is definitely some overlap there. Let's map it and see what happens:

ggplot(mtcars, aes(qsec, mpg)) + geom_point() + geom_abline() + facet_wrap(~cyl)
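The range comparisons above can also be done programmatically. The following is a minimal sketch, not part of ggplot2: `ranges_overlap()` is a hypothetical helper name, and it only covers the default line (intercept = 0, slope = 1). It also ignores the small padding that ggplot2 adds around continuous scales by default, so a near miss may still show up on the plot.

```r
# Hypothetical helper (not a ggplot2 function): TRUE when the value ranges
# of x and y overlap, i.e. when the line y = x crosses the data's bounding box.
ranges_overlap <- function(x, y) {
  rx <- range(x, na.rm = TRUE)
  ry <- range(y, na.rm = TRUE)
  max(rx[1], ry[1]) <= min(rx[2], ry[2])
}

ranges_overlap(mtcars$wt, mtcars$mpg)    # FALSE: default abline not visible
ranges_overlap(mtcars$qsec, mtcars$mpg)  # TRUE: default abline visible
```

Running a check like this before plotting answers the "how do I know" question without having to look at the output first.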

Here is some additional confirmation that it's not dataset dependent. Let's reverse our workflow and instead identify a non-overlapping pair in the mpg dataset. Here, we should expect to see no abline. Let's stick with hwy and compare it to displ. Take my word for it, they do not overlap:

ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_abline() + facet_wrap(~cyl)
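For readers who would rather check the displ and hwy ranges than take them on trust, the same range() calls used earlier apply. This sketch assumes the ggplot2 package is installed, since that is where the mpg dataset lives:

```r
library(ggplot2)  # provides the mpg dataset

displ_range <- range(mpg$displ)
hwy_range <- range(mpg$hwy)

displ_range  # 1.6 7.0
hwy_range    # 12 44
```

The largest displ value is well below the smallest hwy value, so the default line y = x never enters the plotted region.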

And as others pointed out (@cuttlefish44), geom_abline() defaults to an intercept of 0 and a slope of 1.

Try using geom_smooth() for lines of fit.

 p <- ggplot(mpg, aes(cty, hwy)) + geom_point()
 p + geom_smooth(method = "lm", se = FALSE) + facet_wrap(~cyl)
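If you want a single reference line whose intercept and slope come from the data rather than from eyeballing the plot, one option is to fit a linear model and pass its coefficients to geom_abline(). This is only a sketch under that assumption; the names `fit` and `line_coefs` are illustrative. Note that it differs from geom_smooth(), which fits a separate line within each facet, whereas this draws the same overall line in every panel.

```r
library(ggplot2)

# Fit mpg as a linear function of wt over the whole mtcars dataset.
fit <- lm(mpg ~ wt, data = mtcars)
line_coefs <- coef(fit)  # named vector with "(Intercept)" and "wt"

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Draw the same overall fitted line in every facet.
p +
  geom_abline(intercept = line_coefs[["(Intercept)"]],
              slope = line_coefs[["wt"]]) +
  facet_wrap(~cyl)
```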



Answer 3:

Look at the help for geom_abline(). There's an example with the same data set that shows, without the facets, that:

p + geom_abline() # Can't see it - outside the range of the data

Try:

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p + geom_abline(intercept = 20) + facet_wrap(~cyl)


Tags: r ggplot2