Adding a legend to a ggplot2 geom_tile plot based

Posted 2019-05-22 00:36

Question:

I am creating fluctuation diagrams in ggplot2 using geom_tile, and would like to add a legend for size. I am at a loss as to how to do so. Here is a MWE:

library(dplyr)
library(ggplot2)

# create data frame of total number of passengers in each Sex-Age group

df <- data.frame(Titanic) %>%
    group_by(Sex, Age) %>%
    summarise(freq = sum(Freq))

# calculate the lengths of the sides of the tiles so the largest has
# area = 1 and the others are smaller proportional to frequency

df$tileside <- sqrt(df$freq / max(df$freq))

df

## Source: local data frame [4 x 4]
## Groups: Sex [?]
## 
##      Sex    Age  freq  tileside
##   (fctr) (fctr) (dbl)     (dbl)
## 1   Male  Child    64 0.1959396
## 2   Male  Adult  1667 1.0000000
## 3 Female  Child    45 0.1643003
## 4 Female  Adult   425 0.5049248

# using geom_tile, no size legend

ggplot(df, aes(x = Sex, y = Age,
               height = tileside, width = tileside)) +
    geom_tile() + coord_fixed(ratio = 1)

I should mention that one alternative would be to use geom_point instead of geom_tile (see this post: https://stats.stackexchange.com/questions/56322/graph-for-relationship-between-two-ordinal-variables/56357#56357).

Here is a MWE of this approach:

ggplot(df, aes(x = Sex, y = Age, size = freq)) +
    geom_point(shape = 15) + coord_fixed(ratio = 1)

The problem is that the squares are too small, and if I rescale them with scale_size(), I lose the most important feature of the fluctuation diagram: that the area of the squares is proportional to frequency. (I'm not sure this condition is met even without rescaling; it's hard to tell how the area is calculated.)

Many thanks for your help.

Answer 1:

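As for the size issue, you can use scale_size_area(), which makes the area of each point proportional to the value (a value of 0 maps to size 0); max_size sets the size of the largest point. I added color as well.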

ggplot(df, aes(x = Sex, y = Age, size = freq, color = freq)) +
  geom_point(shape = 15) + coord_fixed(ratio = 1) +
  scale_size_area(max_size = 20) +
  scale_color_gradient(high = "red", low = "black", guide = "legend")
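
The question doubted whether area really tracks frequency. One way to check, sketched here under some assumptions, is to pull the computed point sizes out of the plot with layer_data() and compare their squares to the frequencies. The data frame below is rebuilt by hand from the printed output in the question, and the names sizes, area_ratio and freq_ratio are only illustrative. It also assumes that the size aesthetic acts as a linear dimension, so relative area is its square.

```r
library(ggplot2)

# Frequencies copied by hand from the printed data frame in the question
df <- data.frame(
  Sex  = c("Male", "Male", "Female", "Female"),
  Age  = c("Child", "Adult", "Child", "Adult"),
  freq = c(64, 1667, 45, 425)
)

p <- ggplot(df, aes(x = Sex, y = Age, size = freq)) +
  geom_point(shape = 15) +
  scale_size_area(max_size = 20)

# layer_data() returns the layer's data after the scales have been applied
sizes <- layer_data(p, 1)$size

# Assumption: size is a linear dimension, so relative area is size squared
area_ratio <- (sizes / max(sizes))^2
freq_ratio <- df$freq / max(df$freq)

# Sort both so the comparison does not depend on row order
print(cbind(area = sort(area_ratio), freq = sort(freq_ratio)))
```

If the two columns agree, the squares drawn by geom_point keep the area-proportional-to-frequency property for whatever max_size is chosen.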



Tags: r ggplot2