In R - When Does Spacing Matter?

Posted 2019-07-31 19:39

Question:

When does spacing matter?

c(3, 5) is no different than c(3,5)

Two examples follow. Please focus on the values = c("(-Inf,17]"... section. The plot changes drastically depending on whether I put spaces inside these strings, between Inf, 17, and 19. What is R doing? There must be some logic by which the spacing eliminates values, but I can't figure it out. Try both examples and compare the results.

library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) + 
    geom_point(aes(colour = cut(qsec, c(-Inf, 17, 19, Inf))),
               size = 5) +
    scale_color_manual(name = "qsec",
                       values = c("(-Inf,17]" = "black",
                                  "(17,19]" = "yellow",
                                  "(19, Inf]" = "red"),
                       labels = c("<= 17", "17 < qsec <= 19", "> 19"))

vs

library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) + 
    geom_point(aes(colour = cut(qsec, c(-Inf, 17, 19, Inf))),
               size = 5) +
    scale_color_manual(name = "qsec",
                       values = c("(-Inf, 17]" = "black",
                                  "(17,19]" = "yellow",
                                  "(19, Inf]" = "red"),
                       labels = c("<= 17", "17 < qsec <= 19", "> 19"))

Answer 1:

The plots differ because the names you give in values must be the names of the factor levels that cut() creates:

> cut(mtcars$qsec, c(-Inf, 17, 19, Inf)) -> my_factor
> levels(my_factor)
[1] "(-Inf,17]" "(17,19]"   "(19, Inf]"

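So the names in the color scale have to match these level names exactly, character for character, spaces included. In your second example, "(-Inf, 17]" is not one of the levels, so the level "(-Inf,17]" never receives the black you assigned to it. Without an exact match, how would R know which level you mean?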
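A quick way to see the mismatch is to test each hand-typed name against the levels printed above. This is only a sketch in base R; the variable names qsec_bins, first_names and second_names are made up for illustration and do not appear in the question.

```r
# Rebuild the factor from the question so its levels can be inspected.
qsec_bins <- cut(mtcars$qsec, c(-Inf, 17, 19, Inf))

# Names typed in the first example of the question.
first_names <- c("(-Inf,17]", "(17,19]", "(19, Inf]")
# Names typed in the second example (extra space after the first comma).
second_names <- c("(-Inf, 17]", "(17,19]", "(19, Inf]")

# %in% compares whole strings, so a single space makes a name a non-match.
print(first_names %in% levels(qsec_bins))
print(second_names %in% levels(qsec_bins))

# List the typed names that correspond to no level at all.
print(setdiff(second_names, levels(qsec_bins)))
```

Given the levels shown above, every name from the first example should be found, while the first name of the second example should be reported as unmatched.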

E.g., what if you had two levels named "(-Inf,17]" and "(-Inf, 17]"? If R ignored the space, how would R know which level you are referring to?
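One way to sidestep the problem is to not type the level names at all and take them from the factor instead. The following is a minimal sketch of that idea, not something from the original answer; qsec_bins and bin_colours are illustrative names, and it assumes the colors are listed in the same order as the breaks.

```r
# Build the factor once so the exact level strings are available.
qsec_bins <- cut(mtcars$qsec, c(-Inf, 17, 19, Inf))

# Attach the colors to the real level names, whatever spacing cut() used.
bin_colours <- setNames(c("black", "yellow", "red"), levels(qsec_bins))

print(bin_colours)
```

The named vector can then be passed as values = bin_colours to scale_color_manual() in place of the hand-typed one, so the names match no matter how cut() formats its labels.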



Tags: r ggplot2