I am going through Hadley Wickham's "R for Data Science", where he uses ~var in ggplot calls. I understand y ~ a + bx, where ~ describes a formula/relationship between dependent and independent variables, but what does ~var mean? More importantly, why can't you just put the variable itself? See code below:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)
or
demo <- tribble(
  ~cut,         ~freq,
  "Fair",       1610,
  "Good",       4906,
  "Very Good",  12082,
  "Premium",    13791,
  "Ideal",      21551
)
ggplot(data = demo) +
  geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
It's just ggplot making use of the formula structure to let the user decide what variables to facet on; see ?facet_grid.

So facet_grid(. ~ var) just means to facet the grid on the variable var, with the facets spread over columns. It's the same as facet_grid(cols = vars(var)).

Despite looking like a formula, it's not really being used as a formula: it's just a way to present multiple arguments to R in a way that the facet_grid
code can clearly and unambiguously interpret.

It is a syntax specific to facet_wrap, where a formula can be given as the input for the variable relationships; see the documentation for the first argument, facets.

So I think you can now give the variable names without the tilde (wrapped in vars(), or as a character vector), but you used to need to give a one-sided formula with the tilde.
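
To make the point about the tilde concrete, here is a minimal sketch, assuming ggplot2 (3.0.0 or later, for vars()) is installed and using its bundled mpg data. The object names (f, facet_vars, base_plot, p_formula and so on) are made up for illustration. It shows that ~ class is just an unevaluated container for a variable name, and that the formula, vars() and character spellings should all request the same faceting.

```r
library(ggplot2)

# A one-sided formula is stored unevaluated: R keeps the name `class`
# without looking it up. A bare `class` argument would instead be looked
# up as an ordinary R object (here, the base function class()), not as a
# column of mpg, which is why the variable cannot simply be passed as-is.
f <- ~ class
facet_vars <- all.vars(f)  # names referenced inside the formula
print(class(f))
print(facet_vars)

base_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Three spellings that should request the same faceting in facet_wrap()
p_formula <- base_plot + facet_wrap(~ class, nrow = 2)
p_vars    <- base_plot + facet_wrap(vars(class), nrow = 2)
p_string  <- base_plot + facet_wrap("class", nrow = 2)

# facet_grid(): the formula form versus the cols argument
g_formula <- base_plot + facet_grid(. ~ class)
g_vars    <- base_plot + facet_grid(cols = vars(class))
```

Printing any of these objects (for example, print(p_vars)) draws the plot; the formula versions are kept by ggplot2 for compatibility with the older interface.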