I know the basics like == and !=, or even the difference (vaguely) between & and &&. But stuff like %in% and %%, and some stuff used in the context of sprintf(), like sprintf("%.2f", x), I have no idea about.
Worst of all, they're hard to search for on the Internet because they're special characters and I don't know what they're called...
There are several different things going on here with the percent symbol:
Binary Operators
As several have already pointed out, things of the form %%, %in%, %*% are binary operators (respectively modulo, match, and matrix multiply), just like +, -, etc. They are functions that operate on two arguments, and R recognizes them as special because of their name structure (the name starts and ends with a %). This allows you to use them in the form:
Argument1 %fun_name% Argument2
instead of the more traditional:
fun_name(Argument1, Argument2)
Keep in mind that the following are equivalent:
10 %% 2 == `%%`(10, 2)
"hello" %in% c("hello", "world") == `%in%`("hello", c("hello", "world"))
10 + 2 == `+`(10, 2)
R just recognizes the standard operators as well as the %x% operators as special and allows you to use them as traditional binary operators if you don't quote them. If you quote them (in the examples above with backticks), you can use them as standard two-argument functions.
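To make the equivalence concrete, here is a small sketch in base R (the variable names are only illustrative). It shows the infix and function-call forms giving the same result, and that a backtick-quoted operator can be handed to other functions such as sapply() or Reduce() like any ordinary function.

```r
# Infix form and function-call form give the same result.
remainder_infix  <- 10 %% 3
remainder_prefix <- `%%`(10, 3)

# Because operators are functions, they can be passed as arguments.
parity <- sapply(1:6, `%%`, 2)   # remainder of each element when divided by 2
total  <- Reduce(`+`, 1:4)       # ((1 + 2) + 3) + 4

# %in% tests membership; %*% is matrix multiplication (here a dot product).
found <- c("a", "z") %in% c("a", "b", "c")
dot   <- drop(c(1, 2) %*% c(3, 4))

print(remainder_infix)   # [1] 1
print(parity)            # [1] 1 0 1 0 1 0
print(total)             # [1] 10
print(found)             # [1]  TRUE FALSE
print(dot)               # [1] 11
```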
Custom Binary Operators
The big difference between the standard binary operators and %x% operators is that you can define custom %x% operators, and R will recognize them as special and treat them as binary operators:
`%samp%` <- function(e1, e2) sample(e1, e2)
1:10 %samp% 2
# [1] 1 9
Here we defined a binary operator version of the sample function.
"%" (Percent) as a token in special functions
The meaning of "%" in functions like sprintf or format is completely different and has nothing to do with binary operators. The key thing to note is that in those functions the % character is part of a quoted string, and not a standard symbol on the command line (i.e. "%" and % are very different). In the context of sprintf, inside a string, "%" is a special character that signals that the subsequent characters have a special meaning and should not be interpreted as regular text. For example, in:
sprintf("I'm a number: %.2f", runif(3))
# [1] "I'm a number: 0.96" "I'm a number: 0.74" "I'm a number: 0.99"
"%.2f"
means a floating point number (f) to be displayed with two decimals (.2). Notice how the "I'm a number: " piece is interpreted literally. The use of "%" allows sprintf users to mix literal text with special instructions on how to represent the other sprintf arguments.
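A few more format strings may help show the pattern. This is only a sketch with made-up values and illustrative variable names; the conversion letters used (d, f, s) and the doubled %% for a literal percent sign are the usual C-style ones that sprintf accepts.

```r
# Each % starts a conversion specification; everything else is literal text.
int_text    <- sprintf("%d items", 3L)                     # %d: integer
padded_text <- sprintf("%03d", 7L)                         # zero-padded to width 3
width_text  <- sprintf("%5.1f", 3.14159)                   # total width 5, 1 decimal
mixed_text  <- sprintf("%s scored %.1f%%", "Ann", 92.456)  # %s: string, %%: literal %

print(int_text)     # [1] "3 items"
print(padded_text)  # [1] "007"
print(width_text)   # [1] "  3.1"
print(mixed_text)   # [1] "Ann scored 92.5%"
```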
The R Language Definition, section 3.1.4, refers to them as "special binary operators". One of the ways they're special is that users can define new binary operators using the %x% syntax (where x is any valid name, according to that document; in practice other characters work too, as the built-in %*% and %/% show).
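As a quick illustration of defining a new operator with the %x% syntax, here is a sketch of a string-concatenation operator. The name %+% is a hypothetical choice made for this example and is not part of base R.

```r
# `%+%` is a hypothetical operator name chosen for this sketch.
# It has to be quoted with backticks when it is defined.
`%+%` <- function(lhs, rhs) paste0(lhs, rhs)

greeting <- "hello, " %+% "world"
chained  <- "a" %+% "b" %+% "c"   # evaluated left to right

print(greeting)  # [1] "hello, world"
print(chained)   # [1] "abc"
```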
The Writing your own functions section of An Introduction to R refers to them as Binary Operators (which is somewhat confusing because + is also a binary operator):
10.2 Defining new binary operators
Had we given the bslash()
function a different name, namely one of the
form
%anything%
it could have been used as a binary operator in expressions
rather than in function form. Suppose, for example, we choose ! for
the internal character. The function definition would then start as
> "%!%" <- function(X, y) { ... }
(Note the use of quote marks.) The function could then be used as X %!% y. (The backslash symbol itself
is not a convenient choice as it presents special problems in this
context.)
The matrix multiplication operator, %*%, and the outer product matrix
operator %o% are other examples of binary operators defined in this
way.
They don’t have a special name as far as I know. They are described in R operator syntax and precedence.
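Precedence is worth a short sketch, because it can surprise you. According to the ?Syntax help page, the %any% operators bind more tightly than * and / and than binary + and -, but less tightly than unary minus and the : operator. The variable names below are just illustrative.

```r
# %any% operators bind more tightly than * and /, and than binary + and -.
a  <- 2 * 3 %% 2       # parsed as 2 * (3 %% 2)
b  <- (2 * 3) %% 2
c1 <- 1 + 2 %in% 3     # parsed as 1 + (2 %in% 3)
c2 <- (1 + 2) %in% 3

# Unary minus and the : operator bind more tightly than %any%.
d <- -1 %% 3           # parsed as (-1) %% 3
e <- 1:3 %in% 2        # parsed as (1:3) %in% 2

print(a)   # [1] 2
print(b)   # [1] 0
print(c1)  # [1] 1
print(c2)  # [1] TRUE
print(d)   # [1] 2
print(e)   # [1] FALSE  TRUE FALSE
```

When in doubt, adding parentheses avoids having to remember the table.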
The %anything% operators are just normal functions, which you can define yourself. You do need to put the name of the operator in backticks (`…`), though: this is how R treats special names.
`%test%` = function (a, b) a * b
2 %test% 4
# [1] 8
The sprintf format strings are entirely unrelated; they are not operators at all. Instead, they are just the conventional C-style format strings.
The help file, and the general entry, is indeed a good starting point: ?'%in%'
For example, you can see how the operator '%in%' is defined:
"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0
You can even create your own operators:
'%ni%' <- Negate('%in%')
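To see these two definitions in action, here is a small sketch with made-up data (x and tbl are illustrative names). It checks that %in% gives the same answer as the match()-based expression, and that the negated operator returns the complement.

```r
x   <- c("a", "z", "b")
tbl <- c("a", "b", "c")

# %in% and the match()-based expression agree element by element.
via_operator <- x %in% tbl
via_match    <- match(x, tbl, nomatch = 0) > 0

# The negated operator answers "is this element NOT in the table?"
`%ni%` <- Negate(`%in%`)
not_in <- x %ni% tbl

print(via_operator)  # [1]  TRUE FALSE  TRUE
print(via_match)     # [1]  TRUE FALSE  TRUE
print(not_in)        # [1] FALSE  TRUE FALSE
```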