R: What are operators like %in

2019-01-13 19:24发布

问题:

I know the basics like == and !=, or even the difference (vaguely) between & and &&. But stuff like %in% and %% and some stuff used in the context of sprintf(), like sprintf("%.2f", x) stuff I have no idea about.

Worst of all, they're hard to search for on the Internet because they're special characters and I don't know what they're called...

回答1:

There are several different things going on here with the percent symbol:

Binary Operators

As several have already pointed out, things of the form %%, %in%, %*% are binary operators (respectively modulo, match, and matrix multiply), just like a +, -, etc. They are functions that operate on two arguments that R recognizes as being special due to their name structure (starts and ends with a %). This allows you to use them in form:

Argument1 %fun_name% Argument2

instead of the more traditional:

fun_name(Argument1, Argument2)

Keep in mind that the following are equivalent:

10 %% 2 == `%%`(10, 2)
"hello" %in% c("hello", "world") == `%in%`("hello", c("hello", "world"))
10 + 2 == `+`(10, 2)

R just recognizes the standard operators as well as the %x% operators as special and allows you to use them as traditional binary operators if you don't quote them. If you quote them (in the examples above with backticks), you can use them as standard two argument functions.

Custom Binary Operators

The big difference between the standard binary operators and %x% operators is that you can define custom binary operators and R will recognize them as special and treat them as binary operators:

`%samp%` <- function(e1, e2) sample(e1, e2)
1:10 %samp% 2
# [1] 1 9

Here we defined a binary operator version of the sample function

"%" (Percent) as a token in special function

The meaning of "%" in function like sprintf or format is completely different and has nothing to do with binary operators. The key thing to note is that in those functions the % character is part of a quoted string, and not a standard symbol on the command line (i.e. "%" and % are very different). In the context of sprintf, inside a string, "%" is a special character used to recognize that the subsequent characters have a special meaning and should not be interpreted as regular text. For example, in:

sprintf("I'm a number: %.2f", runif(3))
# [1] "I'm a number: 0.96" "I'm a number: 0.74" "I'm a number: 0.99"

"%.2f" means a floating point number (f) to be displayed with two decimals (.2). Notice how the "I'm a number: " piece is interpreted literally. The use of "%" allows sprintf users to mix literal text with special instructions on how to represent the other sprintf arguments.



回答2:

The R Language Definition, section 3.1.4 refers to them as "special binary operators". One of the ways they're special is that users can define new binary operators using the %x% syntax (where x is any valid name).

The Writing your own functions section of An Introduction to R, refers to them as Binary Operators (which is somewhat confusing because + is also a binary operator):

10.2 Defining new binary operators

Had we given the bslash() function a different name, namely one of the form

%anything%

it could have been used as a binary operator in expressions rather than in function form. Suppose, for example, we choose ! for the internal character. The function definition would then start as

> "%!%" <- function(X, y) { ... }

(Note the use of quote marks.) The function could then be used as X %!% y. (The backslash symbol itself is not a convenient choice as it presents special problems in this context.)

The matrix multiplication operator, %*%, and the outer product matrix operator %o% are other examples of binary operators defined in this way.



回答3:

They don’t have a special name as far as I know. They are described in R operator syntax and precedence.

The %anything% operators are just normal functions, which can be defined by yourself. You do need to put the name of the operator in backticks (`…`), though: this is how R treats special names.

`%test%` = function (a, b) a * b

2 %test% 4
# 8

The sprintf format strings are entirely unrelated, they are not operators at all. Instead, they are just the conventional C-style format strings.



回答4:

The help file, and the general entry, is indeed a good starting point: ?'%in%'

For example, you can see how the operator '%in%' is defined:

"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0

You can even create your own operators:

'%ni%' <- Negate('%in%')