Is there a reason to prefer '&&' over '&'?

Posted 2020-02-26 07:51

Question:

Yes I know, there have been a number of questions (see this one, for example) regarding the usage of & vs. && in R, but I have not found one that specifically answers my question.

As I understand the differences,

  • & does element-wise, vectorised comparison, much like the other arithmetic operations. It hence returns a logical vector that has length > 1 if both arguments have length > 1.
  • && compares the first elements of both vectors and always returns a result of length 1. Moreover, it does short-circuiting: cond1 && cond2 && cond3 && ... only evaluates cond2 if cond1 is TRUE, and so forth. This allows for things like if(exists("is.R") && is.function(is.R) && is.R()) and particularly means that using && is strictly necessary in some cases.
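A minimal sketch of the two behaviours described in the list above. The vectors and the name no_such_function are made up for illustration; the example sticks to length-one operands for && so that it does not depend on how a given R version treats longer ones.

```r
# Made-up example vectors
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

# '&' works element by element and returns one result per position
elementwise <- x & y
print(elementwise)

# '&&' short-circuits: the right-hand side is skipped once the left is FALSE
rhs_ran <- FALSE
short <- FALSE && { rhs_ran <- TRUE; TRUE }
print(short)
print(rhs_ran)

# This is what makes guards safe: is.function() is never reached here,
# so the missing object (a hypothetical name) is never looked up
guarded <- exists("no_such_function") && is.function(no_such_function)
print(guarded)
```

Swapping && for & in the last line would evaluate both sides and fail with an "object not found" error.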

Moreover, an if statement issues the warning

the condition has length > 1 and only the first element will be used

if its condition has more than one element.
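The sketch below catches whatever if signals for a length-two condition. As far as I know, older R versions raise the warning quoted above, while R 4.2.0 and later raise an error instead, so the handler covers both cases; cond is a made-up vector.

```r
# Made-up condition with more than one element
cond <- c(TRUE, FALSE)

# Catch either a warning (older R) or an error (newer R) from 'if'
outcome <- tryCatch(
  {
    if (cond) "ran" else "skipped"
  },
  warning = function(w) paste("warning:", conditionMessage(w)),
  error   = function(e) paste("error:", conditionMessage(e))
)
print(outcome)
```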

Judging from these preliminaries, I'd consider it safer to prefer & to && in all if statements where short-circuiting isn't required.

If something went wrong during calculations and I accidentally have a vector in one of &'s arguments, I get a warning, which is good. If not, everything is fine as well.

If, on the other hand, I used &&, and something went wrong in my calculations and one of &&'s arguments is a vector, I don't get a warning. This is bad. If, for some reason, I really want to compare the first elements of two vectors, I'd argue that it's much cleaner to do so explicitly rather than implicitly.
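To illustrate what "explicitly" could look like here, a small sketch with made-up vectors a and b: the first elements are picked out by hand, so both operands of && are guaranteed to have length one.

```r
# Made-up vectors of different lengths
a <- c(TRUE, FALSE, FALSE)
b <- c(TRUE, TRUE)

# Explicitly select the first element of each before combining them
first_both <- a[1] && b[1]

if (first_both) print("both first elements are TRUE")
```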

Note that this is contrary to what seems to be common agreement between R programmers and contrary to what the R docs recommend. (1)

Hence my question: Are there any reasons except short-circuiting that make && preferable to & in if statements?


(1) Citing help("&&"):

'&' and '&&' indicate logical AND and '|' and '||' indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in 'if' clauses.

Answer 1:

Short answer: Yes, the different symbol makes the meaning more clear to the reader.

Thanks for this interesting question! If I can summarize, it seems to be a follow-up specifically about this section of my answer to the question you linked,

... you want to use the long forms only when you are certain the vectors are length one. You should be absolutely certain your vectors are only length 1, such as in cases where they are functions that return only length 1 booleans. You want to use the short forms if the vectors are length possibly >1. So if you're not absolutely sure, you should either check first, or use the short form and then use all and any to reduce it to length one for use in control flow statements, like if.
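As a rough sketch of the "short form, then all and any" advice in that quote (the readings vector and the status strings are made up for illustration):

```r
# Made-up data that may have any length
readings <- c(3, 8, 15)

# Short form: element-wise test, one logical per reading
in_range <- readings > 0 & readings < 10

# Reduce to a single TRUE/FALSE before handing it to 'if'
status <- if (all(in_range)) {
  "every reading in range"
} else if (any(in_range)) {
  "some readings in range"
} else {
  "no readings in range"
}
print(status)
```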

I hear your question (given comments) this way: But & and && will do the same thing if the inputs are length one, so other than short-circuiting, why prefer &&? Perhaps & should be preferred because if they're not length one, if will give me a warning, helping me be even more certain that the inputs are length one.

First, I agree with the comment by @James that you may be "overstating the value of getting a warning"; if it's not length one, the safer thing will be to handle this appropriately, not to just plow ahead. You could make a case that && should throw an error if they're not length one, and perhaps a good case; I don't know the reason why it does what it does. But without going back in time, the best we can do now is to check that the inputs are indeed appropriate for your use.
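One way such a check could look is sketched below. The helper assert_flag is hypothetical (it is not a base R function), and it is only one of several reasonable ways to do this.

```r
# Hypothetical helper: stop unless x is a single, non-missing logical value
assert_flag <- function(x, name = deparse(substitute(x))) {
  if (!is.logical(x) || length(x) != 1L || is.na(x)) {
    stop(sprintf("'%s' must be a single TRUE or FALSE", name), call. = FALSE)
  }
  invisible(x)
}

a <- TRUE
b <- c(TRUE, FALSE)  # accidentally a vector

# The first check passes silently; the second stops with a clear message
checked <- tryCatch(
  {
    assert_flag(a)
    assert_flag(b)
    "passed"
  },
  error = function(e) conditionMessage(e)
)
print(checked)
```

With the inputs validated like this, a following if (a && b) can only ever see scalars.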

Given, then, that you have checked to make sure your inputs are appropriate, I would still recommend && because it semantically reminds me as the reader that I should be making sure the inputs are scalars (length one). I'm so used to thinking vector-ally that this reminder is helpful to me. It follows the principle that different operations should have different symbols, and for me, an operation that is meant for use on scalars is different enough from a vectorized operation that it warrants a different symbol.

(Not to start a flame war (I hope), but this is also why I prefer <- to =; one for assigning variables, one for setting parameters to functions. Although deep down this is the same thing, it's different enough in practice to make the different symbols helpful to me as a reader.)



Answer 2:

No, using && does not offer any advantages other than short-circuiting.

However, short-circuiting is very much preferable for control flow, so much so that it should be the default. if statements should not take vectorised arguments; that's what ifelse is for. If you are passing a logical vector into if, you would typically contract it to a single logical value first, using any or all.
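A small sketch of that division of labour, with a made-up temps vector: ifelse handles the per-element decision, while if only ever sees a single value produced by any or all.

```r
# Made-up data
temps <- c(12, 25, 31)

# Vectorised choice: one label per element
labels <- ifelse(temps > 20, "warm", "cool")
print(labels)

# Control flow: contract the vector to one logical value first
any_warm <- any(temps > 20)
all_warm <- all(temps > 20)
summary_msg <- if (any_warm && !all_warm) "mixed" else "uniform"
print(summary_msg)
```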

The major advantages of short-circuiting are in avoiding lengthy or failure-prone steps (e.g. internet connections, though these should be dealt with through try):

# avoiding lengthy calculations
system.time(if(FALSE & {Sys.sleep(2);TRUE}) print("Hello"))
#>    user  system elapsed
#>    0.00    0.00    1.99
system.time(if(FALSE && {Sys.sleep(2);TRUE}) print("Hello"))
#>    user  system elapsed
#>       0       0       0

# avoiding errors
if(FALSE & {stop("Connection Failed");TRUE}) print("Success") else print("Condition not met")
#> Error: Connection Failed
if(FALSE && {stop("Connection Failed");TRUE}) print("Success") else print("Condition not met")
#> [1] "Condition not met"

It is clear that in order to take advantage of these features, you would have to know in advance which steps take the longest or are prone to errors and construct the logical statement appropriately.
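To make the ordering point concrete, here is a sketch in which a counter records how often the costly step runs. The functions cheap_check and expensive_check are hypothetical stand-ins, not real API calls.

```r
# Counter for how many times the costly step actually runs
expensive_calls <- 0L

# Hypothetical stand-in for a slow or failure-prone step
expensive_check <- function() {
  expensive_calls <<- expensive_calls + 1L
  TRUE
}

# Hypothetical stand-in for a quick sanity check
cheap_check <- function(v) length(v) > 0

input <- numeric(0)

# Cheap test first: it is FALSE, so the costly step is skipped
cheap_first <- cheap_check(input) && expensive_check()
calls_after_cheap_first <- expensive_calls

# Costly test first: it always runs, even though the result is the same
costly_first <- expensive_check() && cheap_check(input)
calls_after_costly_first <- expensive_calls

print(c(calls_after_cheap_first, calls_after_costly_first))
```

Both expressions evaluate to FALSE; only the order of the operands decides whether the costly step is paid for.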