Brackets make a vector different. How exactly is v

Posted 2019-01-28 12:34

Question:

I have a data frame as follows:

planets  type                diameter  rotation  rings
Mercury  Terrestrial planet     0.382     58.64  FALSE
Venus    Terrestrial planet     0.949   -243.02  FALSE
Earth    Terrestrial planet     1.000      1.00  FALSE
Mars     Terrestrial planet     0.532      1.03  FALSE
Jupiter  Gas giant             11.209      0.41  TRUE
Saturn   Gas giant              9.449      0.43  TRUE
Uranus   Gas giant              4.007     -0.72  TRUE
Neptune  Gas giant              3.883      0.67  TRUE

I wanted to select the last 3 rows:

planets_df[nrow(planets_df)-3:nrow(planets_df),]

However, I've got something I didn't expect:

planets  type                diameter  rotation  rings
Jupiter  Gas giant             11.209      0.41  TRUE
Mars     Terrestrial planet     0.532      1.03  FALSE
Earth    Terrestrial planet     1.000      1.00  FALSE
Venus    Terrestrial planet     0.949   -243.02  FALSE
Mercury  Terrestrial planet     0.382     58.64  FALSE

Through trial and error, I've learned that

> (nrow(planets_df)-3):nrow(planets_df)
[1] 5 6 7 8

and

> nrow(planets_df)-3:nrow(planets_df)
[1] 5 4 3 2 1 0

How exactly does R evaluate the : operator (with respect to parentheses)?

Answer 1:

The colon operator takes precedence over the binary arithmetic operators (*, /, + and -). It is always best to experiment with examples to internalize the logic:

2*2:6-1

What answer should we expect? Some would say 4 5. The thinking is that it will simplify to 2*2=4 and 6-1=5, therefore 4:5.

2*2:6-1
[1]  3  5  7  9 11

This answer will surprise anyone who hasn't considered the order of operations in play. The expression 2*2:6-1 is simplified differently. The sequence 2:6 is carried out first, then the multiplication, and finally the subtraction. We could write it out as 2 * (2 3 4 5 6), which is 4 6 8 10 12, and then subtract 1 from that to get 3 5 7 9 11.
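To make the order of operations concrete, here is a small sketch that evaluates the same expression one step at a time. The variable names s, m and r are only illustrative and do not come from the original post.

```r
# Step-by-step evaluation of 2*2:6-1, following R's operator precedence.
# The names s, m and r are illustrative only.
s <- 2:6      # the colon binds first:   2 3 4 5 6
m <- 2 * s    # then the multiplication: 4 6 8 10 12
r <- m - 1    # finally the subtraction: 3 5 7 9 11
print(r)

# The stepwise result matches the one-line expression.
stopifnot(identical(r, 2*2:6-1))
```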

By grouping with parentheses we can control the order of operations, just as we would in basic arithmetic, to get the answer that we first expected.

(2*2):(6-1)
[1] 4 5

You can apply this reasoning to your example to investigate the seemingly odd behavior of the : operator.

Now that you know the secret codes, what should we expect from (2*2):6-1?
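If you want to check your guess, one way (a sketch, not the only way) is to compare the expression against an explicitly parenthesized version. Here the parentheses only group the start of the sequence, so the colon still runs before the subtraction; the names a and b are illustrative.

```r
# (2*2):6-1 is read as ((2*2):6) - 1, i.e. (4:6) - 1.
a <- (2*2):6-1
b <- (4:6) - 1
print(a)

# Both spellings give the same vector.
stopifnot(isTRUE(all.equal(a, b)))
```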



Answer 2:

The colon : separates the starting point from the end point of a sequence. It is treated with higher priority than the + or - operator. Therefore,

nrow(planets_df)-3:nrow(planets_df)

is equal to

nrow(planets_df) - (3:nrow(planets_df))

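If you want the last few entries using this syntax, you need to put the entire expression that defines the start of the sequence in parentheses. Note that subtracting 3 selects rows 5 through 8, which is four rows; subtract 2 instead if you want exactly the last three: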

planets_df[(nrow(planets_df)-3):nrow(planets_df),]
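One way to see how R groups the expression is to inspect the unevaluated call with quote(). The sketch below assumes n stands in for nrow(planets_df) (8 in the question); the names e, idx4 and idx3 are illustrative.

```r
# n is a stand-in for nrow(planets_df); the question's data frame has 8 rows.
n <- 8L

# Inspect how R parses the unparenthesized expression.
e <- quote(n - 3:n)
print(as.character(e[[1]]))  # the outermost call is the subtraction
print(deparse(e[[3]]))       # its right-hand operand is the whole sequence 3:n

# So the two spellings evaluate to the same vector.
stopifnot(identical(n - 3:n, n - (3:n)))

# Parenthesizing the start of the sequence gives ascending row indices.
idx4 <- (n - 3):n   # 5 6 7 8 (four rows)
idx3 <- (n - 2):n   # 6 7 8   (exactly the last three)
print(idx4)
print(idx3)
```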


Answer 3:

nrow(planets_df)-3:nrow(planets_df) is being evaluated as 8 - (3:8) or

(8-3) (8-4) (8-5) (8-6) (8-7) (8-8) = 5 4 3 2 1 0

An index of 0 selects nothing, which is why only five rows (5, 4, 3, 2, 1) come back.

For future reference, if you want the last few rows, use tail(planets_df, 3).
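Here is a small sketch that puts the pieces together. It rebuilds a trimmed two-column copy of the question's data frame (only planets and diameter, which is an assumption made to keep the example short) and compares the original indexing with tail(). The names wrong, last3 and by_index are illustrative.

```r
# A trimmed copy of the question's data (two columns only, for brevity).
planets_df <- data.frame(
  planets  = c("Mercury", "Venus", "Earth", "Mars",
               "Jupiter", "Saturn", "Uranus", "Neptune"),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  stringsAsFactors = FALSE
)
n <- nrow(planets_df)

# The original attempt: indices 5 4 3 2 1 0, where the 0 is silently dropped.
wrong <- planets_df[n - 3:n, ]
print(wrong$planets)

# tail() returns the last rows without any index arithmetic.
last3 <- tail(planets_df, 3)
print(last3$planets)

# The equivalent explicit index, with the start of the sequence parenthesized.
by_index <- planets_df[(n - 2):n, ]
stopifnot(identical(last3$planets, by_index$planets))
```

The tail() version is usually easier to read because the number of rows you want appears directly in the call.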



Tags: r vector subset