R syntax for selecting all but the first two rows

Posted 2019-02-11 12:58

How do I select all but the first two rows from e.g. the mtcars dataset?

I know that I can write no_mazda <- mtcars[3:32], which does work as long as I know the number of rows. But when I don't know the number of rows I need to write e.g. no_mazda <- mtcars[3:nrow(mtcars)], which of course also works, but:

Does R provide a smarter syntax than an expression that includes mtcars twice?

Tags: r syntax row
3 answers
狗以群分
#2 · 2019-02-11 13:07

I prefer using tail with negative values for n:

tail(mtcars,-2)
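
A minimal sketch of what the negative n does, assuming only base R and the built-in mtcars data (the variable names rest and without_last2 are just for illustration):

```r
# mtcars ships with base R (package "datasets"), so nothing needs installing.
rest <- tail(mtcars, -2)   # negative n: drop the first 2 rows, keep the rest

nrow(rest)                 # two fewer rows than nrow(mtcars)
rownames(rest)[1]          # the third car in the original data

# head() mirrors this: a negative n drops rows from the end instead.
without_last2 <- head(mtcars, -2)
nrow(without_last2)
```

Neither call needs to know the number of rows, and mtcars appears only once in each expression.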
闹够了就滚
#3 · 2019-02-11 13:14

Negative indices mean "skip":

mtcars[-(1:2)]

skips the first 2 elements of the vector mtcars. If you need to skip the first 10, just use mtcars[-(1:10)].

Note that you speak about a "dataset" but the code you use is for vectors, so I also responded as if mtcars were a vector. If mtcars is a data frame and you are selecting rows, you have to use a trailing comma:

mtcars[-(1:2),]
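
To make the difference concrete, here is a small sketch using the built-in mtcars data frame (32 rows, 11 columns); the helper variable mpg is only for illustration:

```r
# Single index, no comma: a data frame is treated like a list of columns,
# so this drops the first 2 COLUMNS.
dim(mtcars[-(1:2)])

# Index before the comma: this drops the first 2 ROWS and keeps all columns.
dim(mtcars[-(1:2), ])

# A plain vector has only one dimension, so no comma is needed.
mpg <- mtcars$mpg
length(mpg[-(1:2)])     # two elements shorter than mpg
```

The parentheses around 1:2 matter: -1:2 would be parsed as (-1):2, which mixes negative and positive indices and raises an error.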
家丑人穷心不美
#4 · 2019-02-11 13:27

If you happen to be using a data.table (and why would anyone not use it, if you are using a data.frame anyway?), then you can use the handy .N symbol, which in essence holds the number of rows in your table.

Here's a working example:

# make sure you have data.table
install.packages("data.table")
library(data.table)

# load the mtcars data
data(mtcars)
# Make a data table out of the mtcars dataset
cars <- as.data.table(mtcars, keep.rownames = TRUE)

# Take all the rows from a given index (e.g. 5) to the end
> cars[5:.N]
                     rn  mpg cyl  disp  hp drat    wt  qsec vs am gear carb
 1:   Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
 2:             Valiant 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
 3:          Duster 360 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
 4:           Merc 240D 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 

... (truncated)

Just swap that 5 for a 3 to get the OP's desired output: rows 3 through .N, i.e. everything but the first two rows.

This of course allows dynamic use for tables of varying lengths, without having to call nrow() every time. For example, if you know that you always want to take the last 5 rows of a table and remove the very last row, getting 4 rows as output, then you can do something like the following:

> cars[(.N-4):(.N-1)]    # note the expressions for slicing must be in parentheses
           rn  mpg cyl  disp  hp drat    wt qsec vs am gear carb
1:   Lotus Europa 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
2: Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
3:   Ferrari Dino 19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
4:  Maserati Bora 15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8

Or simply always get the last row:

cars[.N]

... which is just as nice and concise as Python's equivalent: cars[-1]
