How can I use the row.names attribute to order the

Posted 2019-03-14 17:04

I created a random forest and predicted the classes of my test set, which are living happily in a dataframe:

row.names   class  
564028      1
275747      1
601137      0
922930      1
481988      1
...

The row.names attribute tells me which row is which, before I did various operations that scrambled the order of the rows during the process. So far so good.

Now I would like to get a general feel for the accuracy of my predictions. To do this, I need to take this dataframe and reorder it in ascending order according to the row.names attribute. This way, I can compare the observations, row-wise, to the labels, which I already know.

Forgive me for asking such a basic question, but for the life of me, I can't find a good source of information regarding how to do such a trivial task.

The documentation implores me to:

use attr(x, "row.names") if you need to retrieve an integer-valued set of row names.

but this leaves me with nothing but NULL.

My question is, how can I use row.names which has been loyally following me around in the various incarnations of dataframes throughout my workflow? Isn't this what it is there for?

8 answers
不美不萌又怎样
#2 · 2019-03-14 17:20

If your dataframe has only one column, as in my case, you have to add drop = FALSE; otherwise the result collapses to a plain vector and the row names are lost:

df[order(rownames(df)), , drop = FALSE]
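To see why drop = FALSE matters here, the following is a minimal sketch using a made-up one-column data frame (the values and row names are assumptions for illustration, not the asker's data):

```r
# Hypothetical one-column data frame with scrambled row names
df <- data.frame(class = c(1, 1, 0),
                 row.names = c("564028", "275747", "601137"))

# Without drop = FALSE the single column collapses to a plain vector
dropped <- df[order(rownames(df)), ]

# With drop = FALSE the result stays a data frame and keeps its row names
kept <- df[order(rownames(df)), , drop = FALSE]

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE
rownames(kept)          # "275747" "564028" "601137"
```

Spelling out FALSE rather than F is a little safer, since F is an ordinary variable that can be reassigned.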
虎瘦雄心在
#3 · 2019-03-14 17:25

This worked for me:

new_df <- df[ order(row.names(df)), ]
太酷不给撩
#4 · 2019-03-14 17:27

Assuming your data frame is named 'df', you can create a new ordered data frame 'ord.df' that contains the row names of df as well as its values with the following one line of code:

ord.df <- cbind(rownames(df)[order(rownames(df))], df[order(rownames(df)), ])
聊天终结者
#5 · 2019-03-14 17:30

None of the solutions would actually work. It should be:

df[ order(as.numeric(row.names(df))),] #assuming the data frame is called df

because row names in R are of type 'character'. When the as.numeric part is missing, the data are arranged as 1, 10, 11, and so on.
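The difference can be checked with a small sketch. The data frame below is invented for illustration, and drop = FALSE is added only because it has a single column:

```r
# Hypothetical data frame whose row names are numbers stored as text
df <- data.frame(class = c(0, 1, 1, 0),
                 row.names = c("10", "2", "1", "11"))

# Ordering the character row names directly sorts them as text
by_text <- df[order(row.names(df)), , drop = FALSE]

# Converting to numeric first sorts them by value
by_number <- df[order(as.numeric(row.names(df))), , drop = FALSE]

row.names(by_text)    # "1" "10" "11" "2"
row.names(by_number)  # "1" "2" "10" "11"
```

If the row names all have the same number of digits, the two orderings coincide, which may be why the text-based versions appear to work on some data.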

我想做一个坏孩纸
#6 · 2019-03-14 17:33

Indexing with "[" by a character vector matches each element against rownames() and returns the rows in the order of that vector, so this on its own leaves the rows in their current order:

df[ rownames(df) , ]

To actually reorder the rows you need order():

df[ order(rownames(df)) , ]

But note that for row names 1:100 this gives the ordering 1, 10, 100, 11, 12, ..., 2, 20, 21, ..., because rownames() returns a character vector and order() sorts it lexically.

ゆ 、 Hurt°
#7 · 2019-03-14 17:33

You can simply sort your df using this:

df <- df[sort(rownames(df)),]

and then do what you want! Note that sort() on row names is a character sort, so "10" comes before "2".
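As a rough sketch of the follow-up step from the question (comparing predictions with known labels), assuming a hypothetical `labels` vector whose positions correspond to the original row numbers 1 to 4; all values here are made up:

```r
# Hypothetical predictions with scrambled row names
pred <- data.frame(class = c(1, 0, 1, 1),
                   row.names = c("3", "1", "4", "2"))

# Assumed true classes for original rows 1..4
labels <- c(0, 1, 0, 1)

# Restore the original order numerically, keeping the data frame shape
pred <- pred[order(as.numeric(rownames(pred))), , drop = FALSE]

# Share of rows where the predicted class equals the known label
accuracy <- mean(pred$class == labels)
accuracy  # 0.75
```

If the predictions cover only a subset of the original rows, the labels would need to be subset to the same rows before comparing, for example with `labels[as.numeric(rownames(pred))]`.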
