From asymmetric matrix (or dataframe) into a symme

Posted 2019-07-17 14:54

Given this data.frame, 'sample', which represents pairwise wins and losses among species:

     sp1<-c(0,1,0)
     sp3<-c(1,2,2)
     sp5<-c(3,1,0)
     sample<-as.data.frame(cbind(sp1,sp3,sp5))
     rownames(sample)<-c("sp1","sp6","sp8")

which should look like this:

    sp1 sp3 sp5
sp1   0   1   3
sp6   1   2   1
sp8   0   2   0

How do I modify 'sample' so that it has the same column names as row names, and vice versa, with the added rows and columns filled with zeros, so that the data.frame is symmetric and looks like the one shown below? (I prefer a data.frame because I am afraid I am not good with matrices.)

    sp1 sp3 sp5 sp6 sp8
sp1   0   1   3   0   0
sp3   0   0   0   0   0
sp5   0   0   0   0   0
sp6   1   2   1   0   0
sp8   0   2   0   0   0

The real data has around 150 rows and columns, so I don't really want to do it manually in Excel. This format is required in order to apply another function concerning competitive species interaction outcomes (columns: wins, rows: losses).

1 answer
兄弟一词,经得起流年.
Reply #2 · 2019-07-17 15:37

The output you show is not actually a symmetric matrix (it is square, but it does not equal its transpose). If it is the output you want, though, one way to get it is with stack and xtabs. The key to making the matrix square is to convert the row and column names to factors that share the same set of levels.

## Extract and sort the unique combination of row and column names.
## This will be used when creating our factors.
NAMES <- sort(unique(c(rownames(sample), colnames(sample))))
## "stack" your data.frame, reintroducing the rownames
##   which get dropped in the stacking process
temp <- data.frame(rows = rownames(sample), stack(sample))
## Your stacked data looks like this:
temp
#   rows values ind
# 1  sp1      0 sp1
# 2  sp6      1 sp1
# 3  sp8      0 sp1
# 4  sp1      1 sp3
# 5  sp6      2 sp3
# 6  sp8      2 sp3
# 7  sp1      3 sp5
# 8  sp6      1 sp5
# 9  sp8      0 sp5

## Factor the row and column names
temp$rows <- factor(temp$rows, NAMES)
temp$ind <- factor(temp$ind, NAMES)

## Use xtabs to get your desired output. Wrap it in
##    as.data.frame.matrix to get a data.frame as output
as.data.frame.matrix(xtabs(values ~ rows + ind, temp))
#     sp1 sp3 sp5 sp6 sp8
# sp1   0   1   3   0   0
# sp3   0   0   0   0   0
# sp5   0   0   0   0   0
# sp6   1   2   1   0   0
# sp8   0   2   0   0   0 
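
If you would rather avoid the reshaping step, here is a minimal alternative sketch in base R. It is not the stack/xtabs method above but a different technique: build an all-zero square matrix and copy the observed values in by name. The object names full and result are made up for illustration, and the sketch assumes every row and column name in 'sample' is unique.

```r
## Rebuild the example data from the question
sample <- data.frame(sp1 = c(0, 1, 0),
                     sp3 = c(1, 2, 2),
                     sp5 = c(3, 1, 0),
                     row.names = c("sp1", "sp6", "sp8"))

## Every species name that appears as a row or as a column
NAMES <- sort(unique(c(rownames(sample), colnames(sample))))

## Start from an all-zero square matrix with matching dimnames
## ("full" is a hypothetical name used only in this sketch)
full <- matrix(0, nrow = length(NAMES), ncol = length(NAMES),
               dimnames = list(NAMES, NAMES))

## Copy the observed values into the matching rows and columns,
##    indexing by name rather than by position
full[rownames(sample), colnames(sample)] <- as.matrix(sample)

## The matrix is square, but not symmetric in the mathematical sense
isSymmetric(full)
# [1] FALSE

## Convert back to a data.frame if that is the preferred format
result <- as.data.frame(full)
result
```

Because the assignment is done by name, the order of the rows and columns in 'sample' does not matter, which should also hold for the full data with around 150 species.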