I have been trying, for some time, to build a matrix populated by the counts of elements in common between two hierarchical lists.
Here is some dummy data:
site<-c('A','A','A','A','A','A','A','A','A','B','B','B','B','B','B')
group<-c('A1','A1','A2','A2','A2','A3','A3','A3','A3',
'B1','B1','B2','B2','B2','B2')
element<-c("red","orange","blue","black","white", "black","cream","yellow","purple","red","orange","blue","white","gray","salmon")
# data.frame rather than cbind, so that columns can be accessed with $
d<-data.frame(site,group,element)
I created a list structure, assuming the solution would have to be procedural because each list has a different number of elements. Also, I don't want every possible comparison between groups, only comparisons between groups from different sites.
#first level list - by site
sitelist<-split(d, list(d$site),drop = TRUE)
#second level list - by group within each site
nestedlist <- lapply(sitelist, function(x) split(x, x[['group']], drop = TRUE))
My intention is to create a table or matrix with the number of elements in common between groups from the two sites (my original data has additional sites). Like this:
   A1 A2 A3
B1  2  0  0
B2  0  2  0
The nested nature of this problem is challenging to me. I am not very familiar with lists, as I've mostly solved problems using data frames. My attempt boiled down to the code below. I felt it got close, but it has many shortcomings, mainly in the syntax of the loops.
t <- outer(1:length(d$A),
1:length(d$B),
FUN=function(i,j){
sapply(1:length(i),
FUN=function(x)
length(intersect(d$A[[i]]$element, d$B[[j]]$element)) )
})
Any help would be much appreciated. Apologies if a similar problem has been solved. I have scoured the internet, but have not found it, or did not comprehend the solution to make it transferable to mine.
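For reference, here is a minimal sketch of how the outer() idea above could be made to run. It is an illustration under assumptions, not a definitive solution: the data are rebuilt as a data frame, and the names by_group, groups_a, groups_b and n_common are hypothetical helpers introduced here. The key point is that outer() passes whole vectors to FUN, so the pairwise function has to be wrapped in Vectorize().

```r
# Dummy data from the question, rebuilt as a data frame
site <- c(rep("A", 9), rep("B", 6))
group <- c("A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3",
           "B1", "B1", "B2", "B2", "B2", "B2")
element <- c("red", "orange", "blue", "black", "white", "black", "cream",
             "yellow", "purple", "red", "orange", "blue", "white", "gray",
             "salmon")
d <- data.frame(site, group, element, stringsAsFactors = FALSE)

# Named list: one character vector of elements per group (hypothetical helper)
by_group <- split(d$element, d$group)

# Group labels belonging to each site
groups_a <- unique(d$group[d$site == "A"])
groups_b <- unique(d$group[d$site == "B"])

# Number of shared elements for ONE pair of groups
n_common <- function(b, a) length(intersect(by_group[[b]], by_group[[a]]))

# outer() calls FUN once with expanded vectors, hence Vectorize()
res_outer <- outer(groups_b, groups_a, Vectorize(n_common))
dimnames(res_outer) <- list(groups_b, groups_a)
print(res_outer)
```

With more than two sites, the same call could be repeated for each pair of sites by changing which labels go into groups_a and groups_b.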
Instead of rowwise you can use a vectorised function that will be (automatically) applied to each row, like this:

A similar approach to @Parfait's using matrix multiplication. You may need to play around with the data generation to extend it to your application:
Results:
Consider matrix multiplication, x %*% y (see ?matmult), by creating a helper matrix of unique element values by unique group values, assigning ones in each corresponding cell. Then multiply the transpose of that matrix by the matrix itself, and subset the rows and columns of the result:

Even shorter version thanks to @Lamia:
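The code from these answers did not survive here, so the following is only a sketch of the matrix-multiplication idea as described, not the original code by @Parfait or @Lamia. It assumes the dummy data from the question stored as a data frame; m, common, a_groups, b_groups and res are names chosen for illustration. table() builds the element-by-group helper matrix, and crossprod(m) computes t(m) %*% m.

```r
# Dummy data from the question, as a data frame
site <- c(rep("A", 9), rep("B", 6))
group <- c("A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3",
           "B1", "B1", "B2", "B2", "B2", "B2")
element <- c("red", "orange", "blue", "black", "white", "black", "cream",
             "yellow", "purple", "red", "orange", "blue", "white", "gray",
             "salmon")
d <- data.frame(site, group, element, stringsAsFactors = FALSE)

# Helper matrix: one row per unique element, one column per unique group
m <- unclass(table(d$element, d$group))
m[m > 0] <- 1  # keep it 0/1 in case an element repeats within a group

# t(m) %*% m: cell [g1, g2] counts the elements shared by groups g1 and g2
common <- crossprod(m)

# Keep only site-B groups as rows and site-A groups as columns
a_groups <- unique(d$group[d$site == "A"])
b_groups <- unique(d$group[d$site == "B"])
res <- common[b_groups, a_groups, drop = FALSE]
print(res)
```

The diagonal of common holds each group's own element count, which is why the final subset step is needed to keep only the between-site comparisons.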