I have a list of names like:
names = ['A', 'B', 'C', 'D']
and a list of documents, each of which mentions some of these names:
document = [['A', 'B'], ['C', 'B', 'K'], ['A', 'B', 'C', 'D', 'Z']]
I would like to get an output as a matrix of co-occurrences like:
  A B C D
A 0 2 1 1
B 2 0 2 1
C 1 2 0 1
D 1 1 1 0
There is a solution to this problem in R (Creating co-occurrence matrix), but I couldn't reproduce it in Python. I am thinking of doing it in Pandas, but I have made no progress yet.
Here is another solution using itertools and the Counter class from the collections module. The output (which could easily be turned into a DataFrame) is:
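A minimal sketch of the itertools and Counter approach might look like the following. It assumes that a pair is counted at most once per document and that names not listed in `names` (such as 'K' and 'Z') are ignored; the helper names `cooccurrence_counts` and `to_rows` are made up for this illustration.

```python
from collections import Counter
from itertools import combinations

names = ['A', 'B', 'C', 'D']
document = [['A', 'B'], ['C', 'B', 'K'], ['A', 'B', 'C', 'D', 'Z']]


def cooccurrence_counts(names, documents):
    """Count, for each unordered pair of names, the documents containing both."""
    wanted = set(names)
    counts = Counter()
    for doc in documents:
        # Keep only the names of interest; sorting gives each pair one canonical order.
        present = sorted(wanted.intersection(doc))
        counts.update(combinations(present, 2))
    return counts


def to_rows(names, counts):
    """Expand the pair counts into a symmetric list-of-lists matrix."""
    # A Counter returns 0 for missing keys, so the diagonal and absent pairs are 0.
    return [[counts[tuple(sorted((a, b)))] for b in names] for a in names]


counts = cooccurrence_counts(names, document)
rows = to_rows(names, counts)

print(' ', *names)
for name, row in zip(names, rows):
    print(name, *row)
```

For the sample data this prints the same matrix as in the question. If pandas is installed, `pd.DataFrame(rows, index=names, columns=names)` should give the labelled DataFrame.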