I am trying to create a cross-product matrix of unique users in R. I searched for it on SO but could not find what I was looking for. I have a large data frame (over a million rows); a small sample is shown below:
df <- data.frame(Products = c('Product a', 'Product b', 'Product a',
                              'Product c', 'Product b', 'Product c'),
                 Users = c('user1', 'user1', 'user2', 'user1',
                           'user2', 'user3'))
Output of df is:
   Products Users
1 Product a user1
2 Product b user1
3 Product a user2
4 Product c user1
5 Product b user2
6 Product c user3
I would like to see two matrices. The first one will show, for each pair of products, the number of unique users that used either product (OR), so the output will be something like:
          Product a Product b Product c
Product a                   2         3
Product b         2                   3
Product c         3         3
The second matrix will show, for each pair of products, the number of unique users that used both products (AND):
          Product a Product b Product c
Product a                   2         1
Product b         2                   1
Product c         1         1
Any help is appreciated.
Thanks
UPDATE:
To clarify: Product a is used by user1 and user2, Product b is used by user1 and user2, and Product c is used by user1 and user3. So in the first matrix, the entry for Product a and Product b will be 2, since there are 2 unique users across the two products. Similarly, the entry for Product a and Product c will be 3. In the second matrix, those entries would be 2 and 1 respectively, since I want the intersection. Thanks
Try
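A minimal sketch of one way to build the first (OR) matrix, assuming the sample `df` from the question. The names `users_by_product` and `or_mat` are chosen here for illustration, and this is not necessarily the code originally posted at this point. It collects the set of users per product and counts the size of the union for every pair:

```r
# Sample data from the question
df <- data.frame(Products = c('Product a', 'Product b', 'Product a',
                              'Product c', 'Product b', 'Product c'),
                 Users = c('user1', 'user1', 'user2', 'user1',
                           'user2', 'user3'))

# One vector of unique users per product (illustrative name)
users_by_product <- lapply(split(as.character(df$Users), df$Products), unique)

# For every pair of products, count the users in the union of the two sets
or_mat <- sapply(users_by_product, function(x)
  sapply(users_by_product, function(y) length(union(x, y))))

# The question leaves the diagonal blank, so mark it as NA
diag(or_mat) <- NA
or_mat
```

The nested `sapply` does work proportional to the square of the number of products, so it should stay manageable when the million rows cover a modest number of distinct products.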
Or using outer
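A hedged sketch of what an `outer`-based version could look like for the OR matrix, again assuming the sample `df` from the question. The helper `count_union` and the other variable names are hypothetical. Because `outer` passes whole vectors to its function, the pairwise helper is wrapped in `Vectorize`:

```r
# Sample data from the question
df <- data.frame(Products = c('Product a', 'Product b', 'Product a',
                              'Product c', 'Product b', 'Product c'),
                 Users = c('user1', 'user1', 'user2', 'user1',
                           'user2', 'user3'))

# One vector of unique users per product (illustrative name)
users_by_product <- lapply(split(as.character(df$Users), df$Products), unique)
products <- names(users_by_product)

# Hypothetical helper: number of users who used product i or product j
count_union <- function(i, j) {
  length(union(users_by_product[[i]], users_by_product[[j]]))
}

# outer() calls FUN once with all index pairs, so vectorize the helper
or_mat <- outer(products, products, Vectorize(count_union))
dimnames(or_mat) <- list(products, products)

# Blank out the diagonal, as in the question's expected output
diag(or_mat) <- NA
or_mat
```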
For the second case
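One possible sketch for the second (AND) matrix, assuming the sample `df` from the question; `pairs`, `incidence`, and `and_mat` are illustrative names, and this may differ from the code originally posted. It builds a 0/1 user-by-product incidence matrix and takes its cross-product, so each off-diagonal cell counts the users shared by two products:

```r
# Sample data from the question
df <- data.frame(Products = c('Product a', 'Product b', 'Product a',
                              'Product c', 'Product b', 'Product c'),
                 Users = c('user1', 'user1', 'user2', 'user1',
                           'user2', 'user3'))

# Drop repeated (user, product) rows so every cell of the table is 0 or 1
pairs <- unique(df[, c("Users", "Products")])

# Users in rows, products in columns; unclass() leaves a plain integer matrix
incidence <- unclass(table(pairs$Users, pairs$Products))

# t(incidence) %*% incidence: cell (i, j) = users who used both i and j
and_mat <- crossprod(incidence)

# Blank out the diagonal, as in the question's expected output
diag(and_mat) <- NA
and_mat
```

If both matrices are needed, the OR counts can also be derived from this result by inclusion-exclusion: users of i, plus users of j, minus users of both.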