I've got a problem that I know how to solve using SQL, but I'm looking to implement a solution in R with a new data set. I've been trying to figure out things with the reshape2 package, but I haven't had any luck with what I'm trying to accomplish. Here's my problem:
I have a dataset in which I need to look at all pairs of items that are together from within another group. I've created a toy example below to further explain.
BUNCH FRUITS
1 apples
1 bananas
1 mangos
2 apples
3 bananas
3 apples
4 bananas
4 apples
What I want is a listing of all possible pairs and sum the frequency they occur together within a bunch. My output would ideally look like this:
FRUIT1 FRUIT2 FREQUENCY
APPLES BANANAS 3
APPLES MANGOS 1
My end goal is to make something that I'll eventually be able to import into Gephi for a network analysis. For this I need a Source and Target column (aka FRUIT1 and FRUIT2 above).
The original solution in SQL is here if that would help anyone: PROC SQL in SAS - All Pairs of Items