-->

How to find frequent itemset irrespective of attri

Posted 2019-09-06 02:41

Question:

I have a dataset (a CSV file) in which I want to find frequent itemsets using the Apriori algorithm.

col1, col2, col3
bread, butter,?
coke, bread, butter

I am using WEKA for this purpose. The output is in the following format:

...
Large Itemsets L(2):
col1=bread  col2= butter 1
col1=coke  col2= bread 1
col1=coke  col3= butter 1
col2= bread  col3= butter 1
...

But the output that I want is:

bread, butter 2

Basically, the desired output is independent of the column that the items belong to. How can I achieve this kind of output?

Answer 1:

Format your data differently.

Weka expects each column to stand for one product (the same product in every row), with the value t/f (for true/false) saying whether that row contains it. Then you get results of the kind milk=t -> butter=t.
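As a rough illustration of that reformatting, here is a minimal Python sketch that turns the rows from the question into one column per item. The function name to_item_columns and the choice to write absent items as ? (the ARFF missing-value marker) rather than f are assumptions made for this example, not something Weka requires; pass absent="f" if you want a literal t/f table.

```python
import csv
import io


def to_item_columns(rows, present="t", absent="?"):
    """Turn rows of items into a table with one column per distinct item.

    Hypothetical helper for illustration only.
    """
    # Strip whitespace; drop empty cells and '?' placeholders.
    transactions = [
        {cell.strip() for cell in row if cell.strip() not in ("", "?")}
        for row in rows
    ]
    # Every distinct item becomes a column, in alphabetical order.
    items = sorted(set().union(*transactions)) if transactions else []
    table = [
        [present if item in t else absent for item in items]
        for t in transactions
    ]
    return items, table


# The sample data from the question.
raw = "col1, col2, col3\nbread, butter,?\ncoke, bread, butter\n"
reader = csv.reader(io.StringIO(raw))
next(reader)  # skip the col1, col2, col3 header
items, table = to_item_columns(reader)

print(",".join(items))
for line in table:
    print(",".join(line))
```

The printed table has the columns bread, butter and coke, so an item is identified by its name and no longer by the column it happened to appear in.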

See the .arff examples included with Weka.

I think I saw an ELKI example using your input format.
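To sanity-check the expected result without either tool, the column-independent counts can also be computed directly. The sketch below is plain brute-force counting of item combinations, not Apriori (it does no candidate pruning), and count_itemsets is a name made up for this example; it is only practical for small data.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, size):
    """Count how many transactions contain each itemset of the given size."""
    counts = Counter()
    for transaction in transactions:
        # Sort the distinct items so an itemset has one canonical order,
        # regardless of the column each item came from.
        for combo in combinations(sorted(set(transaction)), size):
            counts[combo] += 1
    return counts


# The two rows from the question, with the '?' placeholder dropped.
transactions = [["bread", "butter"], ["coke", "bread", "butter"]]
pair_counts = count_itemsets(transactions, 2)

for itemset, support in sorted(pair_counts.items()):
    print(", ".join(itemset), support)
```

For the sample data this reports bread, butter with a count of 2, which is the output asked for in the question.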