ORIGINAL TABLE

CELL NUMBER    ACTIVITY    TIME
001            call a      12.23
002            call b      01.00
002            call d      01.09
001            call b      12.25
003            call a      12.23
002            call a      02.07
003            call b      12.25

REQUIRED

I want to mine the most frequently occurring sequences of ACTIVITY from a data set of size 400,000.

For the example above, the output should be:

[call a-12.23, call b-12.25] frequency 2
[call b-01.00, call d-01.09, call a-02.07] frequency 1

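
To make the expected output concrete, here is a minimal Python sketch (not arulesSequences itself) that groups the sample rows by cell number, orders each group by time, and counts how many cells share exactly the same sequence. It only counts whole, identical sequences, not subsequences as a sequence miner would, and the function name `count_sequences` is made up for illustration. It also assumes the zero-padded times sort correctly as strings within one cell.

```python
from collections import Counter, defaultdict

# Sample rows from the table above: (cell number, activity, time).
rows = [
    ("001", "call a", "12.23"),
    ("002", "call b", "01.00"),
    ("002", "call d", "01.09"),
    ("001", "call b", "12.25"),
    ("003", "call a", "12.23"),
    ("002", "call a", "02.07"),
    ("003", "call b", "12.25"),
]


def count_sequences(rows):
    """Group rows by cell, order each group by time, count identical sequences."""
    by_cell = defaultdict(list)
    for cell, activity, time in rows:
        by_cell[cell].append((time, activity))

    counts = Counter()
    for events in by_cell.values():
        # Assumption: zero-padded HH.MM strings sort chronologically within a cell.
        events.sort()
        counts[tuple(f"{activity}-{time}" for time, activity in events)] += 1
    return counts


sequence_counts = count_sequences(rows)
for sequence, frequency in sequence_counts.most_common():
    print(f"[{','.join(sequence)}] frequency {frequency}")
```

Because the time is part of each item here, two cells only match when both the activities and the times are identical, which is what the example output implies.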
I'm aware that this can be achieved using the arulesSequences package. What transformations do I need to carry out on the data set, and how, in order to use arulesSequences?

Current data format: a transaction table with 3 columns, like the sample above.
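
One possible transformation, sketched in Python under several assumptions: as far as I understand, arulesSequences can read a whitespace-separated basket file whose leading columns are sequenceID, eventID and SIZE (via `read_baskets` with `info = c("sequenceID", "eventID", "SIZE")`), with rows ordered by sequenceID and then eventID. Please check this against the package documentation. The helper name `to_basket_lines` is hypothetical, spaces inside items are replaced with underscores so an item is not split in two, and the eventID is simply the position of the event within its cell.

```python
from collections import defaultdict

# Sample rows from the table above: (cell number, activity, time).
rows = [
    ("001", "call a", "12.23"),
    ("002", "call b", "01.00"),
    ("002", "call d", "01.09"),
    ("001", "call b", "12.25"),
    ("003", "call a", "12.23"),
    ("002", "call a", "02.07"),
    ("003", "call b", "12.25"),
]


def to_basket_lines(rows):
    """Return lines of the form 'sequenceID eventID SIZE item', one event per line."""
    by_cell = defaultdict(list)
    for cell, activity, time in rows:
        by_cell[cell].append((time, activity))

    lines = []
    # Assumption: cell numbers are numeric, so they can serve as integer sequence IDs.
    for cell in sorted(by_cell, key=int):
        # Assumption: zero-padded HH.MM strings sort chronologically within a cell.
        events = sorted(by_cell[cell])
        for event_id, (time, activity) in enumerate(events, start=1):
            # Each event holds a single item, so SIZE is always 1.
            item = f"{activity.replace(' ', '_')}-{time}"
            lines.append(f"{int(cell)} {event_id} 1 {item}")
    return lines


basket_lines = to_basket_lines(rows)
print("\n".join(basket_lines))
```

Writing these lines to a text file would give one row per event. Dropping the time from the item (keeping only `call_a`, `call_b`, ...) would make sequences match on activity alone, which may be closer to what a sequence miner is normally used for.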