Plot Event sequences / Event Sequences Clustering

Posted 2019-01-24 23:22

Question:

Perhaps this is a very dull question, but I did my research on it and couldn't find an answer.

I want to plot my event sequences the same way we plot state sequences with seqIplot, seqfplot, seqdplot and seqmtplot, i.e. with seqplot in general.

When I try to do so, I get the message:

Error: data is not a sequence object, use seqdef function to create one

This leads me to believe that those functions only apply to state sequences.

In the user's guide, from section 10 onward, they only give examples of plots of subsequences, which are not quite the same thing.

1) Is there a way to generate the mentioned plots for event sequences, with the transitions playing the role of states?

Also, when I try to compute a distance matrix, I get a similar error:

Error:  [!] data is not a state sequence object, use 'seqdef' function to create one

2) Is it possible to compute distance matrices for event sequences and then apply clustering methods to them?

Thanks!

Answer 1:

You are right. The seqplot family of functions is for state sequences only.

To plot event sequences as state sequences, you have to first transform them into state sequences.

Assuming your event sequences are in the TSE format (vertical time-stamped event form), like the actcal.tse example file provided by TraMineR, you can convert them into state sequences using TSE_to_STS from the companion TraMineRextras package.

For the transformation, you have to specify which state you are in after each event. You do that by creating a transformation matrix with the seqe2stm function. Each cell of that matrix should give the new state that results when the column event (column name) occurs while we are in the corresponding row state (row name).

To illustrate, here is the example from the help page of TSE_to_STS:

library(TraMineR)
library(TraMineRextras)

data(actcal.tse)
events <- c("PartTime", "NoActivity", "FullTime", "LowPartTime")

## States defined by last occurred event (forgetting all previous events).
stm <- seqe2stm(events, dropList=list("PartTime"=events[-1],
           NoActivity=events[-2], FullTime=events[-3],
           LowPartTime=events[-4]))

mysts <- TSE_to_STS(actcal.tse[1:100,], id=1, timestamp=2, event=3,
           stm=stm, tmin=1, tmax=12, firstState="None")
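
To make the idea of the transformation matrix concrete, here is a minimal base R sketch of a "last event wins" rule built by hand. It is an illustration only: the name stm_sketch and the state labels are assumptions, and the matrix actually returned by seqe2stm may differ in labels and layout.

```r
# Hypothetical illustration in base R only; this is not seqe2stm output.
events <- c("PartTime", "NoActivity", "FullTime", "LowPartTime")
states <- c("None", events)

# Rows are current states, columns are events; each cell is the resulting state.
# Under a "last event wins" rule, the new state is simply the event's name,
# so every cell in a column holds that column's event.
stm_sketch <- matrix(rep(events, each = length(states)),
                     nrow = length(states),
                     dimnames = list(states, events))

# State reached when "FullTime" occurs while in state "PartTime".
stm_sketch["PartTime", "FullTime"]
```

Reading a cell by row name and column name mirrors the description above: the row is the state you are in, the column is the event that occurs.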

Once you have your state sequences in STS form, you can create the state sequence object and plot them.

my.seq <- seqdef(mysts)
seqdplot(my.seq)

Alternatively, you can make a parallel coordinate plot of your event sequences using the seqpcplot function. There are plenty of examples on the help page of that function. For details on the plot, refer to

Bürgin, R. & Ritschard, G. (2014), "A decorated parallel coordinate plot for categorical longitudinal data", The American Statistician. Vol. 68(2), pp. 98-103.

Hope this helps.



Answer 2:

Regarding the clustering of event sequences, you can use the seqedist function of the companion TraMineRextras package.

library(TraMineR)
library(TraMineRextras)

data(actcal.tse)
actcal.seqe <- seqecreate(actcal.tse[1:200,])[1:6,]
## We have 8 different events in this dataset
idcost <- rep(1, 8)
dd <- seqedist(actcal.seqe, idcost=idcost, vparam=.1)
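
Once a dissimilarity matrix is available, standard clustering methods from base R can be applied to it. The sketch below uses a small made-up matrix, dd_toy, as a stand-in so that it runs without TraMineR. Substituting the dd computed above assumes that seqedist returns a square symmetric matrix that as.dist accepts.

```r
# Toy symmetric dissimilarity matrix standing in for `dd` (values are made up).
dd_toy <- matrix(c(0, 1, 4, 5,
                   1, 0, 4, 5,
                   4, 4, 0, 1,
                   5, 5, 1, 0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# Ward hierarchical clustering on the dissimilarities, cut into two groups.
hc <- hclust(as.dist(dd_toy), method = "ward.D2")
groups <- cutree(hc, k = 2)

# Number of sequences in each cluster.
table(groups)
```

The choice of Ward's method and of two clusters is arbitrary here; other linkage methods or a partitioning method could be used in the same way.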

For an explanation of the distance, you can look at the paper

Ritschard, G., Bürgin, R. & Studer, M. (2013), "Exploratory Mining of Life Event Histories", In McArdle, J.J. & Ritschard, G. (eds) Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences. Series: Quantitative Methodology, pp. 221-253. New York: Routledge.



Tags: r traminer