I know there is a lot of material on hidden Markov models, and I have read the questions and answers related to this topic. I understand how an HMM works and how it can be trained; however, I cannot solve the following problem, which I run into when training one for a simple dynamic gesture.
I am using an HMM implementation for OpenCV. I have looked into previously asked questions and answers here, which really helped me understand and use Markov models.
I have a total of two dynamic gestures, which mirror each other (swipe left and swipe right). There are 5 observation symbols in total: 4 are the different stages of a gesture (1 to 4), and the fifth (0 in the data below) is the observation when none of these stages is occurring.
The swipe left gesture consists of the observations 1->2->3->4 (which should trigger a swipe left state). Likewise, the swipe right gesture consists of the observations 4->3->2->1.
I have 25 sequences of 20 observations each, which are used to train the hidden Markov model with the Baum-Welch algorithm.
The following are the input sequences:
1 0 1 1 0 2 2 2 2 0 0 2 3 3 3 0 0 4 4 4
4 4 4 4 4 0 3 3 3 3 3 0 0 1 0 0 1 1 0 1
4 4 4 4 4 4 0 3 3 3 3 3 0 0 1 0 0 1 1 0
4 4 4 4 4 4 4 0 3 3 3 3 3 0 0 1 0 0 1 1
1 1 1 1 0 2 2 2 0 1 0 3 3 0 0 0 4 4 4 4
1 1 1 1 1 0 2 2 2 0 1 0 3 3 0 0 0 4 4 4
0 1 1 1 1 1 0 2 2 2 0 1 0 3 3 0 0 0 4 4
0 0 1 1 1 1 1 0 2 2 2 0 1 0 3 3 0 0 0 4
4 4 0 0 3 0 3 3 3 3 0 0 0 0 0 1 1 1 1 1
4 4 4 0 0 3 0 3 3 3 3 0 0 0 0 0 1 1 1 1
4 4 4 4 0 0 3 0 3 3 3 3 0 0 0 0 0 1 1 1
1 1 1 1 0 0 2 2 0 3 2 3 3 3 0 0 4 4 4 4
1 1 1 1 1 0 0 2 2 0 3 2 3 3 3 0 0 4 4 4
1 1 1 1 1 1 0 0 2 2 0 3 2 3 3 3 0 0 4 4
1 3 4 4 4 0 3 0 0 0 0 0 3 2 0 0 1 1 1 1
In these sequences you can see the patterns for the swipe left and swipe right gestures.
To train the hidden Markov model, I initialize it with the following values and then call the train function to get the output:
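To make the pattern easier to see, the sketch below drops the null observation (0) and collapses consecutive repeats, so each 20-sample row reduces to its sequence of stages. This is only an illustration of the data, not part of the HMM; the function name `collapseObservations` is hypothetical.

```cpp
#include <vector>

// Drop the null symbol (0) and merge runs of the same stage into one entry.
// Example: 1 0 1 1 0 2 2 -> 1 2
std::vector<int> collapseObservations(const std::vector<int>& obs)
{
    std::vector<int> stages;
    for (int symbol : obs) {
        if (symbol == 0) {
            continue;  // null observation carries no stage information
        }
        if (stages.empty() || stages.back() != symbol) {
            stages.push_back(symbol);
        }
    }
    return stages;
}
```

Applied to the first row above this gives 1 2 3 4, the swipe left pattern. Applied to the second row it gives 4 3 1, so in that recording stage 2 was never observed.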
TRANS:
0.7 0.15 0.15
0.3 0.4 0.3
0.3 0.4 0.3
EMIS:
0.3 0.1 0.1 0.1 0.1
0.2 0.1 0.2 0.2 0.3
0.2 0.3 0.2 0.2 0.1
INIT:
0.6 0.2 0.2
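A quick sanity check on these guesses: every row of TRANS, EMIS and INIT is meant to be a probability distribution, so each row should sum to 1. The sketch below is plain C++ with hypothetical names (`Matrix`, `nonStochasticRows`) and does not use the HMM library; whether the library renormalizes its inputs is something I have not verified.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Return the indices of all rows whose entries do not sum to 1 (within tol).
std::vector<std::size_t> nonStochasticRows(const Matrix& m, double tol = 1e-9)
{
    std::vector<std::size_t> bad;
    for (std::size_t r = 0; r < m.size(); ++r) {
        double sum = 0.0;
        for (double p : m[r]) {
            sum += p;
        }
        if (std::fabs(sum - 1.0) > tol) {
            bad.push_back(r);
        }
    }
    return bad;
}
```

With the values above, the TRANS and INIT rows pass, but the first EMIS row (0.3 0.1 0.1 0.1 0.1) sums to 0.7, so the check reports row 0 of EMIS.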
After training the output is:
TRANS:
0.81611 0.0847926 0.0990979
0.398458 0.346433 0.255109
0.371391 0.35587 0.272739
EMIS:
0.534127 0.125568 0.0824495 0.200169 0.0576869
0.294653 0.0250053 0.0500311 0.200616 0.429694
0.238808 0.075001 0.0500019 0.130455 0.505733
INIT:
0.443984 0.391323 0.164693
Using this model in my recognition program, I am not getting the expected results. I want the system to remain in a NULL state unless one of the gestures is detected. In the transition and emission matrices I gave my guess values for both of these gestures.
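To make the intended behaviour concrete: one way to read "remain in a NULL state unless a gesture is detected" is to decode the most likely hidden-state path for a window of observations and look at which state it ends in. The sketch below is a plain C++ Viterbi decoder written for illustration only. It does not use the OpenCV HMM library, the names `Matrix` and `viterbiPath` are hypothetical, and which state index plays the NULL role is an assumption, since Baum-Welch itself does not assign a meaning to the states.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Most likely hidden-state path for obs under (trans, emis, init).
// Works in log space so long sequences do not underflow.
std::vector<int> viterbiPath(const std::vector<int>& obs, const Matrix& trans,
                             const Matrix& emis, const std::vector<double>& init)
{
    const std::size_t T = obs.size();
    const std::size_t N = init.size();
    assert(trans.size() == N && emis.size() == N);
    if (T == 0) {
        return {};
    }

    Matrix score(T, std::vector<double>(N));
    std::vector<std::vector<int>> back(T, std::vector<int>(N, 0));

    for (std::size_t s = 0; s < N; ++s) {
        score[0][s] = std::log(init[s]) + std::log(emis[s][obs[0]]);
    }

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t s = 0; s < N; ++s) {
            double best = -std::numeric_limits<double>::infinity();
            int arg = 0;
            for (std::size_t p = 0; p < N; ++p) {
                const double cand = score[t - 1][p] + std::log(trans[p][s]);
                if (cand > best) {
                    best = cand;
                    arg = static_cast<int>(p);
                }
            }
            score[t][s] = best + std::log(emis[s][obs[t]]);
            back[t][s] = arg;  // best predecessor of state s at time t
        }
    }

    // Pick the best final state, then follow the back-pointers.
    int last = 0;
    for (std::size_t s = 1; s < N; ++s) {
        if (score[T - 1][s] > score[T - 1][last]) {
            last = static_cast<int>(s);
        }
    }
    std::vector<int> path(T);
    path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t) {
        path[t - 1] = back[t][path[t]];
    }
    return path;
}
```

Running this on a training row with the trained TRANS, EMIS and INIT would show which hidden state the model assigns to each observation, which is a direct way to see whether any state has actually learned to represent a gesture.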
What do you think I might be doing wrong? Any pointers or help?
Lastly, here is the code I am using (if anyone wants to have a look):
// Initial guesses for the transition, emission and initial-state probabilities.
// seq (the observation matrix) and hmm are declared earlier (not shown).
double TRGUESSdata[] = {0.7, 0.15, 0.15,
                        0.3, 0.4,  0.3,
                        0.3, 0.4,  0.3};
cv::Mat TRGUESS = cv::Mat(3, 3, CV_64F, TRGUESSdata).clone();
double EMITGUESSdata[] = {0.3, 0.1, 0.1, 0.1, 0.1,
                          0.2, 0.1, 0.2, 0.2, 0.3,
                          0.2, 0.3, 0.2, 0.2, 0.1};
cv::Mat EMITGUESS = cv::Mat(3, 5, CV_64F, EMITGUESSdata).clone();
double INITGUESSdata[] = {0.6, 0.2, 0.2};
cv::Mat INITGUESS = cv::Mat(1, 3, CV_64F, INITGUESSdata).clone();

std::cout << seq.rows << " " << seq.cols << std::endl;

// Read one observation symbol per matrix element and echo it to the console.
int a = 0;
std::ifstream fin;
fin.open("observations.txt");
for (int y = 0; y < seq.rows; y++)
{
    for (int x = 0; x < seq.cols; x++)
    {
        fin >> a;
        seq.at<int>(y, x) = a;
        std::cout << a;
    }
    std::cout << std::endl;
}

// Print the model before and after training to compare guess and result.
hmm.printModel(TRGUESS, EMITGUESS, INITGUESS);
hmm.train(seq, 1000, TRGUESS, EMITGUESS, INITGUESS);
hmm.printModel(TRGUESS, EMITGUESS, INITGUESS);
Here fin is used to read the observations produced by my other code.