I am trying to train a GMM-UBM model for emotion recognition with SIDEKIT, using features that I have already extracted (the setup is pretty much the same as for speaker recognition; I also don't understand the HDF5 feature file system). My data is an ndarray with shape (1101, 78): 78 is the number of acoustic features and 1101 is the number of feature vectors (frames).
ubm = sidekit.Mixture()
llks = ubm.EM_uniform(anger, distribNb, iteration_min=3, iteration_max=10, llk_gain=0.01, do_init=True)
The error that is thrown is:
line 394, in _compute_all
self.A = (numpy.square(self.mu) * self.invcov).sum(1) - 2.0 * (numpy.log(self.w) + numpy.log(self.cst))
ValueError: operands could not be broadcast together with shapes (512,78) (512,0)
which suggests that the inverse covariance matrix (self.invcov) has shape (512, 0). Is that wrong? Should it be (512, 78)? I may be mistaken, so any hint would be appreciated.
You might have figured it out already, but I thought I might as well post a possible solution to this.
The following code creates random data with dimensions (2, 100) and tries to train a 128-mixture GMM using the EM_uniform algorithm:
However, this results in the same error as you have reported: ValueError: operands could not be broadcast together with shapes (128,100) (128,0)
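The shape mismatch itself can be reproduced without SIDEKIT. The snippet below is a minimal NumPy-only sketch: the variable names are mine, and the empty second axis of `invcov_bad` is an assumption about what the library ends up with, inferred only from the shapes in the error message.

```python
import numpy as np

n_components, n_features = 128, 100

# Means as in the error message: one row per mixture component.
mu = np.zeros((n_components, n_features))

# Assumed faulty state: the inverse covariance has an empty second axis.
invcov_bad = np.ones((n_components, 0))
# Expected state for a diagonal-covariance GMM: same shape as the means.
invcov_ok = np.ones((n_components, n_features))

try:
    np.square(mu) * invcov_bad
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# With matching shapes the product works and sums to one value per component.
ok_shape = (np.square(mu) * invcov_ok).sum(1).shape
print(ok_shape)
```

So the expression from the traceback only works when `invcov` has one entry per component and per feature, i.e. the same shape as `mu`.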
I suspect there is some bug in how gmm.invcov is calculated in Sidekit.Mixture._init_uniform(), so I have figured out a manual initialization of the mixture with code from Sidekit.Mixture._init() (the initialization function for the EM_split()-algorithm).
The following code ran without errors on my computer:
This gave the following output: [-31.419146414931213, 54.759037708692404, 54.759037708692404, 54.759037708692404], which are the log-likelihood values after each iteration (convergence after 4 iterations). Do note that this example data is far too small to train a GMM on.
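To make clear which shapes a manual initialization has to produce, here is a NumPy-only sketch. It is not SIDEKIT code and not a copy of Sidekit.Mixture._init(): the helper names `init_diag_gmm` and `compute_A` are hypothetical, and the initialization strategy (global variance for every component, slightly perturbed global mean) is just one simple assumption. Only the last line of `compute_A` mirrors the expression from the traceback.

```python
import numpy as np

def init_diag_gmm(features, n_components, seed=0):
    """Hypothetical helper: initialise a diagonal-covariance GMM from data."""
    rng = np.random.default_rng(seed)
    n_frames, dim = features.shape
    global_mean = features.mean(axis=0)
    global_var = features.var(axis=0) + 1e-6  # small floor avoids division by zero
    # Assumption: each component starts near the global mean with a small random offset.
    offsets = rng.standard_normal((n_components, dim))
    mu = global_mean + 0.1 * np.sqrt(global_var) * offsets
    # Every component gets the global inverse variance: shape (n_components, dim).
    invcov = np.tile(1.0 / global_var, (n_components, 1))
    # Uniform mixture weights.
    w = np.full(n_components, 1.0 / n_components)
    return w, mu, invcov

def compute_A(w, mu, invcov):
    """Per-component constant, following the expression shown in the traceback."""
    dim = mu.shape[1]
    # Log of the Gaussian normalisation constant for a diagonal covariance.
    log_cst = 0.5 * np.log(invcov).sum(axis=1) - 0.5 * dim * np.log(2.0 * np.pi)
    return (np.square(mu) * invcov).sum(axis=1) - 2.0 * (np.log(w) + log_cst)

# Random stand-in data with the shape from the question: 1101 frames, 78 features.
features = np.random.default_rng(1).standard_normal((1101, 78))
w, mu, invcov = init_diag_gmm(features, n_components=512)
A = compute_A(w, mu, invcov)
print(mu.shape, invcov.shape, A.shape)  # (512, 78) (512, 78) (512,)
```

If the mixture object ends up with `mu` and `invcov` of the same (components, features) shape, as here, the broadcast in `_compute_all` goes through.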
I cannot guarantee that this does not lead to errors later on, so leave a comment if that is the case!
As for HDF5 files, check out the h5py documentation for tutorials. Also, HDFView allows you to look into the contents of the h5 files, which is pretty convenient for debugging later on when you get to scoring.
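As a starting point, this is a minimal h5py sketch that writes a feature matrix to an HDF5 file, lists what is inside, and reads it back. The dataset name "anger/features" and the file name are made up for illustration and are not the layout that SIDEKIT expects for its feature files.

```python
import os
import tempfile

import h5py
import numpy as np

# Random stand-in for an extracted feature matrix: 1101 frames, 78 features.
features = np.random.default_rng(0).standard_normal((1101, 78)).astype(np.float32)

# Write to a temporary location so the example leaves the working directory untouched.
path = os.path.join(tempfile.mkdtemp(), "features.h5")

with h5py.File(path, "w") as f:
    # Hypothetical group/dataset name; intermediate groups are created automatically.
    f.create_dataset("anger/features", data=features)

names = []
with h5py.File(path, "r") as f:
    # visit() calls the function with the name of every group and dataset in the file.
    f.visit(names.append)
    # Indexing with [()] loads the whole dataset into a NumPy array.
    loaded = f["anger/features"][()]

print(names)
print(loaded.shape, loaded.dtype)
```

Listing the names this way gives roughly the same overview as opening the file in HDFView, which helps when checking what a tool has actually written.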
What's the content of the parameter 'feature_list' that the sidekit.UBM takes?