I have a conceptual problem.
I know what the mel scale is and what it represents, and I know that a mel spectrogram still contains more information than I need.
As I understand it, MFCCs are used when we want to reduce the amount of information in the spectrogram.
But I really don't get what the MFCC is or what it represents. I use an MFCC matrix in a speech recognition pipeline, but I don't understand what the numbers inside it represent.
The array is 13x130, and I don't know what all these floats mean. I understand that the longer my audio track is, the bigger the matrix gets (e.g. 13x250, 13x400).
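
To make the shape question concrete, here is a minimal NumPy sketch of how I understand such a matrix is usually produced: split the audio into overlapping frames, take the power spectrum of each frame, pool it into mel bands, take the log, and keep the first 13 DCT coefficients. The sample rate (22050 Hz), frame length (2048), hop (512), number of mel bands (40), and all function names are assumptions I picked for illustration, not values from my actual pipeline.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centers are equally spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfcc(signal, sr=22050, n_fft=2048, hop=512, n_mels=40, n_mfcc=13):
    # Pad so that frames are centered on multiples of the hop (assumed convention).
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = padded[idx] * np.hanning(n_fft)             # (n_frames, n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2     # (n_frames, n_fft//2 + 1)
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T  # (n_frames, n_mels)
    log_mel = np.log(mel_energy + 1e-10)
    # Unnormalized DCT-II over the mel axis; keep only the first n_mfcc coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))  # (n_mfcc, n_mels)
    return dct_basis @ log_mel.T                          # (n_mfcc, n_frames)

sr = 22050
short = np.sin(2 * np.pi * 440.0 * np.arange(3 * sr) / sr)  # 3 s test tone
long_ = np.sin(2 * np.pi * 440.0 * np.arange(6 * sr) / sr)  # 6 s test tone
print(mfcc(short).shape)  # (13, 130)
print(mfcc(long_).shape)
```

If this sketch is right, each column is one short time frame and each row is one coefficient index, which would explain why the first dimension stays at 13 while the second grows with the length of the audio. What I still don't understand is what the 13 values within a single column actually describe.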
I hope I have made myself clear.