I am trying out this method as a regularized regression, as an alternative to lasso and elastic net. I have 40k data points and 40 features. Lasso selects 5 features, and orthogonal matching pursuit selects only 1.
What could be causing this? Am I using OMP the wrong way? Perhaps it is not meant to be used as a regression. Please let me know if you can think of anything else I may be doing wrong.
Orthogonal Matching Pursuit seems a bit broken, or at least very sensitive to input data, as implemented in scikit-learn.
Example:
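A minimal sketch of this kind of comparison, assuming synthetic data from `make_regression`. The sample size, noise level, `alpha`, and seed are illustrative assumptions, not the setup from the question. Note that when `n_nonzero_coefs` is left at `None` (and `tol` is also `None`), scikit-learn caps OMP at 10% of `n_features`, with a minimum of 1, so it is worth setting it explicitly before concluding that OMP picks too few features:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, OrthogonalMatchingPursuit

# Illustrative synthetic problem (assumed sizes): 40 features, 5 informative.
X, y, true_coef = make_regression(
    n_samples=1000,
    n_features=40,
    n_informative=5,
    noise=1.0,
    coef=True,
    random_state=0,
)

# Lasso: sparsity is controlled indirectly through the penalty strength alpha.
lasso = Lasso(alpha=1.0).fit(X, y)

# OMP: sparsity is controlled directly; n_nonzero_coefs is an upper bound
# on the number of selected features.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=5).fit(X, y)

true_support = np.flatnonzero(true_coef)
lasso_support = np.flatnonzero(lasso.coef_)
omp_support = np.flatnonzero(omp.coef_)

print("true support: ", true_support)
print("lasso support:", lasso_support)
print("omp support:  ", omp_support)
```

Comparing the printed supports shows whether the two estimators agree on which features matter. What they print depends on the data and on the scikit-learn version, so treat this as a diagnostic, not as evidence either way.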
Fun experiments:
There are a bunch of canned datasets in `sklearn.datasets`. Does OMP fail on all of them? Apparently, it works okay on the diabetes dataset... Is there any combination of parameters to `make_regression` that would generate data that OMP works for? Still looking for that one... 100 x 100 and 100 x 10 fail in the same way.
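One way to run these experiments side by side is sketched below. The helper `omp_report` is a hypothetical name introduced here, and the train/test split and seeds are assumptions. It fits OMP with its default settings on the diabetes data and on `make_regression` data of the two shapes mentioned above, then records the held-out R² and the number of selected features:

```python
from sklearn.datasets import load_diabetes, make_regression
from sklearn.linear_model import OrthogonalMatchingPursuit
from sklearn.model_selection import train_test_split


def omp_report(X, y, n_nonzero_coefs=None, seed=0):
    """Hypothetical helper: fit OMP, return (held-out R^2, number of selected features)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    model = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero_coefs)
    model.fit(X_tr, y_tr)
    n_selected = int((model.coef_ != 0).sum())
    return model.score(X_te, y_te), n_selected


results = {}

# Canned dataset bundled with scikit-learn (no download needed).
X, y = load_diabetes(return_X_y=True)
results["diabetes"] = omp_report(X, y)

# Synthetic data with the two shapes mentioned above.
for n_samples, n_features in [(100, 100), (100, 10)]:
    X, y = make_regression(
        n_samples=n_samples, n_features=n_features, random_state=0
    )
    results[f"make_regression {n_samples}x{n_features}"] = omp_report(X, y)

for name, (r2, n_selected) in results.items():
    print(f"{name}: R^2={r2:.3f}, selected features={n_selected}")
```

Passing a larger `n_nonzero_coefs` to `omp_report`, or varying `n_informative` and `noise` in `make_regression`, is a simple way to probe which settings change the outcome.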