I have built a support vector machine with scikit-learn for a basic handwritten digit detection problem.
My full data set consists of 235 observations, each with 1025 features. I understand that one advantage of a support vector machine is that it suits situations like this, where there is a modest number of observations and each one has a large number of features.
After my SVM is created, I look at my confusion matrix (below)...
Confusion Matrix:
[[ 6  0]
 [ 0 30]]
...and realize that holding out 15% of my data for testing (i.e., 36 observations) is not enough.
My question is this: how can I use cross-validation to work around having so little data?
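
To make the question concrete, here is a minimal sketch of the kind of thing I have in mind: stratified k-fold cross-validation, so that every observation is used for testing exactly once and the confusion matrix covers all 235 observations instead of 36. The arrays `X` and `y` are random placeholders with the same shape as my data, and the class split, the linear kernel, `C=1.0`, and the choice of 5 folds are all assumptions for illustration, not my actual settings.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

# Placeholder data with the same shape as my data set (235 x 1025).
# The 40/195 class split is an assumption, not my real label counts.
rng = np.random.default_rng(0)
X = rng.normal(size=(235, 1025))
y = np.array([0] * 40 + [1] * 195)

# Stratified folds keep the class ratio roughly equal in every fold,
# which matters when one class is much smaller than the other.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Kernel and C are assumed values; they would normally be tuned.
clf = SVC(kernel="linear", C=1.0)

# Each observation is predicted by a model that never saw it in training.
y_pred = cross_val_predict(clf, X, y, cv=cv)

# Confusion matrix over all 235 out-of-fold predictions.
cm = confusion_matrix(y, y_pred)
print(cm)
```

If this is a reasonable approach, the resulting matrix would be based on out-of-fold predictions for the whole data set, which should give a less optimistic picture than a single 36-observation hold-out.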