I am performing my analysis in R and will be implementing four algorithms:
1. Random forest (RF)
2. Logistic regression
3. SVM
4. LDA
I have 50 predictors and 1 target variable. All of my predictors and the target variable are binary, taking only the values 0 and 1.
I have the following questions:
Should I convert them all into factors?
Converting them into factors and applying the RF algorithm gives 100% accuracy, which I find very surprising.
Also, for the other algorithms, how should I treat my variables before feeding them in?
Thanks
If your variables / predictors are categorical, then it is best to convert them to factors. Otherwise, they will likely be treated as numerical values.
If you are doing a classification task, then it is best to have the target / response variable as a factor as well.
It is also worth checking the documentation of the functions you use to make sure they do not convert factors to numerical values.
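As a rough illustration of the conversion, here is a minimal base R sketch. The data frame and its column names (x1, x2, x3, target) are made up for the example and only stand in for your 50 predictors and target; it converts every column to a factor and then checks what a formula-based function would actually see.

```r
# Hypothetical data: 3 binary predictors and a binary target (names are made up).
set.seed(1)
df <- data.frame(
  x1 = rbinom(20, 1, 0.5),
  x2 = rbinom(20, 1, 0.5),
  x3 = rbinom(20, 1, 0.5),
  target = rbinom(20, 1, 0.5)
)

# Before conversion, every column is stored as an integer.
classes_before <- sapply(df, class)
print(classes_before)

# Convert every column (predictors and target) to a factor with levels 0 and 1.
df[] <- lapply(df, function(col) factor(col, levels = c(0, 1)))

# After conversion, every column should report class "factor".
classes_after <- sapply(df, class)
print(classes_after)

# Design matrix as built from a formula: each two-level factor
# is expanded into a single 0/1 dummy column next to the intercept.
mm <- model.matrix(target ~ ., data = df)
print(colnames(mm))
```

With the default treatment contrasts, a two-level factor becomes one 0/1 dummy column, so for binary predictors the conversion changes little in formula-based models. It matters most for the target, since that is what tells many functions to do classification rather than regression.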
Use AdaBoost.
Take a look at the Kaggle kernels, especially those for the Mercedes competition, to get an idea of how to implement AdaBoost.
https://www.kaggle.com/c/mercedes-benz-greener-manufacturing/kernels
That dataset is a mix of numerical variables, factors, and 0/1 variables.