I am new to decision trees. I am planning to build a large decision tree that I would like to update later with additional data. What is the best approach to this? Can any decision tree be updated later?
Tags: decision-tree
Decision trees are most often trained on all available data. That is, when new data arrives, you retrain the entire tree on the old and new data combined. Since training is very fast, this is usually not a problem. If the data is too big to fit in memory, you can often get around that by subsampling rows of the training set, since tree-based models don't need that much data to give good results.
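As a rough illustration, here is a minimal retraining sketch with scikit-learn; the data arrays and the `max_depth` setting are made up for the example:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Initial training data (made-up values)
X_old = np.array([[0, 1], [1, 1], [2, 0], [3, 1]])
y_old = np.array([0, 0, 1, 1])

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_old, y_old)

# New data arrives later: retrain the whole tree on old + new data
X_new = np.array([[4, 0], [5, 1]])
y_new = np.array([1, 0])

X_all = np.vstack([X_old, X_new])
y_all = np.concatenate([y_old, y_new])

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_all, y_all)  # full retrain; fast for tree models
```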
Note that single decision trees are quite vulnerable to overfitting, so you should consider a Random Forest or another ensemble method. With bagging, each tree is trained on a different bootstrap sample of the data (see the sketch below).
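To make the bagging idea concrete, here is a hand-rolled sketch that trains several trees on bootstrap samples and combines them by majority vote; in practice you would just use `sklearn.ensemble.RandomForestClassifier`. The toy data and the number of trees are arbitrary:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy data (arbitrary values, for illustration only)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

n_trees = 10
trees = []
for _ in range(n_trees):
    # Bootstrap sample: draw rows with replacement
    idx = rng.integers(0, len(X), size=len(X))
    t = DecisionTreeClassifier(max_depth=4)
    t.fit(X[idx], y[idx])
    trees.append(t)

# Combine the trees by majority vote
preds = np.stack([t.predict(X) for t in trees])
majority = (preds.mean(axis=0) >= 0.5).astype(int)
print("training accuracy:", (majority == y).mean())
```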
There are also incremental and online learning methods for decision trees; incremental variants of ID3 (e.g. ID5R) and the Hoeffding tree (VFDT) learner are some examples.
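Here is a minimal online-learning sketch, assuming the third-party `river` library and its Hoeffding tree classifier with the `learn_one`/`predict_one` interface; the feature names and stream values below are invented:

```python
# Assumes the third-party `river` library (pip install river), which
# provides a Hoeffding tree (VFDT-style) classifier for data streams.
from river import tree

model = tree.HoeffdingTreeClassifier()

# A toy stream of (features, label) pairs; in practice this would be
# an unbounded data stream read one example at a time.
stream = [
    ({"x1": 0.2, "x2": 1.0}, 0),
    ({"x1": 0.9, "x2": 0.1}, 1),
    ({"x1": 0.4, "x2": 0.8}, 0),
]

for x, y in stream:
    y_pred = model.predict_one(x)   # predict before learning (prequential style)
    model.learn_one(x, y)           # update the tree with one example
```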