Is Error-Based Pruning Redeemable?
Document Type
Article
Publication Date
2003
Keywords
Decision tree, pruning, error based pruning, reduced error pruning
Digital Object Identifier (DOI)
https://doi.org/10.1142/S0218213003001228
Abstract
Error-based pruning prunes a decision tree without requiring separate validation data, and it is implemented in the widely used C4.5 decision tree software. It relies on a parameter, the certainty factor, that affects the size of the pruned tree. Several researchers have compared error-based pruning with other approaches and reported results suggesting that it produces larger trees with no gain in accuracy. They further suggest that as more data are added to the training set, the tree produced by error-based pruning continues to grow even though accuracy does not improve. It appears that these results were obtained with the default certainty factor value. Here, we show that varying the certainty factor allows significantly smaller trees to be obtained with minimal or no loss of accuracy. Also, the growth of tree size with added data can be halted by an appropriate choice of certainty factor. Methods for determining the certainty factor are discussed for both small and large data sets. Experimental results support the conclusion that error-based pruning can produce appropriately sized trees with good accuracy when compared with reduced-error pruning.
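The certainty factor discussed in the abstract enters C4.5's error-based pruning through a pessimistic (upper-bound) estimate of a leaf's error rate. A minimal sketch of that estimate, assuming the standard normal approximation to the binomial upper confidence bound (the function name and the use of Python's stdlib `statistics.NormalDist` are our illustrative choices, not the paper's code):

```python
from statistics import NormalDist

def pessimistic_error(errors, n, cf=0.25):
    """Upper confidence bound on the true error rate of a leaf that
    covers `n` training examples and misclassifies `errors` of them.

    `cf` is the certainty factor: a smaller cf gives a larger z and a
    more pessimistic bound, so subtrees are pruned more aggressively.
    C4.5's default is cf = 0.25. This sketch uses the normal
    approximation to the binomial confidence limit.
    """
    f = errors / n                       # observed error rate
    z = NormalDist().inv_cdf(1.0 - cf)   # one-sided upper quantile
    num = f + z * z / (2 * n) + z * (f / n - f * f / n
                                     + z * z / (4 * n * n)) ** 0.5
    return num / (1.0 + z * z / n)

# A subtree is replaced by a leaf when the leaf's estimated errors
# (n * pessimistic_error) do not exceed the sum of its children's.
default = pessimistic_error(2, 14, cf=0.25)   # moderate pessimism
strict = pessimistic_error(2, 14, cf=0.10)    # lower cf, more pruning
```

Lowering `cf` inflates the estimated error of every node, which makes collapsing a subtree into a leaf look relatively cheaper; this is the mechanism by which the certainty factor controls pruned tree size.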
Was this content written or created while at USF?
Yes
Citation / Publisher Attribution
International Journal on Artificial Intelligence Tools, v. 12, issue 3, p. 249-264
Scholar Commons Citation
Hall, Lawrence O.; Bowyer, Kevin W.; Banfield, Robert E.; Eschrich, Steven; and Collins, Richard, "Is Error-Based Pruning Redeemable?" (2003). Computer Science and Engineering Faculty Publications. 141.
https://digitalcommons.usf.edu/esb_facpub/141