ENDER: A Statistical Framework for Boosting Decision Rules
Data Mining and Knowledge Discovery, Volume 21, Issue 1, p. 59–90
Induction of decision rules plays an important role in machine learning. The main advantage of decision rules is their simplicity and human-interpretable form. Moreover, they are capable of modeling complex interactions between attributes. In this paper, we thoroughly analyze a learning algorithm, called ENDER, which constructs an ensemble of decision rules. This algorithm is tailored for regression and binary classification problems. It uses the boosting approach for learning, which can be treated as a generalization of sequential covering. Each new rule is fitted by focusing on the examples that were hardest to classify correctly by the rules already present in the ensemble. We consider different loss functions and minimization techniques often encountered in the boosting framework. The minimization techniques are used to derive impurity measures which control the construction of single decision rules. Properties of four different impurity measures are analyzed with respect to the trade-off between misclassification (discrimination) and coverage (completeness) of the rule. Moreover, we consider regularization consisting of shrinking and sampling. Finally, we compare the ENDER algorithm with other well-known decision rule learners such as SLIPPER, LRI and RuleFit.
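The boosting scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the ENDER algorithm as published. It assumes binary labels in {-1, +1}, the exponential loss, and rules made of a single threshold condition on one attribute, where a full rule would be a conjunction of conditions. The rule-selection criterion is the exponential loss obtained with the optimal rule response, used here as one plausible impurity measure and not claimed to be one of the four analyzed in the paper. Shrinking is included and sampling is omitted. The names `fit_rule`, `fit_ensemble` and `predict` are hypothetical.

```python
import math


def fit_rule(X, y, w, eps=1e-6):
    """Pick the single-condition rule with the lowest weighted exponential loss.

    A rule (j, t, d) covers x when d * (x[j] - t) >= 0, i.e. x[j] >= t for
    d = 1 and x[j] <= t for d = -1. Returns (j, t, d, alpha).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for d in (1, -1):
                covered = [d * (x[j] - t) >= 0 for x in X]
                w_pos = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi > 0)
                w_neg = sum(wi for wi, yi, c in zip(w, y, covered) if c and yi < 0)
                w_out = sum(wi for wi, c in zip(w, covered) if not c)
                # Loss after adding the rule with its optimal response:
                # uncovered examples keep their weight, covered ones are
                # rescaled. Low values mean a pure rule with wide coverage.
                loss = w_out + 2.0 * math.sqrt(w_pos * w_neg)
                if best is None or loss < best[0]:
                    # Optimal response for the exponential loss; eps keeps
                    # it finite when the rule covers only one class.
                    alpha = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
                    best = (loss, j, t, d, alpha)
    return best[1:]


def fit_ensemble(X, y, n_rules=10, shrinkage=0.5):
    """Add rules one at a time, reweighting examples before each new rule."""
    f = [0.0] * len(X)  # current ensemble score for each training example
    rules = []
    for _ in range(n_rules):
        # Examples the current ensemble handles badly get large weights.
        w = [math.exp(-yi * fi) for yi, fi in zip(y, f)]
        j, t, d, alpha = fit_rule(X, y, w)
        alpha *= shrinkage  # shrinking: take only part of the optimal step
        rules.append((j, t, d, alpha))
        f = [fi + alpha if d * (x[j] - t) >= 0 else fi for fi, x in zip(f, X)]
    return rules


def predict(rules, x):
    """Sum the responses of all rules covering x and return the sign."""
    score = sum(a for j, t, d, a in rules if d * (x[j] - t) >= 0)
    # An example covered by no rule has score 0 and defaults to +1 here.
    return 1 if score >= 0 else -1
```

For example, `fit_ensemble([[0.0], [1.0], [2.0], [3.0]], [-1, -1, 1, 1], n_rules=4)` returns four `(attribute, threshold, direction, response)` tuples that `predict` can apply to a new attribute vector. Each rule votes only on the examples it covers and abstains elsewhere, which is why the selection criterion has to balance purity against coverage.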
|Theme|Life Sciences (theme 5), Logistics (theme 3)|
|Journal|Data Mining and Knowledge Discovery|
Dembczynski, K., Kotlowski, W.T., & Slowinski, R. (2010). ENDER: A Statistical Framework for Boosting Decision Rules. Data Mining and Knowledge Discovery, 21(1), 59–90.