Abstract:
In the preprocessing step of a knowledge discovery process, the choice of discretization method can have a substantial impact on the performance and accuracy of classification algorithms. In this article, we analyze and compare expert discretization and automatic discretization algorithms. In particular, we study their impact on predicting the survival of patients in intensive care burn units. We assess the quality of the different discretization algorithms by analyzing the number of intervals generated, the number of patterns produced, and the classification performance on a specific clinical problem. Our results show that many of the algorithms underperform expert discretization and that the correlation among continuous features must be taken into account to obtain the best accuracy.