GP-MOMS BASED CLASSIFICATION OF MULTI-CLASS IMBALANCED DATA
Predicting rarely occurring patterns is challenging but crucial for several real-world applications, such as healthcare and fraud detection. For datasets with an imbalanced class distribution, however, traditional machine learning techniques focus mainly on frequently occurring patterns and perform poorly when classifying instances of the underrepresented (minority) classes. Moreover, most research in this field considers only binary classification, whereas several applications of interest involve multiple classes, and learning from multi-class imbalanced datasets is much more complex than learning from bi-class ones. This work therefore addresses multi-class imbalanced data classification through a generic framework suitable for all application areas. First, it extends the bi-class evaluation measures to multi-class datasets for unbiased performance analysis. Second, it proposes GP-MOMS, an approach based on sampling and Genetic Programming, for efficient classification of multi-class imbalanced data, especially the rare patterns. A performance comparison with related benchmark techniques on standard datasets, presented in this work, demonstrates the efficacy of the proposed approach.
Keywords: Classification, Minority Classes, Imbalanced Datasets, Multi-Class, Genetic Programming, Sampling.
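
To illustrate why bi-class measures need a multi-class extension, the following minimal sketch compares plain accuracy with two class-balanced measures on a toy three-class problem. The abstract does not state which measures are extended, so the choice here (macro-averaged recall and the geometric mean of per-class recalls) is an assumption, and all function names are hypothetical rather than taken from GP-MOMS.

```python
from collections import Counter
from math import prod


def accuracy(y_true, y_pred):
    """Fraction of instances classified correctly (dominated by majority classes)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall of each class: correctly predicted instances / actual instances."""
    actual = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / actual[c] for c in actual}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recalls, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; zero if any class is missed entirely."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Toy data (assumed): one majority class "a" and two rare classes "b" and "c".
y_true = ["a"] * 8 + ["b", "c"]
y_pred = ["a"] * 10  # a classifier that always predicts the majority class

print(accuracy(y_true, y_pred))      # 0.8
print(macro_recall(y_true, y_pred))  # about 0.33
print(g_mean(y_true, y_pred))        # 0.0
```

A classifier that ignores both rare classes still reaches 80% accuracy here, while the class-balanced measures expose the failure. This is the kind of bias that multi-class evaluation measures are meant to remove.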