Selective pre-processing of imbalanced data for improving classification performance

Jerzy Stefanowski; Szymon Wilk

doi:10.1007/978-3-540-85836-2_27

System Informacji Naukowej Politechniki Poznańskiej

Chapter

Authors

Jerzy Stefanowski [1][P]
Szymon Wilk [1][P]

[1] Institute of Computing Science (II), Faculty of Computing Science and Management, Poznań University of Technology | [P] employee

Publication year

2008

Chapter type

conference paper

Publication language

English

Abstract

In this paper we discuss problems of constructing classifiers from imbalanced data. We describe a new approach to selective pre-processing of imbalanced data which combines local over-sampling of the minority class with filtering difficult examples from the majority classes. In experiments focused on rule-based and tree-based classifiers we compare our approach with two related pre-processing methods: NCR and SMOTE. The results show that NCR is too strongly biased toward the minority class and leads to deteriorated specificity and overall accuracy, while SMOTE and our approach do not demonstrate such behavior. Analysis of the degree to which the original class distribution has been modified also reveals that our approach does not introduce changes as extensive as those introduced by SMOTE.
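
The abstract evaluates the pre-processing methods by specificity and overall accuracy. As a minimal sketch of how such measures are commonly computed for a two-class problem, the Python snippet below derives sensitivity, specificity and accuracy from predicted labels. The function name `binary_metrics` and the toy labels are assumptions made for illustration only; they are not taken from the paper and do not reproduce its experiments.

```python
def binary_metrics(y_true, y_pred, minority=1):
    """Return (sensitivity, specificity, accuracy).

    The `minority` label is treated as the positive class.
    Hypothetical helper for illustration, not the authors' code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # minority examples found
    specificity = tn / (tn + fp) if tn + fp else 0.0  # majority examples kept
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Made-up toy labels: 2 minority (1) and 8 majority (0) examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Predictions biased toward the minority class: both minority examples
# are found, but 4 majority examples are mislabeled as minority.
y_biased = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
# Less biased predictions: one minority example missed, one false alarm.
y_balanced = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(binary_metrics(y_true, y_biased))    # (1.0, 0.5, 0.6)
print(binary_metrics(y_true, y_balanced))  # (0.5, 0.875, 0.8)
```

In this toy case the predictions biased toward the minority class reach perfect sensitivity at the cost of lower specificity and accuracy, which is the kind of trade-off the abstract attributes to NCR.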

Pages (from-to)

283-292

URL

https://link.springer.com/chapter/10.1007/978-3-540-85836-2_27

Book

Data Warehousing and Knowledge Discovery : 10th International Conference, DaWaK 2008 Turin, Italy, September 2008 : proceedings

Presented at

10th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2008, 2-5 September 2008, Turin, Italy
