Local Data Characteristics in Learning Classifiers from Imbalanced Data

Jerzy Błaszczyński; Jerzy Stefanowski

doi:10.1007/978-3-319-67946-4_2

System Informacji Naukowej Politechniki Poznańskiej

Title

Local Data Characteristics in Learning Classifiers from Imbalanced Data

Authors

Jerzy Błaszczyński (WI) [1] [2.3] [P]
Jerzy Stefanowski (WI) [1] [2.3] [P]

[1] Instytut Informatyki, Wydział Informatyki, Politechnika Poznańska | [P] employee

Scientific discipline (Ustawa 2.0)

[2.3] Technical computer science and telecommunications

Year of publication

2018

Chapter type

chapter in a scientific monograph

Publication language

English

Abstract

Learning classifiers from imbalanced data is still one of the challenging tasks in machine learning and data mining. Data difficulty factors referring to internal and local characteristics of class distributions deteriorate the performance of standard classifiers. Many of these factors may be approximated by analyzing the neighbourhood of the learning examples and identifying different types of examples from the minority class. In this paper, we follow recent research on developing such methods for assessing the types of examples, which exploit either k-nearest neighbours or kernels. We discuss approaches to tuning the size of both kinds of neighbourhoods depending on the data set characteristics and evaluate their usefulness in a series of experiments with real-world and synthetic data sets. Furthermore, we claim that the proper analysis of these neighbourhoods could be the basis for developing new specialized algorithms for imbalanced data. To illustrate this, we study generalizations of over-sampling in pre-processing methods and neighbourhood-based ensembles.
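
The neighbourhood analysis described in the abstract can be illustrated with a minimal sketch. It labels each minority example by the share of same-class examples among its k nearest neighbours. The function name, the four labels (safe, borderline, rare, outlier) and the thresholds are assumptions chosen for illustration with k = 5; the abstract does not specify them, and the chapter's own procedure may differ.

```python
import math


def label_minority_examples(points, labels, minority, k=5):
    """Label minority examples from their k-nearest-neighbour class mix.

    Hypothetical helper. Thresholds are an assumption: with k = 5,
    4-5 same-class neighbours -> "safe", 2-3 -> "borderline",
    1 -> "rare", 0 -> "outlier".
    """
    result = {}
    for i, (p, y) in enumerate(zip(points, labels)):
        if y != minority:
            continue
        # Euclidean distance from example i to every other example.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours = dists[:k]
        same = sum(1 for _, j in neighbours if labels[j] == minority)
        ratio = same / len(neighbours)
        if ratio > 0.7:
            result[i] = "safe"
        elif ratio > 0.3:
            result[i] = "borderline"
        elif ratio > 0.1:
            result[i] = "rare"
        else:
            result[i] = "outlier"
    return result


# Toy data: a compact minority cluster, a majority cluster,
# and one minority example placed inside the majority cluster.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.5, 1.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (12, 12),
    (10.4, 10.6),
]
labels = ["min"] * 6 + ["maj"] * 6 + ["min"]
types = label_minority_examples(points, labels, minority="min")
```

In this toy data set the six clustered minority examples have only minority neighbours, while the last example is surrounded by majority examples. Tuning k (or replacing the fixed-size neighbourhood with a kernel) changes how local the analysis is, which is the trade-off the abstract refers to.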

Pages (from-to)

51-85

DOI

10.1007/978-3-319-67946-4_2

URL

https://link.springer.com/chapter/10.1007/978-3-319-67946-4_2

Book

Advances in Data Analysis with Computational Intelligence Methods

Ministry score / chapter

20
