Chapter

Title

Dealing with Data Difficulty Factors While Learning from Imbalanced Data

Authors

[ 1 ] Instytut Informatyki, Wydział Informatyki, Politechnika Poznańska | [ P ] employee

Year of publication

2016

Chapter type

chapter in monograph

Publication language

English

Abstract

Learning from imbalanced data is still one of the challenging tasks in machine learning and data mining. We discuss the following data difficulty factors, which deteriorate classification performance: decomposition of the minority class into rare sub-concepts, overlapping of classes, and distinguishing different types of examples. New experimental studies showing the influence of these factors on classifiers are presented. The paper also includes critical discussions of methods for their identification in real-world data. Finally, open research issues are stated.

Pages (from - to)

333 - 363

DOI

10.1007/978-3-319-18781-5_17

URL

https://link.springer.com/chapter/10.1007/978-3-319-18781-5_17

Book

Challenges in Computational Statistics and Data Mining
