Article

Title

GMMSampling: a new model-based, data difficulty-driven resampling method for multi-class imbalanced data

Authors

[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ P ] employee

Scientific discipline (Law 2.0)

[2.3] Information and communication technology

Year of publication

2023

Published in

Machine Learning

Journal year: 2023 | Journal volume: in press

Article type

scientific article

Publication language

English

Keywords
EN
  • Imbalanced data
  • Multi-class classification
  • Resampling methods
  • Data difficulty factors
  • Gaussian mixture model
Abstract

EN Learning from multi-class imbalanced data has so far received limited research attention. Most of the proposed methods focus only on the global class imbalance ratio. In contrast, experimental studies have demonstrated that the imbalance ratio itself is not the main difficulty in imbalanced learning. It is the combination of the imbalance ratio with other data difficulty factors, such as class overlapping or the decomposition of the minority class into various subconcepts, that significantly affects classification performance. This paper presents GMMSampling, a new resampling method that exploits information about data difficulty factors to clear class overlapping regions of majority class instances and, simultaneously, to oversample each subconcept of the minority class. The experimental evaluation demonstrated that the proposed method achieves better results in terms of G-mean, balanced accuracy, macro-AP, MCC and F-score than other related methods.
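
The abstract reports results in imbalance-aware metrics rather than plain accuracy. The sketch below is an illustration only, not code from the paper: it uses the common definitions of balanced accuracy (arithmetic mean of per-class recalls) and G-mean (geometric mean of per-class recalls), which may differ in detail from the definitions used in the article. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of per-class recalls (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return sum(recalls) / len(recalls)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1.0 / len(recalls))


# Hypothetical toy data: class "a" is the majority, "b" and "c" are minorities.
y_true = ["a"] * 8 + ["b"] + ["c"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["a"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
gm = g_mean(y_true, y_pred)

print(f"accuracy={accuracy:.3f} balanced_accuracy={bal_acc:.3f} g_mean={gm:.3f}")
```

On this toy example plain accuracy looks acceptable, while balanced accuracy and G-mean both expose that the minority classes are never recognised, which is why such metrics are typically preferred for imbalanced data.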

Date of online publication

20.11.2023

DOI

10.1007/s10994-023-06416-8

URL

https://link.springer.com/article/10.1007/s10994-023-06416-8

License type

CC BY (attribution alone)

Open Access Mode

hybrid journal

Open Access Text Version

final published version

Date of Open Access to the publication

in press

Ministry points / journal

140

Impact Factor

7.5 [List 2022]
