Chapter

Title

A Comparison of Two Approaches to Data Mining from Imbalanced Data

Authors

[ 1 ] Instytut Informatyki (II), Wydział Informatyki i Zarządzania, Politechnika Poznańska | [ P ] employee

Year of publication

2004

Chapter type

paper

Publication language

English

Abstract

Our objective is a comparison of two data mining approaches to dealing with imbalanced data sets. The first approach is based on saving the original rule set, induced by the LEM2 algorithm, and changing the rule strength for all rules for the smaller class (concept) during classification. In the second approach, rule induction was split: the rule set for the larger class was induced by LEM2, while the rule set for the smaller class was induced by EXPLORE, another data mining algorithm. Results of our experiments show that both approaches increase the sensitivity compared to the original LEM2. However, the difference in performance of both approaches is statistically insignificant. Thus the appropriate approach to dealing with imbalanced data sets should be selected individually for a specific data set.

Pages (from - to)

757 - 763

DOI

10.1007/978-3-540-30132-5_103

URL

https://link.springer.com/chapter/10.1007/978-3-540-30132-5_103

Book

Knowledge-Based Intelligent Information and Engineering Systems : 8th International Conference, KES 2004, Wellington, New Zealand, September 20-25, 2004, Proceedings, Part I

Presented on

8th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems KES 2004, 20-25.09.2004, Wellington, New Zealand
