Hypergraph-based importance assessment for binary classification data
[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ P ] employee | [ S ] student
2023
scientific article
English
- hypergraphs
- machine learning
- imbalanced data
- random undersampling
- feature selection
EN We present a novel hypergraph-based framework for assessing the importance of elements of binary classification data. Specifically, we apply the hypergraph model to rate the relevance of data samples and categorical feature values to classification labels. The proposed Hypergraph-based Importance ratings are theoretically grounded in the concept of hypergraph cut conductance minimization. Because the hypergraph representation is lossless with respect to higher-order relationships in data, our approach allows more precise exploitation of information on feature and sample coincidences. The solution was tested in two scenarios: undersampling for imbalanced classification data and feature selection. The experimental results show that, in both scenarios, the new approach performs well compared with other state-of-the-art and baseline methods, as measured by the average precision evaluation metric.
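The two ideas the abstract relies on, a hypergraph built from categorical data and the conductance of a cut, can be illustrated with a minimal sketch. This is not the paper's Hypergraph-based Importance rating. It assumes one common construction (samples as vertices, each feature value as a hyperedge, unit weights) and the standard "all-or-nothing" cut definition. The function names `build_hypergraph` and `cut_conductance` and the toy data are hypothetical.

```python
from collections import defaultdict


def build_hypergraph(rows):
    """Assumed construction: one hyperedge per (feature index, value) pair,
    containing the indices of all samples that take that value."""
    edges = defaultdict(set)
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            edges[(j, value)].add(i)
    return dict(edges)


def cut_conductance(edges, subset, n_vertices):
    """Conductance of the cut (subset, complement) with unit hyperedge weights.

    A hyperedge is counted as cut when it has vertices on both sides.
    The volume of a side is the sum of its vertex degrees.
    """
    subset = set(subset)
    complement = set(range(n_vertices)) - subset
    cut = sum(
        1 for members in edges.values()
        if members & subset and members & complement
    )
    vol_subset = sum(len(members & subset) for members in edges.values())
    vol_complement = sum(len(members & complement) for members in edges.values())
    denom = min(vol_subset, vol_complement)
    return cut / denom if denom else float("inf")


# Toy categorical data set (illustrative only): two features, four samples.
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("blue", "large"),
]
labels = [1, 1, 0, 0]

edges = build_hypergraph(rows)
positives = [i for i, y in enumerate(labels) if y == 1]
phi = cut_conductance(edges, positives, len(rows))
print(phi)
```

In this toy example the colour hyperedges lie entirely on one side of the class split, while the size hyperedges straddle it. A lower conductance for the class-induced cut would indicate that feature values coincide more cleanly with the labels.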
25.12.2022
1657 - 1683
CC BY (attribution)
hybrid journal
final published version
before publication
100
2,5