Granular approximations: A novel statistical learning approach for handling data inconsistency with respect to a fuzzy relation
[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ P ] employee
2023
scientific article
English
- Inconsistencies in data
- Fuzzy logic
- Statistical learning
- Rough sets
Inconsistency in classification and regression problems occurs when instances that are related in a certain way on the condition attributes do not follow the same relation on the decision attribute. It typically arises from perturbations in the data caused by incomplete knowledge (missing attributes) or by random effects during data generation (instability in the assessment of decision attribute values). Inconsistencies with respect to a crisp preorder relation (expressing either dominance or indiscernibility between instances) can be handled with set-theoretic approaches such as rough sets, or with statistical/machine learning approaches that involve optimization methods. In particular, the Kotłowski-Słowiński (KS) approach relabels the objects in a dataset so that inconsistencies are removed and the new class labels are as close as possible to the original ones in terms of a given loss function. In this paper, we generalize the KS approach to handle inconsistency determined by a fuzzy preorder relation rather than a crisp one. The method produces a consistent fuzzy relabeling of the instances and may be used as a preprocessing tool with algorithms for binary classification and regression. Because the obtained fuzzy sets can be represented as unions of meaningful simple fuzzy sets, or granules, we call them granular approximations. We provide statistical foundations for our method, develop appropriate optimization procedures, provide didactic examples, and prove several important properties.
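To make the crisp setting described in the abstract concrete, the following is a minimal sketch of KS-style relabeling for binary labels under a dominance relation and 0-1 loss. It is an illustration only, not the optimization procedure developed in the paper: it enumerates all labelings by brute force, which is feasible only for a handful of instances, and it does not cover the fuzzy generalization. The function names (`dominates`, `is_consistent`, `relabel`) and the toy data are hypothetical.

```python
from itertools import product


def dominates(a, b):
    """True if instance a is at least as good as b on every condition attribute."""
    return all(x >= y for x, y in zip(a, b))


def is_consistent(X, labels):
    """A labeling is consistent if no instance dominates another
    while carrying a strictly lower label."""
    n = len(X)
    return all(
        labels[i] >= labels[j]
        for i in range(n)
        for j in range(n)
        if dominates(X[i], X[j])
    )


def relabel(X, y):
    """Brute-force search (assumption: tiny datasets only) for a consistent
    binary labeling with the smallest 0-1 loss relative to y."""
    best, best_loss = None, None
    for cand in product((0, 1), repeat=len(y)):
        if not is_consistent(X, cand):
            continue
        loss = sum(c != t for c, t in zip(cand, y))
        if best_loss is None or loss < best_loss:
            best, best_loss = list(cand), loss
    return best, best_loss


# Toy data: (3, 3) dominates every other instance but is labeled 0.
X = [(1, 1), (2, 2), (3, 3), (2, 3)]
y = [0, 1, 0, 1]
new_labels, loss = relabel(X, y)
print(new_labels, loss)
```

In this toy dataset the only inconsistency comes from the instance (3, 3), so changing its single label restores consistency at a loss of 1. For realistic dataset sizes the exhaustive search would need to be replaced by a proper optimization method.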
01.02.2023
249 - 275
200