On reasoning about black-box UDFs by classifying their performance characteristics
[ 1 ] Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ 2 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ SzD ] doctoral student at the Doctoral School | [ P ] employee
2024
chapter in a scientific monograph / conference paper
English
- data integration process
- user defined function
- time series
- time series similarity measure
- time series classification
EN User-defined functions (UDFs) are frequent components of SQL queries and data processing workflows (DPWs). In both applications, UDFs are often available only as black boxes, i.e., their semantics and performance characteristics are unknown (such functions are hereafter called BBUDFs). This prevents the optimization of query execution plans and of whole DPWs. Discovering the semantics of a BBUDF is often impossible due to the high complexity of its code. In contrast, discovering its performance model seems feasible with the support of machine learning. In this paper, we present a solution for classifying BBUDFs into performance classes. If the performance class of a given BBUDF is known, it may be possible to reason about some of the BBUDF's hidden features. Our solution is supported by an experimental evaluation, which shows that our initial approach, in multiple cases, classifies BBUDFs into adequate performance classes.
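The idea of assigning a black-box function to a performance class can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not list the performance classes or the learning technique, so the reference shapes, the Euclidean nearest-curve rule, and all names below (`REFERENCE_SHAPES`, `normalize`, `classify_profile`, `profile`) are assumptions chosen for illustration only. The sketch treats the measured cost over growing input sizes as a short time series and picks the reference curve it resembles most.

```python
import math
import time

# Hypothetical performance classes; the abstract does not enumerate the real ones.
REFERENCE_SHAPES = {
    "constant": lambda n: 1.0,
    "linear": lambda n: float(n),
    "linearithmic": lambda n: n * math.log2(n),
    "quadratic": lambda n: float(n) ** 2,
}


def normalize(series):
    """Scale a series so its peak is 1, leaving only its shape."""
    peak = max(series)
    return [value / peak for value in series]


def classify_profile(sizes, costs):
    """Return the class whose normalized curve is nearest (Euclidean) to the measured one."""
    measured = normalize(costs)
    best_name, best_dist = None, math.inf
    for name, shape in REFERENCE_SHAPES.items():
        reference = normalize([shape(n) for n in sizes])
        dist = math.dist(measured, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


def profile(udf, make_input, sizes, repeats=3):
    """Record the best-of-`repeats` wall-clock time of a black-box callable per input size."""
    costs = []
    for n in sizes:
        data = make_input(n)
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            udf(data)
            timings.append(time.perf_counter() - start)
        costs.append(min(timings))
    return costs
```

A call such as `classify_profile(sizes, profile(some_udf, make_input, sizes))` would then yield a class label for a hypothetical `some_udf`. Wall-clock measurements are noisy, so a label obtained this way is only a rough hint; a simple distance rule like this one stands in for whatever trained classifier a real system would use.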
20
140