Towards a Cost Model to Optimize User-Defined Functions in an ETL Workflow Based on User-Defined Performance Metrics
[ 1 ] Wydział Informatyki, Politechnika Poznańska | [ 2 ] Instytut Informatyki, Wydział Informatyki, Politechnika Poznańska | [ P ] employee
2019
chapter in monograph / paper
English
- ETL workflow
- ETL execution optimization
- user-defined functions
- cost model
- parallelization
EN Today’s ETL tools provide capabilities for developing custom code as user-defined functions (UDFs) to extend the expressiveness of standard ETL operators. However, the custom code of a UDF may execute inefficiently due to a poor implementation (e.g., one that does not use parallel processing or adequate data structures). In this paper we address the problem of optimizing UDFs in data-intensive workflows and present our approach to constructing a cost model that determines the degree of parallelism for parallelizable UDFs.
13.08.2019
441 - 456
20
70