Chapter

Title

Rule Discovery for (Semi-)automatic Repairs of ETL Processes

Authors

[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ P ] employee

Scientific discipline (Law 2.0)

[2.3] Information and communication technology

Year of publication

2020

Chapter type

chapter in monograph / paper

Publication language

English

Keywords
EN
  • data source evolution
  • ETL process repair
  • Case-Based-Reasoning
  • rule discovery from cases
Abstract

EN A data source integration layer, commonly called extract-transform-load (ETL), is one of the core components of information systems. It is applicable to standard data warehouse (DW) architectures as well as to data lake (DL) architectures. The ETL layer runs processes that ingest, transform, integrate, and upload data into a DW or DL. The ETL layer is not static, since the data sources being integrated by this layer change their structures. As a consequence, an already deployed ETL process stops working and needs to be re-designed (repaired). Companies typically have deployed from thousands to hundreds of thousands of ETL processes. For this reason, a technique and software support for semi-automatically repairing failed ETL processes is of vital practical importance. This problem has been only partially solved by technology or research, and the existing solutions still require immense work from an ETL administrator. Our solution is based on case-based reasoning combined with repair rules. In this paper, we contribute a method for automatic discovery of repair rules from a stored history of repair cases.

Date of online publication

12.08.2020

Pages (from - to)

250 - 264

DOI

10.1007/978-3-030-57672-1_19

URL

https://link.springer.com/chapter/10.1007/978-3-030-57672-1_19

Book

Databases and Information Systems : 14th International Baltic Conference, DB&IS 2020, Tallinn, Estonia, June 16–19, 2020 : Proceedings

Presented on

14th International Baltic Conference on Databases and Information Systems DB&IS 2020, 16-19.06.2020, Tallinn, Estonia

Ministry points / chapter

20

Ministry points / conference (CORE)

70
