On Customer Data Deduplication: Lessons Learned from a R&D Project in the Financial Sector

Paweł Boiński; Mariusz Sienkiewicz; Bartosz Bębel; Robert Wrembel; Dariusz Gałęzowski; Waldemar Graniszewski

System Informacji Naukowej Politechniki Poznańskiej

Chapter

Title

On Customer Data Deduplication: Lessons Learned from a R&D Project in the Financial Sector

Authors

Paweł Boiński (WIiT) ^{[ 1 ][ 2.3 ][ P ]}
Mariusz Sienkiewicz (WIiT) ^{[ 2 ][ 2.3 ][ DW ]}
Bartosz Bębel (WIiT) ^{[ 1 ][ P ]}
Robert Wrembel (WIiT) ^{[ 1 ][ 2.3 ][ P ]}
Dariusz Gałęzowski
Waldemar Graniszewski

^{[ 1 ]} Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | ^{[ 2 ]} Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | ^{[ P ]} employee | ^{[ DW ]} implementation doctorate student

Scientific discipline (Ustawa 2.0)

[2.3] Information and communication technology

Publication year

2022

Chapter type

chapter in a scientific monograph / conference paper

Publication language

English

Keywords

EN

data quality
data cleaning
data deduplication pipeline

Abstract

EN Despite the fact that financial institutions (FIs) apply data governance strategies and use the most advanced state-of-the-art data management and data engineering software and systems to support their day-to-day businesses, their databases are not free from some faulty data (dirty and duplicated). In this paper, we report some conclusions from an ongoing research and development project for a FI. The goal of this project is to integrate customers’ data from multiple data sources - clean, homogenize, and deduplicate them. This paper, in particular, focuses on findings from developing customers’ data deduplication process.

URL

http://ceur-ws.org/Vol-3135/darliap_paper6.pdf

Book

Proceedings of the Workshops of the EDBT/ICDT 2022 Joint Conference, Edinburgh, UK, March 29, 2022

Presented at

Workshops of the EDBT/ICDT 2022 Joint Conference, 29.03.2022, Edinburgh, United Kingdom

License type

CC BY (attribution)

Open access mode

publisher's website

Open access text version

final published version