A process mining approach for discovering ETL black points

By Belo, O.; Dias, N.; Ferreira, C.; Pinto, F.

Advances in Intelligent Systems and Computing

2017

Abstract

ETL tasks are complex and often lead to an equally complex network of working processes. Many of the difficulties in developing them stem from the number of information sources involved, the heterogeneity and dispersion of the data, and the complexity of the tasks that must be implemented to populate a data warehouse appropriately. As a result, undesirable situations caused by ETL system design errors, or by faulty or inefficient task implementations, arise easily. Many of these situations are detectable only at run time. In this paper, we discuss in particular the case of ETL bottlenecks (ETL black points), which can occur during the execution of an ETL system, and we identify and characterize them using process mining. Based on the analysis of the process mining results, it is possible to develop alternative implementations for inefficient tasks and improve overall system performance.
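
To illustrate the general idea of locating a bottleneck from execution data, the following is a minimal sketch and not the authors' method. It assumes a hypothetical event log of (run, task, start, end) records, with invented task names and timings, and flags as a candidate black point any task whose mean duration accounts for at least a chosen share of the summed mean task durations. The `share` threshold is an illustrative assumption, not a value taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event log: (run_id, task, start_second, end_second).
# Task names and timings are invented for illustration only.
EVENT_LOG = [
    ("run1", "extract_orders", 0, 40),
    ("run1", "clean_orders", 40, 70),
    ("run1", "load_fact_sales", 70, 310),
    ("run2", "extract_orders", 0, 50),
    ("run2", "clean_orders", 50, 90),
    ("run2", "load_fact_sales", 90, 350),
]


def mean_durations(log):
    """Return the mean duration of each task across all recorded runs."""
    durations = defaultdict(list)
    for _run, task, start, end in log:
        durations[task].append(end - start)
    return {task: mean(values) for task, values in durations.items()}


def black_points(log, share=0.5):
    """Return tasks whose mean duration is at least `share` of the
    summed mean durations (an assumed, simplistic bottleneck criterion)."""
    means = mean_durations(log)
    total = sum(means.values())
    if total == 0:
        return []
    return sorted(task for task, d in means.items() if d / total >= share)


if __name__ == "__main__":
    print(mean_durations(EVENT_LOG))
    print(black_points(EVENT_LOG))
```

A real process mining analysis would typically work on a richer event log and a discovered process model, including waiting times between tasks; this sketch only covers per-task service time.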
