Using relational algebra on the specification of real world ETL processes

By Santos, V.; Belo, O.

Proceedings - 15th IEEE International Conference on Computer and Information Technology, CIT 2015, 14th IEEE International Conference on Ubiquitous Computing and Communications, IUCC 2015, 13th IEEE International Co



Modeling the Extract-Transform-Load (ETL) processes of a data warehousing system has always been a challenge. The heterogeneity of the sources, the quality of the data obtained, and the conciliation process are some of the issues that must be addressed in the design phase of this critical component. Commercial ETL tools often provide proprietary diagrammatic components and non-standard modeling languages, and therefore do not offer the ideal separation between a modeling platform and an execution platform. This separation, combined with the use of standard notations and languages, is critical in a system that tends to evolve over time and that should not be undermined by a normally expensive tool that becomes an unsatisfactory component. In this paper we demonstrate the application of Relational Algebra as a modeling language for an ETL system, as an effort to standardize operations and provide a basis for uncommon ETL execution platforms.
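
To illustrate the general idea of specifying an ETL step with relational algebra, the following is a minimal sketch and not the authors' specification. The relations `customers` and `countries`, their attributes, and the helper functions `select`, `project`, and `natural_join` are all hypothetical names introduced here for illustration. The step it models is π_{cust_id, name, country}(σ_{country_code ≠ null}(customers) ⋈ countries).

```python
from typing import Callable, Dict, List

# A relation is modeled as a list of rows; each row maps attribute -> value.
Row = Dict[str, object]
Relation = List[Row]


def select(rel: Relation, pred: Callable[[Row], bool]) -> Relation:
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in rel if pred(row)]


def project(rel: Relation, attrs: List[str]) -> Relation:
    """Projection (pi): keep only the given attributes, dropping duplicates."""
    seen = set()
    out: Relation = []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: row[a] for a in attrs})
    return out


def natural_join(left: Relation, right: Relation) -> Relation:
    """Natural join: combine rows that agree on all shared attributes."""
    out: Relation = []
    for l_row in left:
        for r_row in right:
            common = l_row.keys() & r_row.keys()
            if all(l_row[k] == r_row[k] for k in common):
                out.append({**l_row, **r_row})
    return out


# Hypothetical source relation (extract stage).
customers: Relation = [
    {"cust_id": 1, "name": "Ana", "country_code": "PT"},
    {"cust_id": 2, "name": "Bob", "country_code": None},
    {"cust_id": 3, "name": "Eva", "country_code": "ES"},
]

# Hypothetical lookup relation used during conciliation.
countries: Relation = [
    {"country_code": "PT", "country": "Portugal"},
    {"country_code": "ES", "country": "Spain"},
]

# Transform stage written as a composition of algebra operators:
# filter out rows with a missing code, join with the lookup, project to the target schema.
cleaned = select(customers, lambda r: r["country_code"] is not None)
dim_customer = project(natural_join(cleaned, countries), ["cust_id", "name", "country"])

print(dim_customer)
```

Because the step is only a composition of standard operators, the same expression could in principle be mapped onto different execution platforms, which is the kind of separation between modeling and execution that the abstract argues for.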


