Book chapter; National production; Peer-reviewed

Performance Tests in Data Warehousing ETLM Process for Detection of Changes in Data Origin

2003; Springer Science+Business Media; Language: English

10.1007/978-3-540-45228-7_14

ISSN

1611-3349

Authors

Rosilene Fernandes da Rocha, Leonardo Figueiredo Cardoso, Jano Moreira de Souza

Topic(s)

Advanced Data Storage Technologies

Abstract

In a data warehouse (DW) environment, when the operational environment does not possess, or does not want to provide, data about the changes that occurred, controls have to be implemented to detect these changes and reflect them in the DW environment. The main scenarios are: i) the impossibility of instrumenting the DBMS (triggers, transaction log, stored procedures, replication, materialized views, old and new versions of data, etc.) due to security policies, data ownership, or performance issues; ii) the lack of instrumentation resources in the DBMS; iii) the use of legacy technologies such as file systems or semi-structured data; iv) application-proprietary databases and ERP systems. In another article [1], we presented the development and implementation of a technique, derived from the comparison of database snapshots, in which we use signatures to mark and detect changes. The technique is simple and can be applied to all four scenarios above. To demonstrate the efficiency of our technique, in this article we run comparative performance tests between these approaches. We performed two benchmarks: the first using synthetic data and the second using real data from a case study in the data warehouse project developed for Rio Sul Airlines, a regional aviation company belonging to the Brazil-based Varig group. We also describe the main approaches to detecting changes in data origin.
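
The signature idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which is described in [1] and not reproduced in this record. It assumes that a signature is an MD5 hash over a row's non-key columns and that each row has a unique key. The function names (`row_signature`, `build_signatures`, `detect_changes`) and the sample column names are hypothetical.

```python
import hashlib


def row_signature(row, columns):
    """Return a fixed-size signature for the given columns of a row.

    Assumption: MD5 over the repr() of each value, joined by a separator
    byte. The paper's actual signature scheme is not stated in this record.
    """
    payload = "\x1f".join(repr(row[c]) for c in columns)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def build_signatures(rows, key, columns):
    """Map each row's key to its signature (one entry per source row)."""
    return {row[key]: row_signature(row, columns) for row in rows}


def detect_changes(old_signatures, new_rows, key, columns):
    """Compare a new snapshot against the signatures stored from the last one.

    Returns (inserted, updated, deleted, new_signatures); the first three are
    sorted lists of keys, the last is the signature map to store for next time.
    """
    new_signatures = build_signatures(new_rows, key, columns)
    inserted = sorted(k for k in new_signatures if k not in old_signatures)
    deleted = sorted(k for k in old_signatures if k not in new_signatures)
    updated = sorted(
        k
        for k, sig in new_signatures.items()
        if k in old_signatures and old_signatures[k] != sig
    )
    return inserted, updated, deleted, new_signatures


if __name__ == "__main__":
    # Hypothetical snapshots of an operational table.
    previous = [{"id": 1, "city": "Rio"}, {"id": 2, "city": "Porto Alegre"}]
    current = [{"id": 1, "city": "Rio"}, {"id": 2, "city": "Curitiba"}]
    stored = build_signatures(previous, "id", ["city"])
    print(detect_changes(stored, current, "id", ["city"])[:3])
```

Under these assumptions, only the key and a fixed-size signature per row need to be kept between extractions rather than a full copy of the previous snapshot, and the source DBMS requires no triggers or log access.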

Reference(s)