Striving towards Near Real-Time Data Integration for Data Warehouses
2002; Springer Science+Business Media; Language: English
DOI: 10.1007/3-540-46145-0_31
ISSN: 1611-3349
Authors: Robert M. Bruckner, Beate List, Josef Schiefer
Topic(s): Data Management and Algorithms
Abstract: The amount of information available to large-scale enterprises is growing rapidly. While operational systems are designed to meet well-specified (short) response time requirements, the focus of data warehouses is generally the strategic analysis of business data integrated from heterogeneous source systems. The decision-making process in traditional data warehouse environments is often delayed because data cannot be propagated from the source system to the data warehouse in time. A real-time data warehouse aims at decreasing the time it takes to make business decisions and tries to attain zero latency between the cause and effect of a business decision. In this paper we present an architecture of an ETL environment for real-time data warehouses, which supports a continual near real-time data propagation. The architecture takes full advantage of existing J2EE (Java 2 Platform, Enterprise Edition) technology and enables the implementation of a distributed, scalable, near real-time ETL environment. Instead of using vendor-proprietary ETL (extraction, transformation, loading) solutions, which are often hard to scale and often do not support an optimization of allocated time frames for data extracts, we propose in our approach ETLets (pronounced "et-lets") and Enterprise Java Beans (EJB) for the ETL processing tasks.