Best Practices for Real-Time Data Warehousing
Christophe D.
Hybrid cloud enabler empowering a data-driven world with optimized, secure and scalable solutions.
Integrating data and applications throughout the enterprise, and presenting a consolidated view of them, is a complex proposition. Not only are there broad disparities in data structures and application functionality, but there are also fundamental differences in integration architectures.
Some integration needs are data oriented, especially those involving large data volumes. Other integration projects lend themselves to an event-oriented architecture for asynchronous or synchronous integration. Changes tracked by Change Data Capture constitute data events. The ability to track these events and process them regularly in batches or in real time is key to the success of an event-driven integration architecture.
I must acknowledge that there is no one-size-fits-all approach when it comes to Real-Time Data Warehousing, since much depends on the latency requirements, overall data volume as well as the daily change volume, load patterns on sources and targets, as well as structure and query requirements of the data warehouse.
Oracle Data Integrator (ODI) supports all approaches of loading a data warehouse. In practice there is one approach that satisfies the majority of real-time data warehousing use cases: The micro-batch approach using GoldenGate-based Change Data Capture with Oracle Data Integrator.
In this approach, one or more tables from operational databases are used as sources for GoldenGate Change Data Capture into a staging area database. This staging area provides a real-time copy of the transactional data for real-time reporting using Business Intelligence tools and dashboards. The operational sources are not additionally stressed as GoldenGate capture is non-invasive, still allows for high performance, and the separate staging area handles operational Business Intelligence queries without adding load to the transactions in the system.
Oracle Data Integrator performs a load of the changed records to the real-time data warehouse in frequent periods of 15 minutes or more. This pattern has demonstrated the best combination of providing fresh, actionable data to the data warehouse without introducing inconsistencies in aggregates calculated in the data warehouse.
Oracle Data Integrator provides rapid implementation and maintenance for all types of integration projects.