Building a Scalable Data Pipeline to Integrate Oracle Fusion Cloud Data with Snowflake

Building a Scalable Data Pipeline to Integrate Oracle Fusion Cloud Data with Snowflake

AI and large language models might be dominating the spotlight, but ERP data remains the backbone for running daily operations and making critical business decisions. This is especially true for fast-growing companies that haven’t had the time or resources to craft a solid data strategy. Getting the ERP data strategy right from the start isn’t just a nice-to-have; it’s a must-have.

Oracle Fusion Cloud is one of the popular ERP systems for companies aiming to streamline their operations. Despite Oracle Fusion Cloud's maturity, integrating its data into modern cloud platforms like Snowflake can be a complex and frustrating process. After stepping into my new role as Head of Data in a fast-growing medical device company, the first challenge I needed to solve was integrating Oracle Fusion data into Snowflake. After exploring various options, we identified a reliable solution: Fivetran’s Oracle Fusion connector paired with its BI Publisher connector.

This article shares the various options we explored, and how this setup became the solution that resolved the integration challenges while addressing key pitfalls.

Challenges in Oracle Fusion Cloud Data Integration

Organizations typically evaluate several methods to extract and load Oracle Fusion Cloud data into a data lake or warehouse. However, most approaches come with significant drawbacks:

  1. Golden Gate Replication Oracle Golden Gate is a robust data replication tool but is not supported for Oracle Fusion Cloud data extraction. This option is a non-starter.
  2. Custom Code with BICC (Business Intelligence Cloud Connector) Writing custom code to work with BICC is complex and time-consuming. It demands a deep understanding of Oracle Fusion schemas, UCM manifest files and merging multiple output data files, making it impractical for teams with limited resources.
  3. BI Publisher (BIP) A common workaround is using BI Publisher to generate reports, saving them in shared folders, and moving them to a data platform using ETL/ELT tools. However, this approach is far from ideal due to: Dependency on your company's Oracle teams: Every new data requirement or schema change requires your Oracle team's involvement, slowing down the process as they will have other priorities. Complexity and Scalability Issues: Managing hundreds of table-specific reports or reports with multi-table report SQL and complex logic becomes unmanageable, especially as the data needs grow.

A Modern Solution: Fivetran Oracle Fusion Connector

Fivetran’s Oracle Fusion Cloud connector, launched in early 2024, offers a scalable, automated Extract load and Transform (ELT) way to integrate Oracle Fusion data into modern data platforms like Snowflake. By leveraging Oracle’s BICC, this connector simplifies data extraction, ensuring incremental updates and seamless schema mapping.


Using Fivetran Connectors to load from Oracle Fusion to Snowflake


Key Features of Fivetran’s Oracle Fusion Connector

  • BICC Integration Tailored Data Extraction: Extracts data directly from Oracle’s Public View Objects (PVOs), covering transactional and master data for financials, HR, supply chain, and more.
  • Incremental Updates: Synchronizes only new or updated records, reducing latency and data transfer times.
  • Primary Key Validation: Tracks deletions and ensures data integrity.
  • Oracle Universal Content Management (UCM) Integration Secure Data Staging: Data extracted via BICC is staged in UCM as CSV files.
  • Automated Retrieval: Fivetran fetches files using secure API calls, tagging metadata for accuracy.
  • Efficient Chunking: Large datasets are split into manageable chunks to optimize performance.
  • Data Validation and Schema Mapping Ensures data integrity by validating row counts and column structures. Automatically updates schemas in Snowflake to handle changes, such as added columns, without disruption.
  • Monitoring and Automation Real-Time Dashboards: Provides visibility into sync status, data volumes, and errors.
  • Self-Healing Pipelines: Retries failed operations automatically and alerts only for critical issues.

Addressing Common Challenges

While Fivetran’s Oracle Fusion Connector simplifies integration, certain challenges remain. Here’s how to mitigate them:

Incomplete Table and Column Coverage

Oracle’s BICC doesn’t expose all necessary tables and columns. To fill these gaps, use Fivetran’s BI Publisher connector selectively for missing data. Although slower for large volumes, this ensures comprehensive coverage. You can then join the two outputs using the primary keys and make the dataset whole in Snowflake.

Performance Considerations

Avoid enabling all PVOs during initial setup. Start with essential tables, then scale gradually based on requirements. Overloading the connector can degrade performance.

Cost Management

Fivetran’s pricing model is based on monthly active rows, so costs can add up if unnecessary tables are synced. Limit active tables in development environments and enable syncs only for production-critical data.

Conclusion

Fivetran’s Oracle Fusion Cloud connector, combined with its newer BI Publisher connector for supplemental data, provides a scalable solution for integrating Oracle Fusion Cloud data into cloud data lakes like Snowflake. While there are considerations around performance and cost, the approach significantly reduces complexity and enables pipeline automation that drives organizations to focus on deriving insights from their ERP data.

#dataengineering #oracle #oraclefusioncloud #snowflake #fivetran

This is a valuable topic for many tackling integration challenges. Do you have any specific strategies in mind that have worked well?

回复
Komal Sharma (She/Her)

Business Intelligence Program Manager at Keysight Technologies

2 个月

Bharath Natarajan on the limitation of BICC not having all the fields and then bringing them via BIP and joining both in snowflake. What challenges do you see in terms of data or performance.

回复
Paul Dudley

Co-Founder @ Streamkap

2 个月

Bharath Natarajan have you looked at Snowpipe Streaming for loading data into Snowflake? Great way to scale, get lower latency and lower costs!

回复

Another company called incorta ( started by ex oracle People) have an equivalent or better solution Many consulting firms like ours have integration framework with oracle fusion that provides the same functionality and it’s cheaper than fivetran

Steven Melendez

Sr. Data Analyst | Sr. IT Business Analyst

2 个月

Excellent post Bharath!

回复

要查看或添加评论,请登录

Bharath Natarajan的更多文章

社区洞察

其他会员也浏览了