Oracle Data Integration - Myths & Facts for Building a Winning Strategy
This is the era of “information”, which is built on digital data. Hence, many tools for data acquisition have been developed and become popular. They go by different names such as ETL, ELT, DI, Data Flow etc. Since the evolution of “platform as a service”, the top cloud vendors have been offering age-old data integration technology with a new mix and marketing strategy, and presenting it to customers as a new technology. They highlight it as server-less, a multi-pipeline parallel engine, in-memory data flow processing, drag & drop workflows etc.; in reality, they are only highlighting a few of the capabilities of the earlier tools.
Oracle has also had a data integration product in its portfolio for more than ten years, and many of the features being highlighted nowadays have been available for a long time. I am not going to cover them in this article (I may come up with another one), but over that period we have not been able to position it successfully as “the data integration” platform. We sell it as part of Oracle database opportunities, calling it ELT for the Oracle database, but ODI is far more than that. A few myths stop us from calling it an independent ELT/ETL tool.
Let’s talk about a few of those myths and facts about the Oracle Data Integration tool:
Myth 1: ODI needs an Oracle database exclusively for its installation.
Fact: You can install ODI on a standalone server with a very small-footprint MySQL database at no extra cost. Not only that, you can also install it using MS-SQL Server or IBM DB2. The basic concept of ODI is to use this database for repository purposes only, where you have a master repository and work repositories. A work repository can serve as a development, test or production repository, hence there can be several. The master repository needs approx. 200 MB of space and work repositories need 400 MB, as per the requirement. So decide whether you really need separate servers to host ODI for different environments; if the customer is not adamant about having multiple instances, you can use a single server for multiple environments.
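As a rough illustration of the sizing above, the repository footprint can be estimated with simple arithmetic. This is only a sketch: the function name is hypothetical, and it assumes the 400 MB figure applies to each work repository, which the text does not state explicitly.

```python
# Approximate figures quoted in the article; treat them as rough planning numbers.
MASTER_REPO_MB = 200
WORK_REPO_MB = 400  # assumption: applies to each work repository


def repository_footprint_mb(work_repositories: int) -> int:
    """Estimate total space for one master repository and N work repositories."""
    if work_repositories < 0:
        raise ValueError("work_repositories cannot be negative")
    return MASTER_REPO_MB + work_repositories * WORK_REPO_MB


if __name__ == "__main__":
    # One work repository each for development, test and production.
    print(repository_footprint_mb(3))  # 200 + 3 * 400 = 1400
```

Even with three work repositories on a single server, the estimate stays under 1.5 GB, which is why one small database can host several environments.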
Myth 2: ODI supports only Oracle DB as the target database.
Fact: Various databases and technologies are certified as targets, e.g. Oracle DB, MySQL, MS-SQL Server, Apache Hive, Apache Derby, Netezza, Teradata, Ingres, PostgreSQL, IBM DB2, SAP ERP (with ABAP, IDoc), flat files etc. Most of them are accessed through standard JDBC-compliant drivers.
Myth 3: ODI is an ELT tool, therefore it needs the target database for installation and staging.
Fact: It depends on the case. All it requires is a repository and a staging database. We have already discussed the repository database. As for the staging database, ODI can also use the source database as staging, but we usually don’t, for the obvious reason of performance impact on the source. Let’s take a few scenarios:
- You have to build a small data warehouse instance where data is pulled from multiple applications, all using ODI only, with extensive use of ETL/ELT and a huge data set, let’s say ETL accounts for almost 70-90% of the requirement. In that situation, it is a wise decision to use the target database server as the ODI staging area.
- You need to build a source integration from a few applications and load the data into a data warehouse where data may be coming from multiple sources and is loaded directly into the database using other methods. Your requirement to process data through ODI is therefore less than 20-30% of the overall data processing, or involves smaller data sets. In such a situation, you can consider a separate staging area and use ODI as ETL.
- If ETL/ELT accounts for about 50% of the total processing, you can trade off the cost of full DB licenses vs. half DB licenses against performance and processing volume.
Note: As per Oracle recommendations, ELT is faster than ETL; however, you need to choose the architecture as per the requirement and the competition.
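The three scenarios can be read as a simple rule of thumb. The sketch below is one possible encoding, not an Oracle guideline: the function name is hypothetical, and the 0.30 and 0.70 cut-offs are assumptions taken from the edges of the ranges above, since the text does not say what to do in the gaps between them.

```python
def suggest_staging(etl_share: float) -> str:
    """Map the share of processing done through ODI (0.0-1.0) to a scenario.

    The 0.30 and 0.70 cut-offs are assumptions; the article only gives
    the ranges 70-90%, below 20-30%, and about 50%.
    """
    if not 0.0 <= etl_share <= 1.0:
        raise ValueError("etl_share must be between 0.0 and 1.0")
    if etl_share >= 0.70:
        # Scenario 1: heavy ETL/ELT, large data sets.
        return "stage on the target database server (ELT)"
    if etl_share <= 0.30:
        # Scenario 2: light processing, most data loaded by other methods.
        return "use a separate staging area (ETL)"
    # Scenario 3: roughly half of the processing goes through ODI.
    return "trade off license cost against performance and volume"


if __name__ == "__main__":
    for share in (0.8, 0.2, 0.5):
        print(f"{share:.0%}: {suggest_staging(share)}")
```

In practice the decision also depends on data volume and the competitive situation, so the return value is a starting point for discussion rather than a final answer.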
Myth 4: ODI for Big Data is a different product from ODI for Database.
Fact: By now you have learnt that ODI can be installed anywhere and does not necessarily need an Oracle database as its repository, i.e. even MySQL can be used. So the same ODI that is used for databases can also be installed on top of a Big Data cluster, with MySQL as its repository and Hive or Spark as the staging platform. The only difference is licensing, as it is different for database and Hadoop. In the e-delivery portal you may find entries with different names and descriptions, but they share a common build. If you try to download both of them together, it will display a message like “already selected …..”.
Image: ODI for Database
Image: ODI for Big Data
Myth 5: ODI licensing is based on the target database.
Fact: ODI licensing depends on the underlying database where the data transformation executes; only those processors are counted in the calculation. A mapping can be executed on the source, staging or target. For example, if you don’t have a target database, i.e. the target is a file datastore, then you can use the source or staging database. So you have to count the cores of the database where the transformation executes. As mentioned in the technology price list:
“For the purposes of the following program: Data Integrator Enterprise Edition, Data Integrator Enterprise Edition for Oracle Applications, and Application Adapters for Data Integrations, the users that are running or accessing the data transformation processes must be counted for the purposes of determining the number of licenses required.”
By default, the target database is taken as the staging database, therefore we count the total cores of that database. But as I mentioned earlier, if you think your staging needs can be much smaller than the target DB, it is better to use standalone ODI in a hub-and-spoke architecture, as in scenario #2 of Myth 3, i.e. a separate staging database. This saves a lot of cost and keeps you competitive if a competitor throws heavy discounts. More details can be found at the link below (VPN required):
https://aseng-wiki.us.oracle.com/asengwiki/display/ASPMODI/Data+Integration+Pricing+and+License+FAQ
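The core-counting idea under Myth 5 can be illustrated with a small sketch. Everything here is hypothetical: the `Database` type, the function name and the core counts are made up for illustration, and the code ignores any pricing, core-factor or minimum rules from the actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    name: str
    cores: int


def cores_to_count(execution_dbs: list[Database]) -> int:
    """Sum the cores of the databases where transformations execute.

    Each database is counted once, even if several mappings run on it.
    """
    unique = {db.name: db for db in execution_dbs}
    return sum(db.cores for db in unique.values())


if __name__ == "__main__":
    # Hypothetical sizes: a large target warehouse and a small staging server.
    target = Database("target_dw", 32)
    staging = Database("staging", 8)

    # Default setup: transformations run on the target database.
    print(cores_to_count([target]))   # 32
    # Hub-and-spoke setup: transformations run on the separate staging database.
    print(cores_to_count([staging]))  # 8
```

With these made-up numbers, moving the transformations to the smaller staging database cuts the counted cores from 32 to 8, which is the cost argument made above.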