Strategy to Migrate Large Oracle Database from On-premise to Autonomous Database

In this article, you will learn a practical strategy, along with quick tips and techniques, to migrate large Oracle databases from on-premise to the cloud, especially to Oracle Autonomous Database. I will explain migrating to an ATP (Autonomous Transaction Processing) database, but the same approach works for ADW (Autonomous Data Warehouse) with a few modifications. It is particularly useful for anyone experienced with Oracle Database but less so with the cloud or OCI (Oracle Cloud Infrastructure).

Before we start, let me quickly explain a few terms. ADB (Autonomous Database) is the umbrella product term for Oracle's flagship database offering: with machine-learning-driven automated tuning, scaling, and patching, Autonomous Database delivers high performance, availability, and security. Built on the industry-proven architecture of Oracle Exadata and completely managed by Oracle in the cloud, ADB comes in several flavors, e.g. ATP (for OLTP/transactional apps) and ADW (for analytics/data-warehouse apps). There are other variants as well, e.g. AJD (Autonomous JSON Database) for JSON-centric applications and an APEX variant for APEX-only applications. All of these variants come in two deployment models: Dedicated (for large applications that require significant compute power and dedicated infrastructure) and Shared (for smaller apps that can run on, say, 1-16 OCPUs). I will be using the terms ATP, ADW and ADB accordingly.

Let's begin. Suppose you have a large Oracle database to move from on-premise, say ~1 TB. You have recently upgraded it to 19c, but it is still running in non-CDB mode (i.e. there is no CDB or PDB), and you only have normal internet connectivity to the cloud. With all these constraints, you need an appropriate migration strategy in place, so that data extraction, movement and loading do not drag on for hours, and neither does the downtime during migration. This strategy also converts the database from non-CDB to CDB/PDB (pluggable) architecture in the same go. That said, you can use the same approach for CDB-to-ATP migration as well.

As a seasoned Oracle DBA, you know Oracle has always been feature-rich in terms of tools and functionality; there has never been a dearth of options to move data from one Oracle instance to another. There are also many newer tools available for migration, e.g. ZDM (Zero Downtime Migration), DMS (Database Migration Service) or MV2ADB, covering physical and logical migration, online and offline migration, and on-premise-to-cloud migration. Both ZDM and DMS can perform the migration online, with virtually no downtime, but that requires a bit more configuration using the GoldenGate replication feature. Here I am not going to use any of those tools, but rather your familiar Export/Import (Data Pump) utility. Let me first share the environment details:

Environment:

SourceDB: Oracle Database EE 19c (non-CDB, non-RAC, Encrypted) on Oracle Enterprise Linux on-premise

TargetDB: Oracle Autonomous Transaction Processing (CDB/PDB) in OCI

Migration Tool: Data Pump (Impdp/Expdp)

Network: IPSec/VPN (optional)

Strategy Summary:

Here's a high-level summary of the approach. First, you export specific schemas with compression to the local DB server's filesystem. Then you take those big compressed dump files and use the multi-part upload feature of OCI to upload them to Object Storage. Finally, you run a Data Pump import from another OCI client instance, connected to ATP, to load the data from Object Storage into ATP.

How would you do it:

We are talking about moving data over the internet (without FastConnect), so we need to keep the dump file size (the data to be moved) as low as possible, so that the upload is as fast as possible. One way of doing that is to include only application-related objects rather than taking a full database export. Export as the SYSTEM user and specify only the specific application/data schemas you want to export, thereby excluding all SYS or SYSTEM objects, which are discarded by the Autonomous (ATP) instance anyway. In order to provide high performance and security, ADB does not allow access to SYS schema objects. Similarly, exclude all objects not supported by ADB, e.g. db_links, clusters, indextypes, etc., which also helps reduce the dump file size. (There is a new way to create database links in ATP if you still want to use them.) For ADW, you will need to exclude additional objects, as they are managed by ADW itself, e.g. index, indextype, materialized_view, materialized_view_log, materialized_zonemap, etc.
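
For example, an ADW-oriented exclude line for the export parfile (based on the list above; adjust to what your application actually needs) would look like this:

exclude=cluster,db_link,index,indextype,materialized_view,materialized_view_log,materialized_zonemap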

Additionally, I'd recommend downloading and using the Cloud Premigration Advisor Tool (CPAT) and running it against your source database before migration. You can specify the target database type as ATP or DBCS, and the generated report will tell you whether certain objects or data types will raise issues or warnings during migration, and it suggests workarounds so you can address them before even running the migration.

Export Data using Data Pump Explained:

Let me explain the export options, which ones to use and why. First is compression, which is very useful where network bandwidth is low and latency is relatively high (as is the case over the internet). However, set the compression algorithm to MEDIUM (not HIGH), so as not to put too much load on the source database resources (especially CPU). Make sure you also leverage the parallelism capability of Data Pump: set it to roughly 2 times the CPU count of the on-premise DB server, and allow multiple dump files to be created (through the %U substitution variable) for parallel writes. You may want to first run the expdp command with the estimate option to estimate the space requirements, since all files will be created on the server (a sketch follows after the parfile below). You may also try a few schemas first, to assess the time and performance impact. I'd also recommend using the parfile option, to keep all export command options cleanly in a parameter file. I have also enabled metrics, which is useful when planning a similar export again later, and the dump files are encrypted with the provided password. Do make sure you have included all related schemas, otherwise you will get errors during import for missing dependent schemas/objects. Assuming your schema data is around 1 TB, you will end up with multiple files (roughly 16) of up to 65 GB each; adjust the filesize parameter according to your data size. You want multiple files here mainly to take advantage of parallel writes; the break-up into smaller pieces for upload is handled later, at the OCI upload stage. Here's the sample export command for your reference:

expdp userid=system@dbcs parfile=expdb19c.par

Parfile: expdb19c.par

directory=data_pump_dir

schemas=abc,xyz,stu,pqr,mno

logfile=expdb19c.log

exclude=cluster,indextype,db_link

exclude=statistics

logtime=all

METRICS=YES

parallel=16

dumpfile=dumpfile%U.dmp

filesize=65gb

compression=ALL

compression_algorithm=MEDIUM

encryption_pwd_prompt=yes
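
As mentioned above, before running the full export you can get a quick size estimate. A minimal sketch, assuming the same schema list; ESTIMATE_ONLY=YES reports the estimated dump size without writing any dump files:

expdp userid=system@dbcs schemas=abc,xyz,stu,pqr,mno estimate_only=yes estimate=statistics directory=data_pump_dir logfile=estimate.log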

Also note that you will need to create, in advance in ATP, other objects that are not part of the exported schemas but that your application uses, e.g. roles, public synonyms, etc. A schema-level export excludes them, so you will need to create those objects manually before running the import job. You can write a simple script against the source DB to generate the DDL for such objects and run it in ATP (a sketch follows below). Depending on the size of your database, you will end up with a corresponding number of dump files of the chosen filesize. Do remember this is a logical migration, so it does not matter whether the source database is CDB or non-CDB; in our case it is a non-CDB database. Additionally, if the source were 21c, a new feature would let us export dump files directly into OCI Object Storage (instead of local storage).
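
Here is a minimal SQL*Plus sketch of such a script, run as a privileged user on the source database; the schema list is illustrative and should match your export, and the spooled output would then be reviewed and run as ADMIN in ATP:

set long 100000 longchunksize 100000 pagesize 0 linesize 1000 trimspool on feedback off
spool create_roles_and_synonyms.sql
-- application roles (skip Oracle-maintained ones)
select dbms_metadata.get_ddl('ROLE', role) || ';' from dba_roles where oracle_maintained = 'N';
-- public synonyms pointing at the exported schemas
select dbms_metadata.get_ddl('SYNONYM', synonym_name, 'PUBLIC') || ';'
  from dba_synonyms
 where owner = 'PUBLIC'
   and table_owner in ('ABC','XYZ','STU','PQR','MNO');
spool off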

Using OCI CLI for Multi-part Upload:

Once you have your export dump files generated on the local filesystem of the source DB server, you will next need to install the OCI CLI client on that instance. Using the OCI CLI, you can use the multi-part upload feature of OCI Object Storage. This not only provides parallelism for the upload, maximizing your internet bandwidth, but also offers pause and resume capability, which is very useful when you are uploading large files and lose the connection, since you can simply resume the upload. You can follow the link to download/install the OCI CLI client, or run the following command from the DB server's Linux shell:

bash -c "$(curl -L https://raw.githubusercontent.com/oracle/oci-cli/master/scripts/install/install.sh)"

Once installed, you can verify it with the oci -v command. Here is a simple sample command for a multi-part upload, for your reference. You need to run this for every dump file (16 in our case) generated in the export step above, or you can equally write a small for loop to pick up all the files from the local dump directory one by one and upload them in parts (a sketch follows after the command below). Here is the syntax to upload one file:

oci os object put \

--namespace MyNamespace \

--bucket-name MyBucket \

--file /home/export_dump/dumpfile01.dmp \

--name dumpfile01.dmp \

--parallel-upload-count 10 \

--part-size 50
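
And here is a minimal sketch of the loop variant mentioned above, assuming the dump files are under /home/export_dump and you want the object names in the bucket to match the file names:

for f in /home/export_dump/dumpfile*.dmp; do
  oci os object put \
    --namespace MyNamespace \
    --bucket-name MyBucket \
    --file "$f" \
    --name "$(basename "$f")" \
    --parallel-upload-count 10 \
    --part-size 50
done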

Set Up a VM with an Oracle Client:

Next, you will need to create a VM instance and install an Oracle Database client to run the Data Pump import command, since direct (shell/root) access to the Autonomous Database server is not allowed as part of the Autonomous security features. There are other ways to run Data Pump against ATP, e.g. purely from PL/SQL code or from the web-based SQL worksheet, etc., but we will use the command-line Data Pump client you are already familiar with. You can choose an Oracle Linux 8 instance, install Oracle Instant Client and establish connectivity to the ATP database. Follow this link to download/install, or run the following commands from the Linux shell:

sudo dnf install oracle-instantclient-release-el8

sudo dnf install oracle-instantclient-sqlplus

wget https://yum.oracle.com/repo/OracleLinux/OL8/oracle/instantclient21/x86_64/getPackage/oracle-instantclient-tools-21.7.0.0.0-1.el8.x86_64.rpm

sudo dnf install oracle-instantclient-tools-21.7.0.0.0-1.el8.x86_64.rpm

?

You can follow this blog post to learn the ways to connect to an ATP database, which are slightly different from regular Oracle database connections: as part of the Autonomous security features, you can only connect over a secure encrypted SSL connection using a public-key/private-key (wallet) setup. And since Data Pump runs on the server (not the client), you will need a directory defined and a credential created in ATP for authentication with the Object Store (you can refer here), so that the dump files are read directly from Object Storage rather than from a local directory.
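
A minimal sketch of that credential setup, connected to ATP as ADMIN (for example via sqlplus admin@adb_high with TNS_ADMIN pointing at the unzipped wallet); the user name and token below are placeholders, and the password is an OCI auth token, not your console password:

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'my_new_cred',
    username        => 'oci_user_name',    -- placeholder: your OCI (IAM) user name
    password        => 'auth_token_value'  -- placeholder: an auth token generated in the OCI console
  );
END;
/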

Import Data into ATP Explained:

Once you have verified that the database connectivity works, you need to build an import command that is best suited to an ATP database and provides maximum performance. Again, create a parfile with parameters like the following, e.g. omitting the encryption clauses from the source DB, since ADB encrypts everything by default as part of the Autonomous features, and letting indexes be re-created with ATP defaults. Since you are not allowed to create your own tablespaces in ADB, you need to remap all tablespaces to the DATA tablespace in ATP; again, that is because ATP manages all tablespaces and the underlying resources itself. Don't forget to use the parallelism feature here too: set it to roughly 1x to 1.5x the number of OCPUs allocated to your ATP instance. And since this is ATP, you can simply scale the instance up to a higher number of OCPUs just before starting the import, then scale back down to the original value once the import is finished (without any downtime) and pay only for those additional minutes (see the OCI CLI scaling sketch further below). You may also want to disable archive logging during the import operation (since this is a fresh target DB), to improve performance and further reduce the data loading time in ATP. Here is a sample import command for your reference:

impdp admin@adb_high parfile=impatp.par

Parfile: impatp.par

directory=data_pump_dir

credential=my_new_cred

dumpfile=https://objectstorage.ap-sydney-1.oraclecloud.com/n/xxxxxxxxxxx/b/xxxxxx/o/dumpfile%U.dmp

parallel=8

transform=dwcs_cvt_iots:y

transform=constraint_use_default_index:y

transform=segment_attributes:n

transform=omit_encryption_clause:y

remap_tablespace=%:DATA

exclude=cluster,indextype,db_link

encryption_pwd_prompt=yes

You can get the URL of a dump file by going into Object Storage, opening the details view of any uploaded object and copying the URL path (URI), then adding the dump file name at the end.
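
As mentioned above, you can temporarily scale the ATP instance up before the import and scale it back down afterwards. A minimal OCI CLI sketch, assuming an OCPU-based instance and that you have the autonomous database OCID at hand (the OCID below is a placeholder):

# scale up to 16 OCPUs before starting the import (placeholder OCID)
oci db autonomous-database update --autonomous-database-id ocid1.autonomousdatabase.oc1..example --cpu-core-count 16

# scale back down once the import has finished
oci db autonomous-database update --autonomous-database-id ocid1.autonomousdatabase.oc1..example --cpu-core-count 2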

Conclusion

So, this is one approach to migrate data from an on-premise Oracle Database to an Oracle Autonomous Transaction Processing (ATP) database in the cloud, over the regular internet, using your familiar Data Pump tool with appropriately tuned parameters, to get the best performance and speed when moving large databases. If you are looking for an even lower-downtime or zero-downtime option, you can look into alternatives like ZDM or DMS online migration, as mentioned above.

Here are some useful links to relevant documents:

How to connect to Oracle Autonomous Cloud Databases

Get Started with Autonomous Database

Cloud Premigration Advisor Tool (CPAT) Analyzes Databases for Suitability of Cloud Migration (Doc ID 2758371.1)

Doc: Import Data Using Oracle Data Pump on Autonomous Database

Blog: Loading data into Autonomous Data Warehouse using Datapump
