Masterclass on Oracle DataPump Utility [Part-III]: expdp/impdp

Overview

This article delves into the Oracle Data Pump utility, a powerful tool for exporting and importing data in Oracle databases. It is structured into four parts to provide a comprehensive understanding of this utility, from basic concepts to advanced techniques:


  • Part 1: Introduction, Concept, Pre-checks, and Mandatory Parameters – This section covers the fundamental aspects of Oracle Data Pump, including its purpose, essential concepts, pre-checks required before initiating Data Pump jobs, and a detailed overview of mandatory parameters.
  • Part 2: Expdp – Focuses on the expdp utility, detailing the syntax, options, use cases, and practical examples to help with various export scenarios.
  • Part 3: Impdp – Dives into the impdp utility, providing guidance on importing data, different options available, and handling common import challenges.
  • Part 4: Performance Tuning Techniques for Data Pump Jobs – Explores optimization techniques for speeding up Data Pump jobs, covering parallelism, tuning parameters, and strategies for handling large data volumes effectively.


By the end of this series, readers will gain a thorough understanding of the Oracle Data Pump utility, enabling them to manage data migrations and backups with confidence.


Link for Part I: https://www.dhirubhai.net/pulse/masterclass-oracle-datapump-utility-part-1-expdpimpdp-aman-pandey-rb7ac/?trackingId=g1wNGzVSQMW9C3B8NVXQfQ%3D%3D

Link for Part II: https://www.dhirubhai.net/pulse/masterclass-oracle-datapump-utility-part-ii-aman-pandey-ilguc/?trackingId=bVH%2FNG6WQUqNdR2OIg7t6w%3D%3D


Introduction to IMPDP

The impdp (Import Data Pump) utility is a robust Oracle tool used to import data and metadata from Data Pump dump files into an Oracle database. Serving as the companion to expdp (Export Data Pump), impdp is invaluable for database migrations, data restoration, and transferring data across environments.

Key Features of IMPDP:

  • Server-Side Processing: Runs directly on the Oracle server, which speeds up data transfer and reduces network load.
  • Parallel Processing: Supports parallelism to accelerate imports, allowing multiple worker processes to load data concurrently.
  • Object Remapping: Offers options like REMAP_SCHEMA and REMAP_TABLESPACE to map objects to different schemas or tablespaces during import.
  • Selective Importing: Parameters like INCLUDE, EXCLUDE, and QUERY allow for fine-grained control over the data and objects being imported.
  • Flexibility with Existing Objects: The TABLE_EXISTS_ACTION parameter provides options to handle pre-existing tables (e.g., skip, replace, append, or truncate).
  • Advanced Logging and Monitoring: Real-time job monitoring, logging, and the ability to pause, resume, or stop jobs make it easy to manage complex imports.


How IMPDP Works

The impdp (Import Data Pump) utility operates by importing data and metadata from dump files into an Oracle database. Here’s a step-by-step breakdown of its process:

  1. Job Initialization: When the impdp command is executed, Oracle Data Pump creates a master table in the user’s schema to manage and track the import job. This master table holds metadata about the job, such as the progress, status, and details of the objects being imported.
  2. Interaction with Data Pump API: impdp utilizes the Oracle Data Pump API, allowing it to communicate directly with the database server and manage the import operation efficiently. Server-side processing allows impdp to read from dump files directly on the server, speeding up the import by minimizing data transfer across the network.
  3. Worker Process Creation: Based on the PARALLEL parameter, multiple worker processes (known as slaves) are spawned to handle concurrent loading of data. Each worker process is responsible for specific tasks, such as reading data or loading objects, to accelerate the import.
  4. Loading Data and Metadata: impdp reads data and metadata from the specified dump files and reconstructs the database objects (tables, indexes, constraints, etc.) in the target database. During this phase, any specified parameters (e.g., REMAP_SCHEMA, REMAP_TABLESPACE, INCLUDE, EXCLUDE) are applied to control the scope and location of the imported data.
  5. Monitoring and Real-Time Management: Oracle allows monitoring and managing the import job in real-time. DBAs can use commands like STATUS, STOP_JOB, START_JOB, and KILL_JOB to view job progress, pause, resume, or terminate the job if needed. Information about active jobs can also be queried from views like DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS.
  6. Completion and Cleanup: Once the import completes, the master table is automatically dropped, removing job-related metadata from the user’s schema. impdp also generates a log file (if specified) detailing the objects imported, any errors encountered, and the job summary, which is essential for post-import review and troubleshooting.
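Steps 1 and 5 above can be observed directly from SQL*Plus while a job is running. Below is a minimal sketch, assuming the import runs under the current user with a default job name such as SYS_IMPORT_SCHEMA_01:

-- Locate the master table created for the running import job --
SELECT table_name FROM user_tables WHERE table_name LIKE 'SYS_IMPORT%';

-- Check the name and state of the job it belongs to --
SELECT job_name, state FROM user_datapump_jobs;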


Basic Syntax and Parameters of impdp

The impdp (Import Data Pump) utility is used to import data and metadata from dump files into an Oracle database. Understanding its syntax and core parameters is essential for managing data imports effectively. Below is the basic syntax and explanation of commonly used parameters:

Syntax

The impdp command is executed from the command line with the following structure:

impdp user/password@database DUMPFILE=export_data.dmp DIRECTORY=dpump_dir1 [other parameters]        

  • user/password@database: Specifies the user credentials and the target database. The user must have the necessary privileges to import data.
  • DUMPFILE: Specifies the name of the dump file(s) to be imported.
  • DIRECTORY: Defines the directory object pointing to the physical directory where the dump file is located.


Key Parameters in impdp


DUMPFILE (Mandatory)

  • Purpose: Names the dump file(s) containing the data to be imported.
  • Details: Multiple dump files can be specified with wildcards (e.g., export_data%U.dmp) for parallel processing.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp SCHEMAS=hr        

LOGFILE (Optional but recommended)

  • Purpose: Specifies the name of the log file for recording import details, errors, and a summary of the import operation.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp LOGFILE=import_log.log SCHEMAS=hr        

SCHEMAS, TABLES, or FULL (One Required)

  • Purpose: Defines the scope of the import.
  • Options:

# SCHEMAS: Imports all objects from the specified schemas
impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp SCHEMAS=hr

# TABLES: Imports specific tables and related objects
impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp TABLES=hr.employees,hr.departments

# FULL: Imports the entire database, including all schemas and objects
impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp FULL=Y

TABLE_EXISTS_ACTION (Optional)

  • Purpose: Specifies how to handle existing tables during import when there is a name conflict.
  • Options: SKIP: Ignores existing tables (default behavior). REPLACE: Drops and recreates existing tables. APPEND: Adds data to existing tables. TRUNCATE: Deletes existing rows before importing data.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp TABLES=employees TABLE_EXISTS_ACTION=REPLACE        

REMAP_SCHEMA (Optional)

  • Purpose: Imports objects into a different schema than the one they were exported from.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp REMAP_SCHEMA=hr:hr_bkp        

  • Use Case: Useful for cloning data to a new schema without affecting the original schema.

REMAP_TABLESPACE (Optional)

  • Purpose: Maps objects from one tablespace to another during import.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp REMAP_TABLESPACE=users:users_tbs        

  • Use Case: Helpful when the target database’s tablespace structure differs from the source.

PARALLEL (Optional)

  • Purpose: Specifies the number of active worker processes to speed up the import operation.
  • Details: For best performance, provide at least as many dump files as the PARALLEL degree (for example, via the %U wildcard) so that every worker process has a file to read.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data%U.dmp PARALLEL=4 FULL=Y        

Additional (Optional) IMPDP Parameters

SQLFILE

  • Purpose: Generates an SQL script containing all the DDL statements that would have been executed during the import, without actually performing the import.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp SQLFILE=ddl_statements.sql SCHEMAS=hr        

  • Use Case: This is helpful for reviewing the DDL before importing or for generating object creation scripts for testing or deployment purposes. The script includes statements to recreate tables, indexes, constraints, and other objects.

CONTENT

  • Purpose: Controls what data is imported by specifying if only metadata (object definitions), only data (table rows), or both should be imported.
  • Options: ALL (default): Imports both metadata and data. DATA_ONLY: Imports only table rows without object definitions. METADATA_ONLY: Imports only object definitions without table data.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp CONTENT=METADATA_ONLY SCHEMAS=hr        

  • Use Case: CONTENT=METADATA_ONLY can be useful if you need to recreate database structures without data, while DATA_ONLY is valuable for restoring just the data in existing structures.

NETWORK_LINK

  • Purpose: Allows for a direct import from a remote database without creating dump files.
  • Example:

impdp user/password DIRECTORY=dpump_dir1 NETWORK_LINK=remote_db_link SCHEMAS=hr        

  • Use Case: Ideal for migrating data between databases over a database link, eliminating the need for intermediate dump files. This can be a faster, space-saving alternative for data migrations.
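If no database link exists yet, one can be created on the target database beforehand. A minimal sketch, where remote_user, its password, and the TNS alias remote_tns are assumptions for illustration:

CREATE DATABASE LINK remote_db_link CONNECT TO remote_user IDENTIFIED BY remote_pwd USING 'remote_tns';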


Handling Large Imports with impdp

When importing large datasets, impdp (Import Data Pump) offers several parameters to manage and optimize the import process, helping to reduce time, manage storage, and avoid interruptions. Here are some effective techniques for handling large imports:

Use PARALLEL Parameter

  • Purpose: Runs multiple worker processes simultaneously, which can significantly speed up large imports.
  • Recommendation: Set PARALLEL equal to the number of CPU cores or I/O channels available on the server for balanced performance. Ensure that multiple dump files are available (DUMPFILE=large_export%U.dmp) to allow parallel processing.
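For example, a large schema import spread across multiple dump files might look like this (file names, schema, and degree are illustrative):

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=large_export%U.dmp PARALLEL=8 SCHEMAS=hr LOGFILE=large_import.log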

Set TRANSFORM=SEGMENT_ATTRIBUTES:N

  • Purpose: Disables segment attributes during import, preventing storage settings from being applied and potentially reducing import time.
  • Use Case: Useful when importing data into a database with different storage structures or when default storage settings are sufficient, making the import faster.
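A hedged example, reusing the illustrative dump file and schema from earlier sections:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp SCHEMAS=hr TRANSFORM=SEGMENT_ATTRIBUTES:N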

Use DATA_OPTIONS=SKIP_CONSTRAINT_ERRORS

  • Purpose: Allows the import job to skip rows that cause constraint violations, helping to avoid interruptions during large data imports.
  • Use Case: Ideal when constraints can be checked post-import, allowing the job to complete without manually resolving constraint errors mid-import.
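For example, appending rows into an existing table while skipping rows that violate constraints (the table name is illustrative):

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp TABLES=hr.employees TABLE_EXISTS_ACTION=APPEND DATA_OPTIONS=SKIP_CONSTRAINT_ERRORS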

Pre-Estimate Import Size with expdp ESTIMATE_ONLY=YES

  • Purpose: Calculates the expected data size before the job is run, ensuring adequate storage is available. Note that ESTIMATE_ONLY is an expdp parameter, so the estimate is taken on the source database; impdp itself offers the ESTIMATE parameter (BLOCKS or STATISTICS) for network-mode imports.
  • Use Case: Useful for large imports where available storage is limited, helping to prevent job failures due to insufficient space.
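For example, a source-side size check with expdp (schema name is illustrative):

expdp user/password DIRECTORY=dpump_dir1 SCHEMAS=hr ESTIMATE_ONLY=YES ESTIMATE=BLOCKS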

Monitor and Manage with STATUS and Views

  • Purpose: Monitors real-time progress using STATUS and views like DBA_DATAPUMP_JOBS.
  • Use Case: Checking progress with STATUS (e.g., STATUS=60 for an update every 60 seconds, as shown in the next section) helps identify bottlenecks and manage resources dynamically during a large import.


Monitoring and Managing IMPDP Jobs

Monitoring and managing impdp jobs effectively is crucial for ensuring that import operations run smoothly and for troubleshooting any issues that may arise during the process. Here are key methods and tools for managing impdp jobs, along with examples:

Using the STATUS parameter:

  • Purpose: Displays the progress of the import job, providing a snapshot of the job status at regular intervals.
  • Example: Monitor job progress every 60 seconds:

impdp user/password DIRECTORY=dpump_dir1 DUMPFILE=export_data.dmp SCHEMAS=hr STATUS=60        

Monitoring with Data Pump Views:

  • Views like DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS provide real-time information about active Data Pump jobs.
  • Example: Check the status of all active Data Pump jobs:

SELECT owner_name, job_name, operation, job_mode, state FROM dba_datapump_jobs;        
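To see the database sessions behind an active job, DBA_DATAPUMP_SESSIONS can be joined to V$SESSION. A minimal sketch:

SELECT s.sid, s.serial#, d.owner_name, d.job_name FROM v$session s JOIN dba_datapump_sessions d ON s.saddr = d.saddr;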

Interactive Command-Line Commands:

  • After starting an import job, you can attach to it interactively using the ATTACH parameter to manage the job.
  • Example: Attach to an existing Data Pump job:

impdp user/password ATTACH=hr_import_job        

  • Details: Once attached, you can use commands like STOP_JOB, START_JOB, KILL_JOB, and STATUS interactively to control the job.

-- Pause a running import job --
Import> STOP_JOB=IMMEDIATE

-- Resume the job later (after re-attaching to it with ATTACH) --
Import> CONTINUE_CLIENT

-- Forcefully terminate a job --
Import> KILL_JOB        

  • Notes: STOP_JOB is useful when you need to temporarily pause a job to free up system resources; the job can be resumed later without data loss. KILL_JOB is a last resort when a job is stuck and cannot be stopped gracefully; it ensures that the job does not leave behind any remnants that could affect future Data Pump operations.

Reviewing the Import Log File:

  • Purpose: Captures detailed information on each object processed, errors encountered, and a summary of the entire import.
  • Details: The log file is essential for troubleshooting, as it provides a complete record of the import job, including messages about skipped or modified objects, errors, and any issues related to constraints or storage.
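For a quick post-import check, the log can be scanned for Oracle errors from the OS prompt. A sketch, assuming the log was written as import_log.log and the command is run from the OS path that DIRECTORY=dpump_dir1 points to:

grep "ORA-" import_log.log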


Common Import Errors and Troubleshooting Tips

Here are some common errors encountered during impdp (Import Data Pump) operations and strategies to troubleshoot them effectively. Knowing these issues and resolutions can help avoid interruptions and ensure a smoother import process.

Error: ORA-39002: invalid operation

  • Cause: This error typically occurs if the required DIRECTORY parameter is missing or improperly specified.
  • Resolution: Verify that the DIRECTORY parameter points to an existing directory object with correct permissions.

Error: ORA-39087: directory name is invalid

  • Cause: The specified directory object may not exist or the user lacks the necessary read/write privileges.
  • Resolution: Check the directory name in DBA_DIRECTORIES to confirm it exists and ensure permissions are granted.
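A quick way to verify both conditions (the directory object name and the grantee are illustrative):

-- Confirm the directory object exists and note its OS path --
SELECT directory_name, directory_path FROM dba_directories WHERE directory_name = 'DPUMP_DIR1';

-- Grant access to the importing user if it is missing --
GRANT READ, WRITE ON DIRECTORY dpump_dir1 TO hr;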

Error: ORA-39166: Object already exists

  • Cause: This error arises when an object already exists in the target database, and the import job attempts to create it again.
  • Resolution: Use the TABLE_EXISTS_ACTION parameter to specify how to handle existing tables (e.g., SKIP, REPLACE, APPEND, TRUNCATE).

Error: ORA-31693: Table data object failed to load/unload

  • Cause: This typically happens when there is an issue loading data due to constraints, lack of space, or corrupted data.
  • Resolution: Check the log file for specific details on the error. If space is the issue, ensure there is enough tablespace or use the DATA_OPTIONS=SKIP_CONSTRAINT_ERRORS parameter to bypass constraint violations.

Error: ORA-39014: One or more workers have prematurely exited

  • Cause: This error can happen due to network issues, resource limitations, or corruption in dump files.
  • Resolution: Check the log file for further information. If the job is using PARALLEL, try reducing the value to lessen resource demand.

Error: ORA-39181: Only partial data imported due to fine-grained access control

  • Cause: Occurs when importing data into a schema protected by Virtual Private Database (VPD) or Fine-Grained Access Control (FGAC).
  • Resolution: Import using a privileged user or disable FGAC/VPD policies temporarily during the import, if allowed.
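One hedged option, if policy exemption is acceptable in your environment, is to grant the importing user the EXEMPT ACCESS POLICY privilege so VPD/FGAC policies are bypassed during the load (the user name here is illustrative):

GRANT EXEMPT ACCESS POLICY TO import_admin;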


Stay Tuned for the Final Part!

With Part 3 on impdp, you've now mastered the essential import techniques and troubleshooting strategies for Oracle Data Pump. In the final part of this series, we’ll dive into Performance Tuning Techniques for expdp and impdp, sharing advanced tips to make your Data Pump operations faster and more efficient.

Stay tuned to unlock the secrets of high-performance Data Pump jobs, coming soon!

Happy Learning :)


