About SSIS
Ankita Sharda
Senior Human Resource Executive || Hired for Adidas & KPMG (ex-clients) | MBA, Talent Acquisition- Corporate HR.
What is SSIS?
SQL Server Integration Service (SSIS) is a component of the Microsoft SQL Server database software that can be used to execute a wide range of data migration tasks. SSIS is a fast & flexible data warehousing tool used for data extraction, loading and transformation like cleaning, aggregating, merging data, etc.
It makes it easy to move data from one database to another database. SSIS can extract data from a wide variety of sources like SQL Server databases, Excel files, Oracle and DB2 databases, etc.
SSIS also includes graphical tools & wizards for performing workflow functions like sending email messages, FTP operations, data sources, and destinations.
- SSIS tool helps you to merge data from various data stores
- Automates Administrative Functions and Data Loading
- Populates Data Marts & Data Warehouses
- Helps you to clean and standardize data
- Building BI into a Data Transformation Process
- Automating Administrative Functions and Data Loading
- SIS contains a GUI that helps users to transform data easily rather than writing large programs
- It can load millions of rows from one data source to another in very few minutes
- Identifying, capturing, and processing data changes
- Coordinating data maintenance, processing, or analysis
- SSIS eliminates the need of hardcore programmers
- SSIS offers robust error and event handling
Following are components of SSIS architecture:
- Control Flow (Stores containers and Tasks)
- Data Flow (Source, Destination, Transformations)
- Event Handler (sending of messages, Emails)
- Package Explorer (Offers a single view for all in package)
- Parameters (User Interaction)
Let's understand each component in detail:
1. Control Flow
Control flow is a brain of SSIS package. It helps you to arranges the order of execution for all its components. The components contain containers and tasks which are managed by precedence constraints.
2. Precedence Constraints
Precedence constrain are package component which direct tasks to execute in a predefined order. It also defines the workflow of the entire SSIS package. It controls the execution of the two linked tasks by executing the destination tasks based on the result of the earlier task — business rules which are defined using special expressions.
3. Task
A 'Task' is an individual unit of work. It is the same as a method/function used in a programming language. However, in SSIS, you don't use coding methods. Instead, you will use drag & drop technique to design surface and to configure them.
4. Containers
The container is units for grouping tasks together into units of work. Apart from offering visual consistency, it also allows you to declare variables and event handlers which should be in the scope of that specific container.
Four types of containers in SSIS are:
- A Sequence Container
- A For Loop Container
- Foreach Loop Container
Sequence Container: allows you to organize subsidiary tasks by grouping them, and allows you to you apply transactions or assign logging to the container.
For loop container:Provides the same functionality as the sequence Container except that it also lets you run the tasks multiple times. However, it is based on an evaluation condition, like a looping from 1 to 100.
For each Loop Container: It also allows looping. But the difference that instead of using a condition expression, loop s done over a set of objects, likes files in a folder.
5. Data Flow
The main use of the SSIS tool is to extract data into the server's memory, transform it, and write it to another destination. If Control Flow is the brain, Data Flow is the heart of SSIS
6. SSIS Packages
Another core component of SSIS is the notion of a package. It is a collection of tasks which execute in an orderly fashion. Here, president constraints help manage the order in which the task will execute.
A package can help you to saves files onto a SQL Server, in the msdb or package catalog database. It can save as a .dtsx file, which is a structured file very similar to .rdl files are to Reporting Services.