SSIS
Darshika Srivastava
Associate Project Manager @ HuQuo | MBA, Amity Business School
SSIS Tutorial
This SSIS tutorial covers basic and advanced concepts of SQL Server Integration Services. It is designed for beginners and professionals.
SQL Server Integration Services is a fast and flexible data warehousing tool used for data extraction, transformation, and loading. It makes it easy to move data between sources and destinations such as SQL Server, Oracle, and Excel files.
In this tutorial, we will discuss the following topics:
What is SSIS
What is Data Integration
Why SSIS
How SSIS Works
Requirements for SQL Server Integration Services
What is the SSIS Package
SSIS Tasks
Example of Data Flow Task
Example of Execute SQL Task
What is SSIS?
SSIS stands for SQL Server Integration Services.
It is a component of the Microsoft SQL Server database software that is used to perform a wide range of integration tasks.
It is a data warehousing tool used for extracting data, applying transformations such as cleaning, aggregating, and merging, and loading the data into another database.
SSIS also includes graphical tools and wizards, along with workflow functions such as sending email messages, performing FTP operations, and connecting to data sources.
Taken as a whole, SSIS is used for data migration across a wide range of transformation and integration tasks.
SSIS is mainly used for two functions:
Data Integration
SSIS performs data integration by combining data from multiple sources and providing unified data to users.
Workflow
A workflow can be used to carry out a sequence of steps. Sometimes a package needs to execute specific steps or follow a particular path depending on the time period, a parameter passed to the package, or data queried from the database. Workflows can also automate the maintenance of SQL Server databases and update multidimensional analytical data.
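The idea of a workflow choosing its path from the time period, a package parameter, or queried data can be sketched in plain Python. This is only an illustration of the branching logic, not SSIS itself; the function name, the parameter names, and the step names are all hypothetical.

```python
from datetime import date


def choose_steps(run_date: date, full_reload: bool, pending_rows: int) -> list[str]:
    """Return the ordered steps a (hypothetical) package would run."""
    steps = []
    # Path chosen from a parameter passed to the package.
    if full_reload:
        steps.append("truncate_target")
    # Path chosen from data queried from the database.
    if pending_rows > 0:
        steps.append("load_rows")
    else:
        steps.append("skip_load")
    # Path chosen from the time period: extra work on the first of the month.
    if run_date.day == 1:
        steps.append("rebuild_monthly_summary")
    return steps


print(choose_steps(date(2024, 3, 1), full_reload=False, pending_rows=10))
```

In SSIS the same kind of decision is drawn graphically in the package rather than written as code.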
What is Data Integration?
Data Integration is the process of integrating data from multiple sources. The data can be heterogeneous or homogeneous, and it can be structured, semi-structured, or unstructured. In Data Integration, data from dissimilar sources is combined to form meaningful data.
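As a minimal sketch of combining two dissimilar sources into one unified view, the Python below joins a flat file (CSV) with a JSON feed. The sample data, field names, and the integrate function are made up for illustration; an SSIS package would do this with its own sources and transformations.

```python
import csv
import io
import json

# Hypothetical source 1: a flat file (CSV) holding customer names.
CUSTOMERS_CSV = "customer_id,name\n1,Asha\n2,Ravi\n"

# Hypothetical source 2: a JSON feed holding individual orders.
ORDERS_JSON = (
    '[{"customer_id": 1, "total": 250.0},'
    ' {"customer_id": 2, "total": 90.5},'
    ' {"customer_id": 1, "total": 40.0}]'
)


def integrate(customers_csv: str, orders_json: str) -> list[dict]:
    """Combine both sources into one unified record per customer."""
    # Sum the order totals per customer from the JSON source.
    totals: dict[int, float] = {}
    for order in json.loads(orders_json):
        cid = int(order["customer_id"])
        totals[cid] = totals.get(cid, 0.0) + float(order["total"])

    # Attach the summed totals to each customer from the CSV source.
    unified = []
    for row in csv.DictReader(io.StringIO(customers_csv)):
        cid = int(row["customer_id"])
        unified.append(
            {"customer_id": cid, "name": row["name"], "total_spent": totals.get(cid, 0.0)}
        )
    return unified


unified_view = integrate(CUSTOMERS_CSV, ORDERS_JSON)
print(unified_view)
```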
The following methods are used to achieve data integration:
Data Modelling: In Data Modelling, you first create the data model and then perform operations on it.
Data Profiling: Data Profiling is the process of checking the available data for errors, inconsistencies, or variations. It ensures data quality, where data quality refers to the accuracy, consistency, and completeness of the data.
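A data profiling pass can be pictured as a handful of counts over the data. The sketch below assumes a small in-memory list of rows and some illustrative rules (a non-empty email, an age between 0 and 120, unique ids); none of these names or rules come from SSIS.

```python
from collections import Counter

# Hypothetical sample rows with a few deliberate quality problems.
ROWS = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},
    {"id": 2, "email": "b@example.com", "age": -5},
    {"id": 3, "email": "c@example.com", "age": None},
]


def profile(rows: list[dict]) -> dict:
    """Count missing values, out-of-range values, and duplicate ids."""
    id_counts = Counter(row["id"] for row in rows)
    return {
        "row_count": len(rows),
        # Completeness: fields that are empty or missing a value.
        "missing_email": sum(1 for r in rows if not r["email"]),
        "missing_age": sum(1 for r in rows if r["age"] is None),
        # Accuracy: values outside the assumed valid range.
        "invalid_age": sum(
            1 for r in rows if r["age"] is not None and not 0 <= r["age"] <= 120
        ),
        # Consistency: ids that appear more than once.
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
    }


report = profile(ROWS)
print(report)
```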
Advantages of Data Integration:
Reduce data complexity
Data integration reduces data complexity by managing it and streamlining connections, which makes it easy to deliver data to any system.
Data integrity
Data integrity plays a major role in data integration. It deals with cleansing and validating the data. Everyone wants high-quality, robust data, and data integration is used to achieve it. Data integration helps remove errors, inconsistencies, and duplication.
Easy data collaboration
Accessibility is part of data collaboration. It means that the data can be easily transformed, and that people can easily integrate the data into projects, share their results, and keep the data up to date.
Smarter business decisions
Data integration also helps you make smarter decisions. Integrated data supports the flow of information within a company, so that the information can be understood more easily. Integrated data is easier to use and more informative.
Why SSIS?
SSIS is used for the following reasons:
Data can be loaded in parallel to many varied destinations
SSIS combines data from multiple data sources into a single structure with a unified view. It is responsible for collecting and extracting data from multiple data sources and merging it into a single data source.
Removes the need for hard-core programmers
SSIS is a platform that can load large amounts of data, for example from Excel into a SQL Server database, using its graphical tools.
Integration with other products
SSIS provides tight integration with other Microsoft products.
Cheaper than other ETL tools
SSIS is cheaper than most other ETL tools. It holds up against other products in areas such as manageability and business intelligence.
Complex error handling within dataflows
SSIS lets you handle complex errors within a data flow. You can start and stop the data flow based on the severity of the error, and you can even send an email to the administrator when an error occurs. Once an error is resolved, you can choose the path from which the workflow resumes.
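The error handling described in the last point can be illustrated with a small Python sketch: bad rows are set aside instead of failing the whole load, and the flow stops only when the number of errors passes a threshold. The function name, the FatalDataError class, and the max_errors threshold are assumptions made for this example and are not part of SSIS.

```python
class FatalDataError(Exception):
    """Raised when too many rows fail and the data flow should stop."""


def run_data_flow(rows, max_errors=2):
    """Convert rows, set bad rows aside, and stop if errors exceed max_errors."""
    loaded = []
    rejected = []
    for row in rows:
        try:
            loaded.append({"id": int(row["id"]), "amount": float(row["amount"])})
        except (KeyError, TypeError, ValueError) as exc:
            # Low severity: keep the bad row and the reason, then continue.
            rejected.append((row, repr(exc)))
            # High severity: too many bad rows, so stop the flow.
            if len(rejected) > max_errors:
                raise FatalDataError(f"stopped after {len(rejected)} bad rows") from exc
    return loaded, rejected


loaded, rejected = run_data_flow(
    [
        {"id": "1", "amount": "10.5"},
        {"id": "x", "amount": "3"},  # id is not a number
        {"id": "3"},                 # amount is missing
    ]
)
print(len(loaded), "rows loaded,", len(rejected), "rows rejected")
```

The rejected list plays the role of an error path: it could be written to a log table or summarized in a notification to the administrator.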
How SSIS Works
We know that SSIS is a platform for two functions: data integration and workflow. Both data transformation and workflow creation are carried out using the SSIS package. An SSIS package consists of three components:
Operational data
Operational data is held in a database that integrates data from multiple sources so that additional operations can be performed on it. It is where data is housed for current operations before being sent to the data warehouse for storage, reporting, or archiving.
ETL
ETL is the most important process in the SSIS tool. ETL stands for Extract, Transform, and Load.
ETL is the process responsible for pulling data out of multiple data sources, transforming it into useful data, and then storing it in a data warehouse. The source data can be in any format, such as an XML file, a flat file, or a database file.
It also ensures that the data stored in the data warehouse is relevant, accurate, high quality, and useful to business users.
It makes the data easy to access so that the data warehouse can be used effectively and efficiently.
It also helps the organization make data-driven decisions by retrieving structured and unstructured data from multiple data sources.
ETL is a three-word concept, but it is divided into four phases:
Capture: The Capture phase is also known as the Extract phase. In this phase, the source data or metadata is picked up. The data can be in any format, such as an XML file, a flat file, or a database file.
Scrub: In this phase, the original data is checked for errors and inconsistencies with the help of some artificial intelligence techniques. In short, it verifies whether the data meets the required quality.
Transform: This is the third phase in ETL. Transformation converts the data from its original format into the required format, modelling or changing the data according to user requirements. The changes can affect the number of columns or the number of rows.
Load and index: The fourth phase is Load and index. It loads the data and validates the number of rows that have been processed. Once loading is complete, indexing is applied. Indexing helps you track the number of rows loaded into the data warehouse and check whether the data is in the correct format.
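The four phases above can be sketched end to end in Python, using an in-memory SQLite database as a stand-in for the data warehouse. Everything here is an assumption made for illustration: the sample CSV, the sales table, the function names, and the cleaning rules. The final step creates an ordinary database index and returns a row count, which only approximates the "load and index" phase described in the text.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; the second row has a non-numeric amount.
SOURCE_CSV = "id,name,amount\n1, asha ,100\n2,ravi,abc\n3,MEENA,75.5\n"


def capture(text: str) -> list[dict]:
    """Capture (extract): read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows: list[dict]) -> list[dict]:
    """Scrub: drop rows whose amount is not a valid number."""
    clean = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        clean.append(row)
    return clean


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: convert types and normalize the name format."""
    return [(int(r["id"]), r["name"].strip().title(), float(r["amount"])) for r in rows]


def load_and_index(conn: sqlite3.Connection, records: list[tuple]) -> int:
    """Load and index: insert the rows, add an index, return the row count."""
    conn.execute("CREATE TABLE sales (id INTEGER, name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.execute("CREATE INDEX idx_sales_id ON sales (id)")
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


conn = sqlite3.connect(":memory:")
records = transform(scrub(capture(SOURCE_CSV)))
loaded_rows = load_and_index(conn, records)
print(loaded_rows, "rows loaded")
```

Comparing loaded_rows with len(records) is one simple way to validate that every processed row reached the target.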
Data Warehouse
A data warehouse is a single, complete, and consistent store of data formed by combining data from multiple data sources.
Difference between Database and Data Warehouse
Is a data warehouse the same as a database? The answer is both yes and no. Both hold large volumes of data and have a similar physical representation, but a data warehouse answers complex queries faster than a database.
Requirements for SQL Server Integration Services
The following are required to use SQL Server Integration Services:
Install SQL Server
Install the SQL Server Data Tools
Follow the steps below to install the SQL Server Data Tools:
Step 1: Click on the link https://docs.microsoft.com/en-us/sql/ssdt/previous-releases-of-sql-server-data-tools-ssdt-and-ssdt-bi?view=sql-server-2017 to download the SQL Server Data Tools.
Step 2: When you click on the above link, the screen shown below appears:
In the above screen, select the version of SSDT that you want to install.
Step 3: Once the download is complete, run the downloaded file. When you run it, the screen shown below appears: