SSIS : Fetch And Import Excel Files with Dynamic Name , File Extension From Ftp Server to Sql Server with various Schema , Table Name and Columns .
Ehsan khosravi esfarjani
Azure Data Engineer | Databricks Developer | SSIS Developer | SQL Developer
I had a project some time ago that client wanted :
- transfer excel files from ftp server to local directory
Note : excel filename has dieffrent name per day like :
telecom4G + 2020_2_18 (getdate) = telecom4G2020_2_18
2. excel files must transfer to daily directory in base directory like :
Base directory : D:\FtpDir\
Daily directory : D:\FtpDir\2020_2_18
3. Excel files must be store in various sql tables with various schema and columns.
So I did it like this by ssis :
It’s our ingredients for cooking :
Sql Server 2012 or above.
SSDT 2012 or above.
We Need to create three audit tables in sql server :
- Folders
- Naming
- File Logging
Note : you must create target tables that filled by excel files.
Steps :
open sql server management studio and copy below code in new query window then execute this script :
Create database Telecom go USE [Telecom] GO CREATE TABLE [dbo].[FileLogging]( [FileName] [varchar](100) NULL, [IsExist] [bit] NULL, [IsDownload] [bit] NULL, [IsMatch] [bit] NULL [IsInsert] [bit] NULL, [FileSize] [decimal](4, 2) NULL, [CreateDate] [datetime] NULL, [ModiFiedDate] [datetime] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Folders]( [FolderName] [varchar](30) NULL, [IsExist] [bit] NULL, [IsCreate] [bit] NULL, [CreatedDate] [datetime] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Naming]( [FileName] [varchar](50) NULL, [TableName] [varchar](50) NULL, [SchemaName] [varchar](50) NULL, [FileExtention] [varchar](50) NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Telecom2G_Tbl]( [CellName] [nvarchar](255) NULL, [Distance(KM)] [float] NULL, [Tech Tag] [nvarchar](255) NULL, [Category] [nvarchar](255) NULL, [Technology] [nvarchar](255) NULL, [Date] [datetime] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Telecom3G_Tbl]( [CellName] [nvarchar](255) NULL, [Location] [nvarchar](255) NULL, [BTS Installation Type] [nvarchar](255) NULL, [Tech Tag] [nvarchar](255) NULL, [Category] [nvarchar](255) NULL, [Technology] [nvarchar](255) NULL, [Date] [datetime] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Telecom4G_Tbl]( [Distance(KM)] [float] NULL, [Tech Tag] [nvarchar](255) NULL, [Category] [nvarchar](255) NULL, [Technology] [nvarchar](255) NULL, [Date] [datetime] NULL ) ON [PRIMARY] GO INSERT [dbo].[Naming] ([FileName], [TableName], [SchemaName], [FileExtention], [sheetname]) VALUES (N’Telecom2G_Tbl’, N’Telecom2G’, N’dbo’, N’.xlsx’, NULL) INSERT [dbo].[Naming] ([FileName], [TableName], [SchemaName], [FileExtention], [sheetname]) VALUES (N’Telecom3G_Tbl’, N’Telecom3G’, N’dbo’, N’.xlsx’, NULL)
INSERT [dbo].[Naming] ([FileName], [TableName], [SchemaName], [FileExtention], [sheetname]) VALUES (N’Telecom4G_Tbl’, N’Telecom4G’, N’dbo’, N’.xlsx’, NULL)
Create SSIS Project :
Write project name and select project directory :
Create base directory that you must store excel files from ftp server
Create variables that we need for daily directory :
for DateFormat you must put below code in expression builder :
(DT_WSTR, 4) YEAR(DATEADD(“day”,-1,GETDATE())) +”_”+RIGHT(“0” + (DT_WSTR, 2) DATEPART(“MM”, DATEADD(“day”, 0, GETDATE())),2) + ”_” +RIGHT(“0” + (DT_WSTR, 2) DATEPART(“DD”, DATEADD(“day”, 0, GETDATE())),2)
Drag & drop script task in control flow and rename it to “Check Directory Existing”
open script task and open ReadWriteVariables and select all variables
for ensure Check exist daily directory folder in base directory and if not exist then create double click on "Edit Script" and paste below code :
public void Main() { string Dirpath = Dts.Variables[“User::LocalDiectoryPath”].Value.ToString() + “\\” + Dts.Variables[“User::DateFormat”].Value.ToString(); bool exists = System.IO.Directory.Exists(Dirpath); if (!exists) { System.IO.Directory.CreateDirectory(Dirpath); Dts.Variables[“User::IsCreateFolder”].Value = true; Dts.Variables[“User::IsExistFolder”].Value = false; } else { Dts.Variables[“User::IsCreateFolder”].Value = false; Dts.Variables[“User::IsExistFolder”].Value = true; } Dts.TaskResult = (int)ScriptResults.Success; } #region ScriptResults declaration /// <summary> /// This enum provides a convenient shorthand within the scope of this class for setting the /// result of the script. /// /// This code was generated automatically. /// </summary> enum ScriptResults { Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success, Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure }; #endregion }
Right Click on connection manager and click on “New OLEDB CONNECTION”
then will open configure oledb connection , you must click on “New” then appear connection manager that must fill with server name , authentication mode , database name “Telecome” and click on OK button .
drop sql task on control flow for create log record in folders table and rename to “Log Directory Existing”.
Configure properties by below values :
put this script in sql statement :
insert into Folders (foldername , IsExist , IsCreate , CreateDate) values(?,?,?,getdate())
Define new variable , name : FileName , DataType : object that must keep file names.
Drop sql task on panel and rename it to “Get FileNames List”.
Note: it use for Getting list of name files from Db that must fetch from ftp server.
put this script in sql statement :
SELECT FileName FROM Naming
Add FileNames variable to ResultSet
Define new variables as “FileName” , “TableName” , “SchemaName”,”FileExtention”.
Drop foreach loop container on panel and Configure like below pictures :
Get deltais each filename like FileExtention , SchemaName , TableName that you can achieve it by query from database with where clause that fill by file name. Drop sql task on panel and rename is to “Get Details per File” then Configure like below picture :
Put this script in sql statement :
select tablename , schemaname , fileextention from Naming where filename = ?
This step we must check exist excel file in ftp server directory , we must connect to ftp server and authenticate and ensure target excel file is exist.
define new variables : DailyFileName , UserName , Password, IsExistFileInFtp.
For declare DailyFileName default value must add below code to expression builder :
@[User::FileName]+ @[User::DateFormat]+ @[User::FileExtention]
Drop script task on control flow panel and rename it to “Check Exists File in Ftp”
open script task and open ReadWriteVariables and select DailyFileName , UserName , Password, IsExistFileInFtp variables.
click on “Edit Script” and paste below code :
public void Main() { string DailyFileName , UserName , Password; DailyFileName = Dts.Variables[“User::DailyFileName”].Value.ToString(); UserName = Dts.Variables[“User::UserName”].Value.ToString(); Password = Dts.Variables[“User::Password”].Value.ToString(); string serverpath = string.Format(“ftp://255.255.255.255/DailyReports/Today/{0}", DailyFileName); FtpWebRequest request = (FtpWebRequest)WebRequest.Create(serverpath); request.Credentials = new NetworkCredential(UserName, Password); request.UsePassive = true; request.KeepAlive = true; request.UseBinary = true; request.Method = WebRequestMethods.Ftp.DownloadFile; try { using (FtpWebResponse response = (FtpWebResponse)request.GetResponse()) { Dts.Variables[“User::IsExistFileInFtp”].Value = true; } } catch (Exception) { Dts.Variables[“User::IsExistFileInFtp”].Value = false; } Dts.TaskResult = (int)ScriptResults.Success;
}
This step we have two condition first exist daily excel file in ftp server , second not exist daily excel file in ftp server .
- if exist excel file in ftp server drop sql task in control flow panel and rename it to “Exists File in Ftp server”. we need create log in file logging table for record status of process.
paste below code in sql statement property :
insert into [FileLogging] ([FileName] ,[IsExist] ,[CreateDate] ) values(?,?,getdate())
Right click on precedence constraint line and set it like below picture :
copy below code in expression builder :
@[User::IsExistFileInFtp]==true
2. if not exist excel file in ftp server drop sql task in control flow panel and rename it to “Not Exists File in Ftp server”. we need create log in file logging table for record status of process.
paste below code in sql statement property :
insert into [FileLogging] ([FileName] ,[IsExist] ,[CreateDate] ) values(?,?,getdate())
Right click on precedence constraint line and set it like below picture :
copy below code in expression builder :
@[User::IsExistFileInFtp]==false
If exist excel file its time that transfer files from ftp server to local directory
declare three variables : DailyLocalDirectoryPath , DailyFileInFtpPath , FtpDirectoryPath .
so we need drop “FTP TASK” on control flow panel and rename it to “Fetch Files From Ftp to LocalPath” and set properties like below picture :
open “FTP TASK” and create ftp connection :
setting in file transfer tab :
if exist file in ftp server update log record in IsExist column (File Logging Table )with true value and new step will be ftp task that but transfer file from ftp to local directory. Note : when transferring file from ftp server occurs problem like( disconnect , network or timeout ) set false step status and check clause then update log record in IsDownload column (In file logging table )with false value by sql task component and move to next file in foreach loop.
if not exist file in ftp server then update log record (In file logging table) by sql task and move to next file in foreach loop.
this step drop sql task in panel and connect “ftp task” to it and set value precedence constraint with failure for when that occurs mistake in transferring file.
rename “sql task” to “Fail Transfer File From Ftp Server” and configure it like picture below
paste below code in sql statement:
update [FileLogging] set IsDownload = 0 where FileName = ?
And we arrive to last and most important step where that excel file downloaded and must insert in related table in database.
Right click on connection manager and create ADO.NET CONNECTION
Create New connection and set properties like below picture :
Rename ADO.NET CONNECTION rename “DB_Conn”.
Drop script task on panel and rename it to “Check Exist File in LocalPath and Routing”.
Double click on script task and add two variables to ReadWriteVariables :
Double click on Edit Script and past below code on it :
public void Main() { SqlConnection myADONETConnection = new SqlConnection(); myADONETConnection = (SqlConnection)(Dts.Connections[“DB_Conn”].AcquireConnection(Dts.Transaction) as SqlConnection); List<string> SqlColumnList = new List<string>(); List<string> ExcelColumnList = new List<string>(); bool IsMatched; long FileSize = 0; decimal FileSizeMB = 0; String SchemaName = Dts.Variables[“User::SchemaName”].Value.ToString(); String TableName = Dts.Variables[“User::TableName”].Value.ToString(); String FileName = Dts.Variables[“User::DailyFileName”].Value.ToString(); String FolderPath = Dts.Variables[“User::DailyLocalDirectoryPath”].Value.ToString(); String curFile = FolderPath + “\\” + FileName; String sheetname; if (File.Exists(curFile)) { // Get File Size FileSize = new System.IO.FileInfo(curFile).Length; FileSizeMB = Math.Round(Decimal.Divide(FileSize, 1048576), 1); // Update record log — IsDownload file String UpdateIsDownload = “”; UpdateIsDownload = “update FileLogging set IsDownload = “ + ‘1’ + “ , “ + “FileSize =” + FileSizeMB + “ where FileName = ‘“ + FileName + “‘“; SqlCommand Updatecmd = myADONETConnection.CreateCommand(); Updatecmd.CommandText = UpdateIsDownload; Updatecmd.ExecuteNonQuery(); //Create Connection to excel for get data String ConStr; String HDR; HDR = “YES”; ConStr = “Provider=Microsoft.ACE.OLEDB.12.0;Data Source=” + curFile + “;Extended Properties=\”Excel 12.0;HDR=” + HDR + “;IMEX=0\””; OleDbConnection cnn = new OleDbConnection(ConStr); //Get Sheet Names list cnn.Open(); DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); sheetname = “”; //Get Row Count for each sheet foreach (DataRow drSheet in dtSheet.Rows) { if (drSheet[“TABLE_NAME”].ToString().Contains(“$”)) { sheetname = drSheet[“TABLE_NAME”].ToString(); OleDbCommand oconn = new OleDbCommand(“select top 1 * from [“ + sheetname + “]”, cnn); OleDbDataAdapter adp = new OleDbDataAdapter(oconn); DataTable dt = new DataTable(); adp.Fill(dt); cnn.Close(); string ExcelHeaderColumn = “”; ExcelColumnList.Clear(); SqlColumnList.Clear(); for (int i = 0; i < dt.Columns.Count; i++) { if (i != dt.Columns.Count — 1) { ExcelHeaderColumn += “” + dt.Columns[i].ColumnName + “” + “,”; ExcelColumnList.Add(dt.Columns[i].ColumnName); } else { ExcelHeaderColumn += “” + dt.Columns[i].ColumnName + “”; ExcelColumnList.Add(dt.Columns[i].ColumnName); } } string SqlColumns = “SELECT Column_Name FROM Information_schema.Columns WHERE table_Schema = ‘“ + SchemaName + “‘“ + “ AND table_name = ‘“ + TableName + “‘“; using (SqlCommand command = myADONETConnection.CreateCommand()) { command.CommandText = SqlColumns; using (SqlDataReader reader = command.ExecuteReader()) { while (reader.Read()) { SqlColumnList.Add(reader.GetString(0)); } } } // check match sql table colmuns vs excel colmuns var DiffExcel = SqlColumnList.Except(ExcelColumnList).ToList(); if (DiffExcel.Count() == 0) { IsMatched = true; } else { IsMatched = false; } if (IsMatched == true) { // Update record log — IsMatch file String UpdateIsMatch = “”; UpdateIsMatch = “update FileLogging set IsMatch = “ + ‘1’ + “ , “ + “FileSize =” + FileSizeMB + “ where FileName = ‘“ + FileName + “‘“; SqlCommand UpdatecmdMatch = myADONETConnection.CreateCommand(); UpdatecmdMatch.CommandText = UpdateIsMatch; UpdatecmdMatch.ExecuteNonQuery(); // Get matched sheet name OleDbConnection cnn1 = new OleDbConnection(ConStr); cnn1.Open(); OleDbCommand oconn1 = new OleDbCommand(“select * from [“ + sheetname + “]”, cnn1); OleDbDataAdapter adp1 = new OleDbDataAdapter(oconn1); DataTable dt1 = new DataTable(); adp1.Fill(dt1); cnn1.Close(); using (SqlBulkCopy BC = new SqlBulkCopy(myADONETConnection)) { BC.DestinationTableName = SchemaName + “.” + TableName; foreach (var column in dt1.Columns) BC.ColumnMappings.Add(column.ToString(), column.ToString()); BC.WriteToServer(dt1); } // Update record log — IsInsert file String UpdateIsInsert = “”; UpdateIsMatch = “update FileLogging set IsInsert = “ + ‘1’ + “ , “ + “FileSize =” + FileSizeMB + “ where FileName = ‘“ + FileName + “‘“; SqlCommand UpdatecmdInsert = myADONETConnection.CreateCommand(); UpdatecmdInsert.CommandText = UpdateIsInsert; UpdatecmdInsert.ExecuteNonQuery(); } } } } Dts.TaskResult = (int)ScriptResults.Success;
}
Description of what we did : After check exist file in ftp server and transfer it in local directory must use script component , you can to see steps in below :
- update log record related file and set value in column Isdowload( In file logging table) with true.
- Get file size and update log record related excel file and set value in column FileSize in logging table.
- connecting to excel file and get file sheet names by foreach loop with clause start with $.
- in nested foreach loop store columns every sheet and store in list.
- Get sql table columns name that related to excel file and store in list.
- Compare two lists with together :
if not match with together move to next sheet.
if match with together :
- Update log record related excel file and set value IsMatch column(in file logging table )with true.
- This step occurs Bulk Insert from excel file’s sheet that is match with related sql table.
- update log record related excel file and set IsInsert column(in file logging table ) with true.
so we it done , thank you.