My SQL Server notes and latest BI enhancements in SQL Server 2017.
Parag Ravka
Lead PowerBI Engineer at Tiffany & Co. | Business Intelligence | Data Analytics | Power Platform Admin | Data Storyteller | Microsoft PowerBI certified
Hey guys, Microsoft has introduced a gamut of interesting features and BI enhancements in the latest edition, SQL Server 2017, making it a one-stop shop for all your data needs and BI work. In addition, I am sharing my notes, which cover almost all the major features, so anybody can get a good grip on the concepts in SQL Server.
Automatic database tuning, introduced by Microsoft, gives the query optimizer leverage to overcome the pitfalls of the query you are running. Also, some new string functions have been introduced, like CONCAT_WS, TRIM, and STRING_AGG (with WITHIN GROUP).
If we talk about changes in the Microsoft BI stack: in SSIS (SQL Server Integration Services), packages can now run on multiple machines; SSAS (SQL Server Analysis Services) now has better Power Query features (this engine underpins Power Pivot, pivot tables and even PowerBI); and SSRS (SQL Server Reporting Services) now has options for commenting and adding attachments.
? Creating, altering and dropping databases:
When we create a database, two files are created; this applies both to SQL Server's system-generated databases and to user-designed databases.
sample1.mdf, sample1_log.ldf
mdf - data file which contains actual data,ldf - transaction log file used to recover the database
-alter database current_name modify name = new_name
-drop database name
-deleting a database deletes both LDF and MDF files; we cannot drop a database if it is currently in use.
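A minimal sketch of the full lifecycle (the database name Sample1 is hypothetical):

```sql
CREATE DATABASE Sample1;

-- Rename the database
ALTER DATABASE Sample1 MODIFY NAME = Sample2;

-- Fails if any connection is currently using the database
DROP DATABASE Sample2;
```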
? Creating and working with tables:
Create table tablename
( ID int Not Null Primary Key,
Gender nvarchar(50) Not Null)
? Difference between char, varchar , nvarchar , nchar:
char(20): always allocates 20 characters of memory; the size of char is fixed.
varchar(20): allocates only as much memory as the stored value needs; 'John' takes only 4 bytes. It has variable length.
nchar(20): fixed length; stores 2 bytes of memory per character and supports both Unicode and non-Unicode characters.
nvarchar(20): variable length; stores 2 bytes of memory per character and supports both Unicode and non-Unicode characters.
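To make the difference concrete, here is a sketch with a hypothetical table; the comments restate the storage behaviour described above:

```sql
CREATE TABLE DataTypeDemo
(
    FixedAnsi    char(20),     -- always 20 bytes, padded with trailing spaces
    VariableAnsi varchar(20),  -- 'John' consumes only 4 bytes
    FixedUni     nchar(20),    -- always 40 bytes (2 bytes per character), Unicode
    VariableUni  nvarchar(20)  -- 'John' consumes 8 bytes, Unicode
);
```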
? Foreign Key Concept:
Foreign keys are used to enforce referential integrity. A foreign key is a column in one table that points to a primary key in another table; it prevents invalid data from being inserted into the foreign key column.
? Foreign Key Syntax:
Alter table table_name ADD constraint tablename_columnname_FK
Foreign key (foreignkey column) references primarykeytable (primarykey column)
? Default Constraint syntax:
ALTER table tablename
ADD Constraint DF_tablename_columnname
Default value for columnname
? Entity and Referential Integrity syntax:
Cascading referential integrity constraints allow you to define the actions SQL Server should take when a user attempts to delete or update a key to which existing foreign keys point.
? Check constraint:
ALTER TABLE table_name
ADD CONSTRAINT constraint_name CHECK (expression)
? Unique key constraint:
A unique constraint is used to enforce uniqueness of a column, i.e. the column shouldn't allow any duplicate values. A table can have only one primary key, so if you want to enforce uniqueness on one or more additional columns, you use a unique key constraint. A primary key does not allow nulls, whereas a unique key allows one null.
Alter table tablename
Add Constraint UQ_tablename_colname Unique(email)
? Fully qualified name:
[database-name]. [schema_name]. [table name]
? Select statement:
We can perform multiple things through select statement like selecting specific columns, distinct columns, wild card operators, and/or clause etc.
? Joins and Advanced Joins:
JOINs in SQL Server are used to retrieve data from 2 or more related tables. In general, tables are related to each other using foreign key constraints.
INNER JOIN returns only the matching rows between both tables; non-matching rows are eliminated.
LEFT JOIN returns all the matching rows + non matching rows from the left table.
RIGHT JOIN returns all the matching rows + non matching rows from the right table.
FULL OUTER JOIN returns all rows from both the left and right tables, including non-matching rows.
CROSS JOIN produces the Cartesian product of the 2 tables involved in the join
? Group by clause:
The GROUP BY clause is used to group a selected set of rows into a set of summary rows by the values of one or more columns. It is always used in conjunction with one or more aggregate functions. The WHERE clause filters rows before aggregation, whereas the HAVING clause filters groups after aggregation.
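A small sketch showing WHERE versus HAVING (table and column names are hypothetical):

```sql
SELECT Gender, SUM(Salary) AS TotalSalary
FROM employees
WHERE Salary > 1000          -- filters individual rows before grouping
GROUP BY Gender
HAVING SUM(Salary) > 5000;   -- filters whole groups after aggregation
```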
? Advanced Joins for non-matching rows:
Non matching rows from table1:
SELECT columns_list
from table1
left join table2
on table1.id = table2.id
where table1.id IS NULL
Non matching rows from table2:
SELECT columns_list
from table1
right join table2
on table1.id = table2.id
where table2.id IS NULL
Non matching rows from table1, table2:
SELECT columns_list
from table1
full join table2
on table1.id = table2.id
where table2.id IS NULL or table1.id IS NULL
? SELF JOIN clause:
Joining a table to itself is called a self join. It is typically used when rows reference other rows in the same table, e.g. an employee row pointing to its manager's id (column names here are illustrative):
Select a.name as employee, b.name as manager
from employee a
join employee b
on a.managerid = b.id
? Different ways to replace NULL values in the queries:
ISNULL(column_name, 'substituted value')
COALESCE(column_name, 'substituted value')
CASE
WHEN column_name IS NULL THEN 'no manager' ELSE column_name
END as manager
? COALESCE() function:
Returns the first NON NULL value
SELECT id, COALESCE(firstname, middlename, lastname) as name
from tablename (the column list acts as a priority order)
? Except Operator:
It basically returns unique rows from the left query that aren’t in the right query results, the number and the order of the columns must be the same in both the queries, data types must be same or compatible, it is similar to minus operator.
Select id, name from table1
Except
Select id, name from table2
? Difference between NOT IN and Except Operator:
EXCEPT filters duplicates and returns only distinct rows from the left query that aren't in the right query result, whereas NOT IN does not filter duplicates.
The EXCEPT operator expects the same number of columns in both queries, whereas NOT IN compares a single column from the outer query with a single column from the subquery.
? Intersect Operator:
Intersect operator retrieves the common records from both the left and the right query of the intersect operator, same can be done by inner join.
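For example, assuming two hypothetical tables with the same column list:

```sql
SELECT id, name FROM table1
INTERSECT
SELECT id, name FROM table2;

-- Roughly equivalent result using an inner join
SELECT DISTINCT t1.id, t1.name
FROM table1 t1
INNER JOIN table2 t2
    ON t1.id = t2.id AND t1.name = t2.name;
```

Note the join version needs DISTINCT to match INTERSECT, which always returns distinct rows.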
? Difference between Except, Intersect and Union operator:
UNION operator returns all the unique rows from both the left and right of the query. Union ALL includes the duplicates as well.
INTERSECT operator retrieves the common unique rows from both the left and right query.
EXCEPT operator returns unique rows from the left query that aren’t present in the right query results.
? UNION and UNION ALL clause:
UNION returns a duplicate row only once in the result set.
UNION ALL returns a duplicate row every time it occurs.
UNION has to perform a distinct sort to remove duplicates, which makes it slower than UNION ALL. An ORDER BY clause should be used only on the last SELECT statement in the UNION query.
UNION combines the result set of two or more select queries into a single result set which includes all the rows from all the queries in the union, whereas JOINS , retrieve data from two or more tables based on logical relationship between the tables.
In short, UNION combines rows from 2 or more tables, where JOINS combine columns from 2 or more tables.
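A quick sketch of the contrast (table names are hypothetical):

```sql
-- UNION stacks rows: both queries must have the same number and types of columns
SELECT id, name FROM customers_usa
UNION
SELECT id, name FROM customers_uk;

-- A JOIN combines columns side by side based on a relationship
SELECT c.name, o.orderdate
FROM customers_usa c
JOIN orders o ON o.customerid = c.id;
```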
? Select Into in SQL server:
Select Into statement in SQL server, selects data from one table and inserts it into a new table.
The Select Into statement copies all rows and columns from an existing table into a new table. This is extremely useful when you want to make a backup copy of an existing table.
Select * Into Employeebackup
From employees
Inner join departments
On employees.deptid = departments.deptid
? Table value parameters in SQL server:
Table valued parameter allows a table to be passed as a parameter to a stored procedure from T-SQL code, table valued parameters must be passed as read only to stored procedures, functions etc. This means you cannot perform DML operations like Insert, update, delete on a table value parameter in the body of a function, stored procedure.
Step1: Create user-defined table type.
Create type emptabletype as Table
(
Id int primary key,
Name varchar(20)
)
Step2: Use the user-defined table type as a parameter in the stored procedure.
Create Proc spInsertemployee
@emptabletype emptabletype READONLY [passing a table type variable over a stored procedure]
AS
BEGIN
Insert into employees
Select * from @emptabletype
END
Step3: Declare table variable, insert the data rows and then pass the table variable as a parameter to the stored procedure.
Declare @employeetabletype emptabletype
Insert into @employeetabletype values(1, ‘parag’)
Insert into @employeetabletype values(2, ‘ravka’)
Execute spInsertemployee @employeetabletype
? Stored Procedure Concepts:
Stored Procedure:
A stored procedure is a group of T-SQL (Transact-SQL) statements. If you have a situation where you write the same query over and over again, you can save that query as a stored procedure and call it just by its name.
Wrapping up the query in a stored procedure rather than writing it again.
CREATE PROCEDURE/PROC procedure_name
AS
BEGIN
Select c1 , c2 from tablename
END
After writing it, we can execute it in any of these ways:
"procedure_name"
"exec procedure_name"
"execute procedure_name"
? Stored Procedure with parameters:
Create Proc nameofstoredproc
@gender nvarchar(20),
@departmentid int
as
Begin
Select Name, Gender , DepartmentId from tablename
where Gender = @gender AND DepartmentId = @departmentid
End
After creating the stored procedure we run it like this:
procedurename 'charvalue', integervalue
The order of parameters matters unless we pass the parameters by name.
sp_ (prefix) should not be used in user defined stored procedures because
generally sp_ prefix is for system stored procedures.
if we want to change/drop the stored procedure we can use "alter"/"drop"
ALTER PROCEDURE procedure_name
AS
BEGIN
select name,gender from tablename order by name
END
? Encrypting our stored procedure:
ALTER PROCEDURE procedure_name
WITH Encryption
AS
BEGIN
select name,gender from tablename order by name
END
? Stored procedure with output parameters:
Create Procedure procname
@gender nvarchar(20),
@empcount int output
as
Begin
Select @empcount = COUNT(id) from tablename
where Gender = @gender
End
? Executing a stored procedure with output parameter:
Declare @employeetotal int (variable to store the output)
Execute spprocname 'value1', @employeetotal output (to specify this is an output parameter)
Print @employeetotal. If we don't specify the OUT/OUTPUT keyword when executing, the result shows NULL even though the parameter has a value.
Create proc spgetname
@Id int,
@Name nvarchar(20) output
as
Begin
Select @Name = name from tablename where id = @Id
end
? Checking definition of a system stored procedure:
sp_ (prefix for system stored procedures)
sp_helptext procedurename (shows the procedure definition)
sp_help procedurename
sp_help tablename
sp_help databasename
? Executing the stored procedure:
Declare @Name nvarchar(20)
Execute proc_name 4, @Name output
Print 'output is ' + @Name
? Differences between return status value/output parameters:
Return status value: only integer datatype, only one value; used to convey success or failure.
Output parameters: any datatype, more than one value; used to return values like name, count etc.
? Advantages of using stored procedures over ad-hoc queries/inline SQL
The same execution plan is reused for a stored procedure, which is not the case for ad-hoc queries. (The execution plan is the optimal plan a query should follow for better execution.)
Even a single extra space in an ad-hoc query causes a new execution plan to be compiled, which is not the case with stored procedures.
Network traffic increases with ad-hoc queries, as we need to send the same statements again and again; code reusability with stored procedures is very high.
? Optional parameters in Stored Procedure:
Here we are giving null as default values for parameters to make it optional
Create Proc spsearchemployee
@name nvarchar(50) = Null,
@email nvarchar(50) = Null
As
Begin
Select * from employee
Where (name =@name or @name IS NULL) AND
(email = @email or @email IS NULL)
End
After creating the above procedure, calling it without parameters will not throw an error:
Execute spsearchemployee
Parameters of a SQL server stored procedure can be made optional by specifying default values.
? Inline SQL:
Select * from (select -----) where -------;
If inline SQL is embedded in applications, we need to change it across all the applications when the query changes; a stored procedure is changed in one place.
? String functions:
lower(),upper(),ltrim(),rtrim(),reverse(),len(), left() , right()
charindex('substring_to_find', 'expression_to_search', starting_position)
substring('expression', start, length)
replicate('string_to_replicate', number_of_times)
select c1 + space(5) + c2 as aliasname from tablename
patindex = pattern index
patindex('%pattern%', expression) (returns the starting position of the first occurrence of the pattern in the specified expression)
With charindex() we cannot use wildcards, whereas patindex() provides this capability. If the specified pattern is not found, patindex() returns zero.
replace( expression , 'old_string', 'new_string')
stuff(original_expression, start, length, replacement_expression)
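A few illustrative calls on one literal string, with the values SQL Server returns:

```sql
SELECT CHARINDEX('@', 'john@example.com');          -- 5
SELECT SUBSTRING('john@example.com', 1, 4);         -- john
SELECT PATINDEX('%.com%', 'john@example.com');      -- 13
SELECT REPLACE('john@example.com', '.com', '.net'); -- john@example.net
SELECT STUFF('john@example.com', 1, 4, 'sara');     -- sara@example.com
```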
? Date functions:
GETDATE() - returns the current system date and time
ISDATE() - checks if the given value is a valid date, time or datetime. returns 1 for success , 0 for failure.
Day() - returns the 'Day number of the month' of the given date
Month() - returns the 'Month number of the year' of the given date
Year() - returns the 'Year number' of the given date
for example : select year( getdate())
Datename(Day, '2012-09-30') ----- returns 30
Datename(weekday, '2012-09-30') ---- returns Sunday
Datename(month, '2012-09-30') ---- returns September
Datepart(weekday, '2012-08-30') ---- returns 5
Dateadd(day/month/year, number, date)
Datediff(day/month/year, startdate, enddate)
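Illustrative calls against a fixed date literal:

```sql
SELECT DATENAME(weekday, '2012-09-30');             -- Sunday
SELECT DATEPART(month, '2012-09-30');               -- 9
SELECT DATEADD(day, 10, '2012-09-30');              -- 2012-10-10
SELECT DATEDIFF(day, '2012-09-01', '2012-09-30');   -- 29
```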
? CAST() and CONVERT() functions
CAST and CONVERT functions are for type casting:
CAST( column_name as desired_data_type(length))
CONVERT( desired_data_type(length) , column_name)
CONVERT( date, GETDATE())
ABS(-101.5), CEILING(), FLOOR(), POWER(), SQUARE(), SQRT(), ROUND()
? User defined functions:
In SQL Server there are 3 types of user-defined functions:
-> Scalar functions
-> Inline table- valued functions
-> Multi-statement table-valued functions
? Scalar functions:
CREATE FUNCTION function_name (@parameter datatype)
Returns return_datatype
AS
BEGIN
function body
Return return_datatype
END
Execution syntax:
Select dbo.functionname(values) returns a scalar value after execution. Scalar user-defined functions in select syntax:
Select name , dateofbirth , dbo.age(dateofbirth) from tablename;
Select name,dateofbirth,dbo.age(dateofbirth) from tablename where dbo.age(dateofbirth) > 30;
Stored procedure also can accept dateofbirth and return age, but you cannot use stored procedures in a select or where clause.
Alter function functionname - alters a function
Drop function functionname - drops a function
? Inline table- valued functions:
Inline-table value function returns a table.
Create function func_name (@gender nvarchar(10))
returns table
as
return (select statement using parameters)
? Multi-statement table valued functions:
Create function function_name()
returns @table table (ID int, name varchar(20))
AS
Begin
Insert into @table
Select id , name from tablename
return
End
There is a begin and end block here and also we can structure our tables.
? Deterministic and Non- Deterministic function:
Deterministic functions always return the same result any time they are called with a specific set of input values and given the same state of the database.
- square(), power(), sum(), avg(), count()
NON-Deterministic function:
May return a different result each time they are called with a specific set of input values, even if the database state remains the same.
getdate() and current_timestamp
We can also encrypt if we don’t want users to see our function details, we just need to add keywords "WITH ENCRYPTION".
? Global and Local Temporary tables: Temporary tables are very similar to the permanent tables, permanent tables get created in the database you specify and remain in the database permanently until we delete/drop them. On the other hand, temporary tables get created in the TempDB and are automatically deleted, when they are no longer used.
Local temporary table:
- Create table #tablename( ----)
- Insert into #tablename values (----)
Local temporary table is available only for the connection that has created the table. A local temporary table is automatically dropped, when the connection that has created it has closed.
If the user wants to explicitly drop the temporary table he can also DROP the table.
If the temporary table is created inside the stored procedure, it gets dropped automatically upon the completion of stored procedure execution.
It is also possible for different connections/users to create local temporary tables with the same name.
Global temporary table:
To create a global temporary table we have to prefix the name of the table with two pound (##) symbols.
Create table ##tablename (ID int, name nvarchar (20))
Unique properties of global temporary tables are they are visible to all the connections of the SQL server, and are only destroyed when the last connection referencing the table is closed. Multiple users across multiple connections can have local temporary tables with the same name, but a global temporary table has to be unique.
? Indexes:
Indexes are used by queries to find data from tables more quickly; indexes are created on tables and views. An index acts like the index of a book and drastically helps to reduce lookup time. If there is no index to help the query, the query engine checks every row in the table from beginning to end. This is called a table scan, which hammers performance. For example, if we are looking up the salary of an employee in a list of 10,000 employees, it is very costly to retrieve the information by scanning; with an index, SQL Server picks up the row addresses from the index and directly fetches the records from the table rather than scanning each row. This is called an index seek.
Create index ix_tablename_columnname
On tablename (columnname asc/desc)
Example index entries (key value followed by a row address):
2500 row address
3100 row address
4000 row address
5500 row address
sp_helpindex tablename (shows how the indexes on the table are defined)
Drop index tablename.indexname (drops the index)
? Clustered Indexes:
A clustered index determines the physical order of data in a table. For this reason a table can have only one clustered index. A primary key constraint creates a clustered index automatically if no clustered index already exists on the table; that means if we add data non-sequentially, SQL Server will store it sequentially by the clustered index column(s). It is quite analogous to a telephone directory. A table can have only one clustered index, but a composite clustered index can contain multiple columns.
Syntax:
Create Clustered Index tablename_col1_col2
On tablename(col1 DESC, col2 ASC)
? Non-Clustered Indexes:
A non-clustered index is like an index in a textbook. The data is stored in one place and the index in another. The index has pointers to the storage location of the data. Since the non-clustered index is stored separately from the actual data, a table can have more than one non-clustered index. Within the index itself, the keys are stored in asc/desc order, which doesn't influence the storage of the table data.
Create NonClustered Index ix_table-column on tablename(columname);
? Difference between clustered and non-clustered index:
Only one clustered index per table and more than one as non-clustered index.
A clustered index is faster than a non-clustered index, because a non-clustered index has to refer back to the table if the selected columns are not present in the index.
Clustered index determines the storage order of rows in the table, and hence doesn’t require additional disk space, but whereas a non-clustered index is stored separately from the table, additional storage space is required.
? Unique and Non-Unique Indexes:
A unique index is used to enforce uniqueness of key values in the index. By default the primary key constraint creates a unique clustered index. Both clustered and non-clustered indexes can be unique.
Unique constraint and unique index are the same, because when we add unique constraint, a unique index gets created behind the scenes.
If primary key itself is a unique index, it can’t be dropped.
Create Unique NonClustered Index
Uix_tablename_firstname_lastname
On tablename(firstname, lastname) [if, say, I want a unique combination of first and last name in my organization]
If we still want to delete it, we can do so from Object Explorer.
Alter table tablename
ADD CONSTRAINT tablename_columname
UNIQUE CLUSTERED(columnname)
Create a unique constraint when data integrity is considered.
A primary key constraint creates a unique clustered index, whereas UNIQUE constraint creates a unique non-clustered index.
A UNIQUE constraint or a UNIQUE index cannot be created on an existing table if the table already contains duplicate values in the key columns. To solve this, remove the key columns from the index definition or delete/update the duplicate values.
? Advantages and Disadvantages of Indexes:
Advantages: filtering, deleting and updating get really easy and quick with indexes.
Disadvantages: a non-clustered index takes additional disk storage space; the amount of space depends on the index and the table.
If there are too many indexes, an update to the table means all associated indexes must also be updated, which eventually takes more time.
Covering query: if all the columns you have requested in the SELECT clause are present in the index, there is no need to look up the table again. The requested columns' data can simply be returned from the index itself.
A clustered index can cover all the required columns from the table.
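A sketch of a covering index using the INCLUDE clause (table and column names are hypothetical):

```sql
CREATE NONCLUSTERED INDEX ix_employees_gender
ON employees (gender)
INCLUDE (name, salary);

-- Every requested column lives in the index, so no lookup into the table is needed
SELECT name, salary FROM employees WHERE gender = 'Male';
```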
? Views in SQL Server:
A view is nothing but a saved SQL query, a view can be considered as a virtual table. Syntax is:
Create view view_name
As
Select col1, col2
From table1
Join table2
On table1.a = table2.b
While fetching data, treat a view the same as a table. The SQL engine treats table1 and table2 as the underlying base tables and then returns the data.
We can use ”sp_helptext view_name” to see the definition.
A view does not store any data; it is just a saved SQL query, more of a virtual table. It can be used as a mechanism to provide row and column level security: we can create views restricting the columns we don't want to show, and thereby give access only to the desired set of people.
We can also prevent detailed information and show only summarized or aggregated data.
To modify views: alter view statement
To drop views: drop view viewname
? Updatable Views:
Whenever we make changes i.e. update/insert/delete the underlying base table gets updated. If a view is based on multiple tables, and if you update the view, it may not update the underlying table correctly, to correctly update a view that is based on multiple tables, instead of triggers are used.
? Indexed Views:
A standard or non-indexed view is just a stored SQL query, when we try to retrieve data from the view; the data comes from underlying base tables. So, view is just a virtual table it does not store data.
However, when you create an index on a view, the view gets materialized; this means the view is now capable of storing data. In SQL Server we call them indexed views, and in Oracle, materialized views.
Rules for making indexed views:
1) The view should be created with schema binding option
2) If an aggregate function is in the SELECT list and it references an expression and if there is a possibility for that expression to become NULL, then a replacement value should be specified.
3) If GROUP BY is used in the view, then we should use COUNT_BIG(*) in the select list.
4) When we are using base tables then it should be named in a two part fashion i.e. schemaname.tablename
Create view view_name
With Schemabinding
As
Select name,
Sum( isnull(( quantity* unitprices),0)) as totalsales,
Count_BIG(*) as totaltransactions
From table1
Join table2
On condition(joining)
Group by name;
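To actually materialize the view, the first index created on it must be a unique clustered index; a sketch with hypothetical names:

```sql
CREATE UNIQUE CLUSTERED INDEX uix_viewname_name
ON dbo.view_name (name);
```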
The COUNT aggregate function returns an int value, whereas COUNT_BIG returns a bigint.
So if your table has more rows than an int can hold and you need to return the row count, COUNT() will return an error while COUNT_BIG() will simply work in this case.
COUNT_BIG(DISTINCT expression) evaluates the expression for each row in a group and returns the number of unique, non-null values. COUNT_BIG is a deterministic function when used without the OVER and ORDER BY clauses.
? Limitations in Views:
· We cannot pass parameters to a view, but if we want to filter we can use where clause.
· We can use Inline table valued functions as a replacement to parameterized views.
Create function fnemployeedetails (@Gender nvarchar(20)) [creating the function]
Returns table
As
Return
(Select id, name, gender, departmentid
From tbemployee where Gender = @Gender)
Select * from dbo.fnemployeedetails('Male') [method of invoking the function]
We should remember to use a two-part name, that is schemaname.functionname, while invoking the function.
· Since views don’t store data therefore rules and defaults cannot be associated with views.
· The ORDER BY clause is invalid in views unless TOP or a FOR XML expression is also used.
· Cannot create views on temporary tables.
? Triggers in SQL Server:
“Event-condition-action rules”
When event occurs, check condition; if true, do action
There are basically 3 types of triggers: DML triggers, DDL triggers and Logon triggers.
Triggers are a special type of stored procedure that executes automatically in response to an event.
AFTER triggers fire after the triggering action, and INSTEAD OF triggers fire instead of the triggering action.
DML Triggers: they fire automatically in response to DML events (INSERT, UPDATE & DELETE).
DML Triggers are classified into two:
è After triggers
è Instead of triggers
Create trigger tablename_actionname
On tablename
FOR action name
As
BEGIN
Select * from inserted
END
Inserted is a pseudo-table which retains a copy of whatever has been inserted; it is valid only within the trigger scope, and if we try to access it outside, it throws an error. The new data is present in the inserted table.
Create trigger tablename_actionname
On tablename
FOR INSERT
As
BEGIN
Declare @id int
Select @id = id from inserted
Insert into table_in_which_we_want_to_insert
Values (----)
END
Deleted is the table provided by SQL Server to keep a copy of the rows that got deleted.
Create trigger tablename_actionname
On tablename
FOR DELETE
As
BEGIN
Declare @id int
Select @id = id from deleted
Insert into table_in_which_we_want_to_delete
Values (----)
END
If we want to concatenate static text with an integer or datetime value, we need to cast the value to nvarchar().
After triggers: [Also known as FOR triggers]
Here deleted table contains old data and inserted table contains new/updated data when we update through triggers
Create trigger tablename_actionname
On tablename
FOR UPDATE
As
BEGIN
Select * from deleted
Select * from inserted
END
Here we can push the data for example in audit table we are keeping a log with full statements, for that we are keeping variables here for storing old and new data.
ALTER trigger tablename_actionname
On tablename
FOR UPDATE
As
BEGIN
Declare @id int [the id can't be changed here as it is the primary key]
Declare @oldname nvarchar(20), @newname nvarchar(20)
Declare @oldsalary nvarchar(20), @newsalary nvarchar(20)
Declare @auditstring nvarchar(1000)
Select *
Into #temptable [local temporary table holding the updated rows]
From inserted
While(Exists(Select Id from #temptable)) [checks whether there are rows left in the temp table]
Begin
Set @auditstring = ''
Select top 1 @id = id, @newname = name,
@newsalary = salary
From #temptable
Select @oldname = name, @oldsalary = salary
From deleted where id = @id
[compare @oldname with @newname, @oldsalary with @newsalary, and build @auditstring]
Insert into audittable values(@auditstring)
Delete from #temptable where Id = @id
End
END
We used a loop because if, for example, we update rows for 4 Ids, the loop repeats the same logic for each row.
Instead of triggers: (Insert)
Here Deleted table is always empty and the Inserted table contains the newly inserted data.
Views based on multiple base tables are not directly updatable, because a modification would affect multiple base tables.
Create trigger trigger_name
On view/tablename
Instead of Insert (action name)
As
Begin
Select * from deleted
Select * from inserted
End
Here, as we see, the INSTEAD OF trigger actually executes the trigger statements rather than the insert statement we fire.
Create trigger trigger_name
On view/tablename
Instead of Insert (action name)
As
Begin
Declare @deptid int
Select @deptid = deptid
From departmenttable
Join inserted
On inserted.deptname = departmenttable.deptname
If(@deptid is null)
Begin
Raiserror('message', severity_level, state)
Return
End
Insert into tablename(columns)
Select columns from inserted
End
Instead of triggers: (Update)
Deleted table contains old data(before update) and inserted table contains NEW data(updated data)
As before, views based on multiple tables are not directly updatable.
Also, at times we update one column and some other column gets updated incorrectly; to avoid this issue, we can use INSTEAD OF triggers.
Here update function actually checks whether the column is updated or not.
Create trigger trigger_name
On view/tablename
Instead of Update (action name)
As
Begin
If(Update(Id)) [update function checks whether the column is updated or not]
Begin
Raiserror('Id cannot be changed', 16, 1)
Return
End
If(Update(Deptname))
Begin
Declare @deptid int
Select @deptid = deptid
From tb1department
JOIN inserted
On inserted.deptname = tb1department.Deptname
If(@deptid is NULL)
Begin
Raiserror('dept invalid', 16, 1)
Return
End
Update tb1employee set departmentid = @deptid
From inserted
Join tb1employee
On tb1employee.id = inserted.id
End
End
Instead of triggers: (Delete)
Inserted table is always empty and the Deleted table contains the rows deleted.
Create trigger trigger name
On view/table name
Instead of Delete (action name)
As
Begin
Delete tb1employee
From tb1employee
Join deleted
On tb1employee.id = deleted.id
End
? Derived table and Common Table Expressions (CTE) :
Temporary tables are stored in TempDB. Local temporary tables are visible only in the current session, and can be shared between nested stored procedure calls; global temporary tables are visible to other sessions and are destroyed when the last connection referencing the table is closed.
Table variables are just like temp tables and are also created in TempDB. The scope of a table variable is the batch, stored procedure, or statement block in which it is declared. They can be passed as parameters between procedures.
Declare @tbcount table( deptname nvarchar(20), deptid int, totalemp int)
Here the @tbcount is a variable of table type.
Derived table: Here employeecount is the derived table.
Select deptname, totalemployees
From
( select deptname, count(*) as totalemployees
From employees
Group by deptname
)
as employeecount
Where totalemployees >= 2
? Common Table Expression (CTE):
With keyword is used to prefix the CTE name followed by returned columns.
It can be thought of a temporary result set that is defined with the execution scope of statements. It is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query.
A CTE can only be referenced by a SELECT, INSERT, UPDATE, or DELETE statement that immediately follows the CTE.
Syntax:
WITH cte_name (column1, column2, …)
AS
( CTE_query)
Two combined CTEs:
WITH cte_name (column1, column2, …)
AS
( CTE_query),
Cte2_name (column1, column2, …)
AS
(CTE_query)
Select statement
UNION
Select statement [ This select statement works on both the CTE]
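A concrete sketch of the skeleton above, assuming tb1employee has gender and deptname columns (hypothetical for illustration):

```sql
With MaleEmployeeCount (deptname, empcount)
As
( Select deptname, count(*) From tb1employee Where gender = 'Male' Group by deptname ),
FemaleEmployeeCount (deptname, empcount)
As
( Select deptname, count(*) From tb1employee Where gender = 'Female' Group by deptname )
-- The final SELECT works on both CTEs
Select deptname, empcount From MaleEmployeeCount
UNION
Select deptname, empcount From FemaleEmployeeCount
```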
? Updateable Common Table Expression (CTE):
With employee_name_gender
As
(
Select Id, name, gender from tb1employee
)
Update employee_name_gender
Set Gender = ‘Female’ where Id = 1
So, if a CTE is created on one base table, then it’s possible to UPDATE the CTE, which in turn will update the underlying base table.
In this case, UPDATING employee_name_gender CTE updates tb1employee table.
If a CTE is based on more than one table, the UPDATE is allowed only as long as it affects exactly one of the underlying base tables; an UPDATE that affects more than one base table is not allowed.
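A sketch of that rule, using tb1employee and tb1department as before (the join column names are assumptions):

```sql
With employee_dept
As
(
Select e.Id, e.Name, d.Deptname
From tb1employee e
Join tb1department d On e.departmentid = d.deptid
)
-- Allowed: affects only the tb1employee base table
Update employee_dept Set Name = 'Sam' Where Id = 1

-- Not allowed: would affect columns from both base tables in one statement
-- Update employee_dept Set Name = 'Sam', Deptname = 'HR' Where Id = 1
```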
? Recursive Common Table Expression (CTE):
A CTE that references itself is a recursive CTE.
With
EmployeeCTE (Empid, Name, Mgrid, [Level])
As
(
Select empid, name, mgrid, 1
From tb1employee
Where mgrid is NULL
Union ALL
Select tb1employee.empid, tb1employee.name, tb1employee.mgrid,
EmployeeCTE.[Level] + 1
From tb1employee
Join EmployeeCTE
ON tb1employee.mgrid = EmployeeCTE.Empid
)
Select empCTE.Name, ISNULL(mgrCTE.Name, 'Boss') as Manager
, empCTE.[Level]
From EmployeeCTE empCTE
Left join EmployeeCTE mgrCTE
ON empCTE.Mgrid = mgrCTE.Empid
? Pivot Operator in SQL Server:
Pivot is a SQL Server operator that turns unique values from one column into multiple columns in the output, effectively rotating the table; it is used for cross-tab functionality.
Select salesagent, [India], [US], [UK]
From tb1productsales
PIVOT
(
SUM(salesamount)
FOR Salescountry
IN( [India] , [US] , [UK])
)
AS PivotTable
{This gives a cross-tab result: the unique country values become columns, referenced with square brackets. However, PIVOT implicitly groups by every other column in the table, which brings in unwanted columns; this can be avoided by using a derived table.}
Select salesagent, [India], [US], [UK]
From
(
Select Salesagent, salescountry, salesamount
From tb1productssale
) as sourcetable
[ here sourcetable is a derived table which selects only the required columns, leaving out the unwanted ones]
PIVOT
(
SUM(salesamount)
FOR Salescountry
IN( [India] , [US] , [UK])
)
AS PivotTable
? Error Handling in SQL Server: [ Try, catch and throw statements ]
The TRY/CATCH construct is the structured mechanism for error handling in SQL Server 2005 and later. This construct has two parts: we try executing statements in the TRY block and handle errors in the CATCH block if they occur. Therefore, the simplest error handling structure can be like this:
- TRY : Try executing statements
- CATCH : Handle the errors if they occur
· Will all statements in TRY block try to execute?
When executing statements in the TRY block, if an error occurs the flow of execution will transfer to the CATCH block. So the answer is NO!
SET NOCOUNT ON;
BEGIN TRY -- Start to try executing statements
PRINT 'Before Error!' -- Statement no1
SELECT 1 / 0; -- Statement no2
PRINT 'After Error!' -- Statement no3
END TRY -- End of trying to execute statements
BEGIN CATCH -- Start to Handle the error if occurs
PRINT 'Error occurs!' /* Handle the error */
END CATCH -- End of Handling the error if occurred
Does the CATCH part automatically handle the errors?
No. The role of the TRY/CATCH construct is just providing a mechanism to try executing SQL statements. Therefore, we need to use another construct or statements to handle the errors in the CATCH block that I explain later. For instance, the following code will try to execute a divide by zero statement. It does not automatically handle any errors. In fact, in this sample code, when an error occurs the flow control immediately transfers to the CATCH block, but in the CATCH block we do not have any statement to tell us that there was an error!
SET NOCOUNT ON;
BEGIN TRY -- Start to try executing statements
SELECT 1 / 0; -- Statement
END TRY -- End of trying to execute statements
BEGIN CATCH -- Start to Handle the error if occurs
END CATCH -- End of Handling the error if occurred
--result
In the CATCH block we can handle the error and send the error message to the application. So we need an element to show what error occurs. This element is RAISERROR. So the error handling structure could be like this:
TRY
Try executing statements
CATCH
Handle the error if occurs
RAISERROR
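Inside the CATCH block, the built-in error functions expose the details of the caught error; a minimal sketch that re-raises it with RAISERROR:

```sql
BEGIN TRY
    SELECT 1 / 0;   -- divide by zero raises an error
END TRY
BEGIN CATCH
    DECLARE @msg   nvarchar(2048) = ERROR_MESSAGE(),
            @sev   int            = ERROR_SEVERITY(),
            @state int            = ERROR_STATE();
    -- Re-raise the caught error so the application sees it
    RAISERROR(@msg, @sev, @state);
END CATCH
```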
? Transactions in SQL Server:
A transaction is a group of commands that change the data stored in a database. A transaction is treated as a single unit. A transaction ensures that, either all of the commands succeed, or none of them. If one of the commands in the transaction fails, all of the commands fail, and any data that was modified in the database is rolled back.
Steps for transaction processing:
1) Begin a transaction
2) Process database commands.
3) Check for errors.
If errors occurred,
Rollback the transaction,
Else,
Commit the transaction
If we perform a simple update statement that is not part of any transaction, the changes are made permanent immediately:
Update product_table set qty = 200 where productid = 1
By default, SQL Server's isolation level is READ COMMITTED, that is, only committed data is read.
"SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED"
Whenever the update statement is within a transaction, the changes are not made permanent until we issue a COMMIT statement. If we want to undo the changes, we can roll back the transaction.
Create Procedure sp_update_address
As
Begin
Begin Try
Begin Transaction
Update statement1
Update statement2
Commit Transaction
Print 'Transaction committed'
End Try
Begin catch
Rollback transaction
End catch
End
? ACID properties in transactions in SQL Server:
Atomic: All statements in the transaction either completed successfully or they were all rolled back. The task that the set of the operations represents is either accomplished or not, but in any case not left half done.
Consistency: All data touched by the transaction is left in a logically consistent state, For example, if stock available numbers are decremented from one table, then there has to be a related entry in the connected or associate table.
Isolated: the transaction must affect data without interfering with other concurrent transactions, or being interfered with by them, This prevents transactions from making changes to data based on uncommitted information, for example changes to a record that are subsequently rolled back, Most databases use locking to maintain transaction isolation.
It means if two users are trying to use one table, it should not allow another person while one person is already using it, and it should be locked if another transaction is processed. Locking is used to perform transaction isolation.
Durable: Once a change is made, it is permanent. If a system error or power failure occurs before a set of commands is complete, those commands are undone and the data is restored to its original state once the system begins running again.
? Subqueries in SQL Server:
1) Select id, name
From products
Where id not in (select distinct prod_id from sales)
2) Select name, (select sum(qty) from products where prodid = 2) as qtysold
From tablename;
A subquery is simply a SELECT statement that returns a single value and can be nested inside a SELECT, UPDATE, INSERT, or DELETE statement. It is also possible to nest a subquery inside another subquery. Subqueries are always enclosed in parentheses and are also called inner queries; the query containing the subquery is called the outer query. The columns of a table that is present only inside a subquery cannot be used in the SELECT list of the outer query.
? Correlated subqueries in SQL Server:
If the subquery depends on the outer query for its values, then that subquery is called a correlated subquery.
In the WHERE clause of the subquery below, the "ProductId" column gets its value from the tb1products table that is present in the outer query.
So, here the subquery is dependent on the outer query for its value, hence this subquery is a correlated subquery.
Select [Name],
(Select SUM(quantitysold) from tb1productsales
Where productId = tb1products.Id) as Totalquantity
From tb1products
Order by name
? Join vs subquery — which performs better?
In general, joins work faster than subqueries, but in reality it all depends on the execution plan that SQL Server generates, and on whether the plan cache has been cleared. SSMS offers "Include Actual Execution Plan" and "Include Client Statistics" options; we should compare with those before making a decision.
? Cursors in SQL Server:
An RDBMS works great with sets; if we want row-by-row processing, cursors can be used. A cursor is a pointer to a row.
Declare @ProductId int
Declare @Name nvarchar(30)
Declare ProductCursor CURSOR FOR
Select id, name from tb1products where id <= 1000
Open ProductCursor
Fetch Next from ProductCursor into @ProductId, @Name
(This will fetch the first record; the data of the first row will be retrieved in the above variables)
While(@@FETCH_STATUS = 0) [ @@FETCH_STATUS returns 0 as long as there are rows left to process]
Begin
Print 'message'
Fetch Next from ProductCursor into @ProductId, @Name
End
Close ProductCursor
Deallocate ProductCursor [ this basically deallocates the resources used by the cursor]
? Replacing Cursors with joins:
We can often replace a cursor with a set-based statement, using CASE WHEN expressions (the SQL equivalent of switch-case statements) to handle the per-row logic.
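A sketch of the idea, assuming a tb1products table with hypothetical price and tier columns: instead of looping with a cursor and updating one row per iteration, a single UPDATE with CASE handles every row in one pass.

```sql
Update tb1products
Set tier = Case
               When price >= 1000 Then 'Premium'
               When price >= 100  Then 'Standard'
               Else 'Budget'
           End
```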
? Listing all the tables in SQL Server by Transact-SQL:
The Object Explorer in SSMS can be used to get the list of tables in a specific database; with a query, we can achieve the same by:
Select * from Sys.tables
Select * from Information_schema.tables
Select * from Information_schema.views
Also, to get the list of different object types (XTYPE) in a database
IT- internal table
P – Stored Procedure
PK – Primary key constraint
S – System table
U – User table
V – View
Select * from sysobjects where XTYPE = 'U'
Select distinct Xtype from sysobjects [gives us different type of objects]
? Re-runnable SQL Server Scripts:
A Re-runnable script is a script that, when run more than once, will not throw errors.
To make the script below re-runnable, we:
1) Check for the existence of the table.
2) Create the table if it does not exist
3) Else print a message stating, the table already exists.
- This script is NOT re-runnable:
Use [Sample]
Create table employee
(
Id int identity primary key,
Name varchar(100) )
- This script IS re-runnable:
Use [Sample]
If not exists ( Select * from Information_schema.tables where table_name = 'employee' )
Begin
Create table employee
(
Id int identity primary key,
Name varchar(100)
)
Print 'table employee successfully created'
End
Else
Begin
Print 'table employee already exists'
End
Checking for existence:
We can also check for “object_id” if it’s null or not.
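The OBJECT_ID check looks like this (schema and table name as in the script above):

```sql
If OBJECT_ID('dbo.employee') is null
Begin
    Print 'table employee does not exist'
End
Else
Begin
    Print 'table employee exists'
End
```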
Dropping if exists:
If exists ( Select * from Information_schema.tables where table_name = 'employee' )
Drop table employee
Also check for column name and schema name:
If not exists ( Select * from Information_schema.Columns where Column_name = 'emailaddress' and table_name = 'tb1employee' and table_schema = 'dbo' )
? Merge in SQL Server:
The MERGE statement, introduced in SQL Server 2008, allows us to perform inserts, updates, and deletes in one statement. This means we no longer have to use multiple statements for insert, update, and delete.
It requires 2 tables here:
1) Source table: - Contains the changes that need to be applied to the target table.
2) Target table: - The table that requires changes. (insert, update and delete)
MERGE student_target as T
USING student_source as S
ON T.ID = S.ID
WHEN MATCHED THEN
UPDATE SET T.NAME = S.NAME
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, NAME) VALUES (S.ID, S.NAME)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
{NOT MATCHED BY TARGET means rows that are present in the source but not in the target table; these are inserted}
{NOT MATCHED BY SOURCE means rows that are present in the target but not in the source table; these records are deleted}
? Concurrent transactions:
A transaction is a group of commands that change the data stored in a database. A transaction is treated as a single unit of work.
Begin TRY
Begin Transaction
Update Accounts SET Balance = Balance – 100 where id = 1
Update Accounts SET Balance = Balance – 100 where id = 2
COMMIT TRANSACTION
PRINT 'Transaction committed'
End TRY
Begin CATCH
ROLLBACK TRANSACTION
PRINT 'Transaction rolled back'
END CATCH
There are several concurrency problems that arise when two or more transactions work on the same data at the same time:
- Dirty reads
- Lost Updates
- Non repeatable reads
- Phantom reads
SQL server transaction isolation levels
- Read uncommitted
- Read committed
- Repeatable Read
- Snapshot
- Serializable
Transaction isolation levels are provided by SQL Server to avoid the problems above; the stricter the level, the closer the behavior gets to allowing only one transaction on a given piece of data at any point in time.
? Dirty read example:
A dirty read happens when one transaction is permitted to read data that has been modified by another transaction that has not yet been committed. In most cases this does not cause a problem. However, if the first transaction is rolled back after the second has read the data, the second transaction is left holding dirty data that no longer exists.
If the isolation level is READ COMMITTED, a transaction cannot read another transaction's uncommitted data.
For example, transaction 1 sets a value to 9, and transaction 2 (running at READ UNCOMMITTED) reads 9. If transaction 1 then rolls back, restoring the value to 10, transaction 2 has performed a dirty read.
Read uncommitted transaction isolation level is the only isolation level that has dirty read side effect. This is the least restrictive of all the isolation levels. When this transaction isolation level is set, it is possible to read uncommitted or dirty data.
Another option to read dirty data is the NOLOCK table hint.
Select * from tbtable (NOLOCK) where Id = 1;
? Lost update problem:
The lost update problem happens when two transactions read and then update the same data.
We can overcome this problem by setting the isolation level to Repeatable Read:
Set Transaction Isolation level Repeatable Read
When we set this on both transactions and run transaction 1 first, starting transaction 2 produces the error below:
Transaction was deadlocked on lock resources with another process and has been chosen as the deadlock victim, rerun the transaction.
This ensures data which is read and modified by transaction1 can’t be read or modified by any other transaction.
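A two-session sketch of the lost update and its fix (the tb1products table and starting quantity are assumptions):

```sql
-- Session 1 and Session 2 both run this at roughly the same time:
Set Transaction Isolation Level Repeatable Read
Begin Transaction
    Declare @qty int
    Select @qty = qty From tb1products Where id = 1   -- both read, e.g., 10
    -- ...some processing time here...
    Update tb1products Set qty = @qty - 1 Where id = 1
Commit Transaction
-- At READ COMMITTED both commits succeed and one update is silently lost (qty = 9).
-- At REPEATABLE READ the sessions deadlock, one is chosen as the victim and
-- must rerun - so no update is silently lost.
```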
? Non-repeatable read example:
A non-repeatable read happens when one transaction reads the same data twice, and another transaction updates that data between the first and second read.
To fix the non-repeatable read problem, set transaction isolation level of Transaction1 to repeatable read, this will ensure that the data that Transaction1 has read, will be prevented from being updated or deleted elsewhere, this solves the non-repeatable read problem.
Repeatable Read, Snapshot, and Serializable are higher isolation levels which do not have this problem.
? Phantom reads example:
Basically, Phantom read happens when one transaction executes a query twice and it gets a different number of rows in the result set each time. This happens when a second transaction inserts a new row that matches the WHERE clause of the query executed by the first transaction.
Read 1: select * from employee where id between 10 and 20 {count here is 2}
The reader is busy with other work after the first read; in between, another transaction inserts a row, which increases the count to 3.
Read 2: select * from employee where id between 10 and 20 {count here is 3}
This can be corrected by setting isolation level serializable
Set Transaction isolation level serializable
"This will place a range lock on the rows between 10 and 20 here, which prevents any other transaction from inserting new rows within that range."
? Repeatable Read v/s Serializable:
Repeatable Read prevents only non-repeatable reads: it ensures that the data one transaction has read cannot be updated or deleted by other transactions. It does not, however, stop other transactions from inserting new rows, so the phantom read concurrency problem can still occur.
Serializable prevents both non-repeatable reads and phantom reads: it ensures that the data one transaction has read cannot be updated or deleted by other transactions, and that no new rows matching the query can be inserted, so it prevents both problems.
? Snapshot Isolation level:
For enabling snapshot isolation on a database we use:
Alter database database_name
SET ALLOW_SNAPSHOT_ISOLATION ON
Snapshot isolation uses row versioning: if transaction 2 reads data that transaction 1 is currently modifying, transaction 2 sees the last committed version of that data instead of being blocked.
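A two-session sketch of versioning under snapshot isolation (table and values are hypothetical):

```sql
-- Session 1
Begin Transaction
Update tb1products Set qty = 5 Where id = 1    -- not yet committed

-- Session 2 (snapshot isolation): not blocked, reads the last committed version
Set Transaction Isolation Level Snapshot
Select qty From tb1products Where id = 1       -- still sees the old value

-- Session 1
Commit Transaction                             -- new value now visible to later reads
```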
? Read committed Snapshot Isolation level:
The default behavior of the READ COMMITTED isolation level is that while one transaction is modifying a row, other transactions that want to read that row are blocked. READ_COMMITTED_SNAPSHOT changes this so that readers get the last committed version instead:
Alter database database_name
SET READ_COMMITTED_SNAPSHOT ON
? Read committed snapshot V/s Snapshot Isolation level:
1) Under Snapshot isolation, if another transaction tries to update the same data that an in-progress transaction has modified, the second transaction's update fails with an update conflict error.
Under Read Committed Snapshot, the second transaction is simply blocked; after the first transaction commits, the second transaction's update then proceeds, with no update conflicts.
2) In snapshot, it provides transaction-level read consistency. Whereas read committed provides statement level read consistency.
? SQL Server Deadlock concepts:
A deadlock occurs when two or more processes have a resource locked, and each process requests a lock on the resource that another process has already locked, neither of the transactions here can move forward, as each one is waiting for the other to release the lock.
When a deadlock occurs, SQL server will choose one of the processes as the deadlock victim and rollback that process, so the other process can move forward.
Deadlock victim: When a deadlock is detected, the database engine ends the deadlock by choosing one of the threads as the deadlock victim. The deadlock victim’s transaction is then rolled back and returns a 1205 error to the application. Rolling back the transaction of the deadlock victim releases all locks held by that transaction. This allows the other transactions to become unblocked and move forward.
Deadlock priority: SQL Server chooses as the deadlock victim the transaction that is least expensive to roll back. However, a user can specify the priority of sessions in a deadlock situation using the SET DEADLOCK_PRIORITY statement. The session with the lowest deadlock priority is chosen as the deadlock victim.
SET DEADLOCK_PRIORITY NORMAL ; by default it is normal.
Deadlock victim selection criteria:
If the priorities differ, the session with the lowest priority is chosen; if both have the same priority, the transaction that is least expensive to roll back is chosen; if both the priority and the cost are the same, the victim is chosen randomly.
Logging deadlocks: When deadlocks occur, SQL Server chooses one of the transactions as the deadlock victim and rolls it back. There are several ways in SQL Server to track down the queries that are causing deadlocks. One of the options is to use SQL Server trace flag 1222 to write the deadlock information to the SQL Server error log.
We can use the startup parameters or DBCC commands.
The -1 parameter indicates the trace flag must be set at the global level. If you omit the -1 parameter, the trace flag is set only at the session level.
DBCC Traceon(1222, -1)
Deadlocks analysis and prevention:
Deadlock victim: contains the ID of the process that was selected as the deadlock victim and killed by SQL server.
Process List: Contains the list of the processes that participated in the deadlock.
Resource list: Contains the list of the resources (database objects) owned by the processes involved in the deadlock.
We can prevent deadlocks by making all transactions access shared objects in the same order.
Deadlock error handling:
1205 is the deadlock error; we can handle it in the CATCH block, roll back, and retry the operation.
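A retry sketch for deadlock error 1205 (the tb1accounts table and the retry count are assumptions):

```sql
Declare @retry int = 1
While (@retry <= 3)
Begin
    Begin Try
        Begin Transaction
            Update tb1accounts Set balance = balance - 100 Where id = 1
            Update tb1accounts Set balance = balance + 100 Where id = 2
        Commit Transaction
        Set @retry = 4                   -- success: exit the loop
    End Try
    Begin Catch
        Rollback Transaction
        If (ERROR_NUMBER() = 1205)       -- deadlock victim: wait and retry
        Begin
            Waitfor Delay '00:00:01'
            Set @retry = @retry + 1
        End
        Else
        Begin
            Throw;                       -- not a deadlock: re-raise the error
        End
    End Catch
End
```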