A few thoughts on Microsoft SQL Server Always On Availability Groups
First, Always On is not a product or a feature. The terminology was first mentioned when Database Mirroring was introduced in 2005, and it referred primarily to the suite of features for HA & DR. “Always On” has been used interchangeably with the concept of Transparent Failover, offered first by SQL Cluster since version SQL 6.0 in 1997 (arghh), and then in 2005 by Database Mirroring in synchronous mode with a witness.
DBM fails over faster than FCI (~5 seconds compared to ~30 seconds, depending on how much recovery takes place). In other words, arguably, neither technology is “Always On”.
When I was still working for Microsoft in 2010, we had awesome plans to marry DBM with Log Shipping and spice up the relationship by offering a finer granularity of failover (compared to SQL Cluster, which fails over ALL databases at once). Remember that DBM was never READABLE, so Availability Groups were about to happen. The name HADRON was chosen.
HA for High Availability, DR for Disaster Recovery and ON for ONLINE. But in March of 2010 the real Hadron Collider near Geneva made news first, by circulating beams of protons at 3.5 trillion electron volts in both directions, so the nickname HADRON was withdrawn from our code name “Denali” (which was expected to ship as SQL 2011 before it slipped to SQL 2012) and we were back to Always On (in case you did not know it).
Notice that this paper has more than a dozen links to Microsoft Official Documentation and Blogs on the subject.
This paper assumes that you know the basic concepts of Availability Groups, including the enabling of this feature in Configuration Manager.
“Do not fight simplicity, when a simple solution is an adequate solution” FA
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
First, kudos to Microsoft for making AGs in SQL 2017 several times faster than they were in SQL 2012 and SQL 2014 on the same hardware. One of the major improvements is multithreaded parallel redo.
Ok,
At this point you have read all about Availability Groups, you are feeling invincible, and you have a grandiose plan in mind: a WSFC spanning subnets (aka a stretch cluster), FCIs in your topology (even though FCI does not support automatic failover by AGs), read-only routing on your replicas, etc.
Good. That’s doable.
Let’s consider a few things first. Please read this link about FCI on multiple subnets, and also note that with multi-subnet Windows Server Failover Clusters, introduced in Windows 2012, there is a big difference between a multi-subnet cluster and a multi-subnet cluster that can withstand a network failure correctly, as explained in this blog.
It is a great concept and looks similar to this picture
This article will turn 180 degrees from the complexity mentioned above and I will explain how we crawl before we run.
Starting with SQL Server 2017, we can deploy availability groups, referred to as AGs going forward, without a cluster (WSFC) topology, and you can even configure AGs for SQL Server running on Linux.
Simplest Scenario:
Two instances of SQL STANDARD edition (yes standard) and no WSFC cluster or FCI.
Strings attached: you cannot read from the secondary, you cannot back up from the secondary, you are limited to 2 replicas (including the primary) and to just one database, there is no automatic page repair, and there are other limitations. Still… You have a DR solution out of the box that is a big improvement over the deprecated Database Mirroring, and no cluster is required.
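A rough sketch of what a minimal two-replica, clusterless AG looks like in T-SQL is shown below. All names (AG, servers, endpoint URLs, database) are placeholders, and it assumes the database mirroring endpoints, the required logins/certificates and a full backup of the database already exist on both instances.

-- On the primary instance (all names below are placeholders)
CREATE AVAILABILITY GROUP [NoClusterAG]
    WITH (CLUSTER_TYPE = NONE)
    FOR DATABASE [MyDatabase]
    REPLICA ON
        N'SERVER1' WITH (
            ENDPOINT_URL = N'TCP://server1.mydomain.com:5022',
            AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
            FAILOVER_MODE = MANUAL,
            SEEDING_MODE = AUTOMATIC),
        N'SERVER2' WITH (
            ENDPOINT_URL = N'TCP://server2.mydomain.com:5022',
            AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
            FAILOVER_MODE = MANUAL,
            SEEDING_MODE = AUTOMATIC);

-- On the secondary instance: join the AG and let it create the automatically seeded database
ALTER AVAILABILITY GROUP [NoClusterAG] JOIN WITH (CLUSTER_TYPE = NONE);
ALTER AVAILABILITY GROUP [NoClusterAG] GRANT CREATE ANY DATABASE;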
Now, let’s take it to the next level and use SQL Enterprise edition (without any clusters involved)
This is a major upgrade in features:
Up to 8 secondary replicas, including 2 synchronous secondary replicas; “reading” options on the secondary (Read and Read-Intent only); backups from the secondary; etc.
This is a full DR solution, since you can have a synchronized copy of your data in a different location, and you can read this data and report from it 24/7, unlike Log Shipping, which only allows you to read in between restores.
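As a minimal sketch of how the readable-secondary option is set on a replica (the AG and instance names are placeholders):

-- Allow read-only connections on this replica whenever it is in the secondary role
ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON N'SERVER2\INSTANCE'
    WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
-- READ_ONLY = only connections specifying ApplicationIntent=ReadOnly are accepted
-- ALL       = any connection is accepted on the secondary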
A word about Synch versus Asynch. There is a difference in how the log flows. Choose Asynch first, monitor it, and change to Synch only after diligent testing. You can always change from Asynch to Synch when you have a planned failover: wait a few seconds for the secondary to catch up, and once the state changes from SYNCHRONIZING to SYNCHRONIZED, you FAILOVER. Again, unlike Log Shipping, you do not need to take the tail of the log and run the copy and restore jobs one last time. No data is lost. Do not forget to run ALTER DATABASE [dbname] SET HADR RESUME afterwards.
I wrote a script that you can use to perform the failover in appendix A (use it at your own risk lol)
When you first deploy this in Synch mode during business hours, it might be fine, but when you reindex your tables overnight and generate 50 GB of transaction log in a few minutes, it may be a different story. Hey… you can always create a job to change from Synch to Asynch before your maintenance window and then set it back to Synch afterwards. Why not?
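A sketch of what those job steps could run (the AG and instance names are placeholders):

-- Before the maintenance window: switch the secondary replica to asynchronous commit
ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON N'SERVER2\INSTANCE'
    WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);

-- After the maintenance window: switch it back to synchronous commit
ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON N'SERVER2\INSTANCE'
    WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);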
Attention! Think twice before you change the default option for REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT, because if the required synchronized secondaries are not available your primary replica will not be able to commit until the issue is resolved, and then: no HA, no DR, and users calling nonstop… Yikes!
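If you still decide to change it, here is a minimal sketch for SQL 2017 and later (the AG name is a placeholder), together with a quick way to check the current value:

-- Check the current setting for each availability group
SELECT name, required_synchronized_secondaries_to_commit
FROM sys.availability_groups;

-- Require at least one synchronized secondary before the primary can commit
-- (if that secondary becomes unavailable, the primary stops committing - the risk described above)
ALTER AVAILABILITY GROUP [MyAG]
    SET (REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT = 1);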
A word about READING from the secondary. Remember that when you read data, you also place locks on the data that you read (shared locks), so to avoid blocking contention, all read requests on the secondary are mapped to the snapshot transaction isolation level, even when other transaction isolation levels are explicitly set. Also, all locking hints (e.g. NOLOCK) are ignored. Row versioning adds a 14-byte overhead to each data row. Keep an eye on TEMPDB.
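One way to keep that eye on TEMPDB is the version store space DMV available in recent versions; a simple starting point (only a sketch) would be:

-- Space used by the row version store in tempdb, per source database
SELECT database_id, reserved_page_count, reserved_space_kb
FROM sys.dm_tran_version_store_space_usage;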
A word about seeding (initializing) the secondary. If you are going to let SQL Server seed the replica for you on a large database, consider using compression to decrease the transfer time; it will put more load on the CPU. Enable trace flag 9567. I used a 40 GB database and it was 2.2 times faster (95 seconds compared to 211 seconds). You can monitor the seeding progress by querying the DMVs sys.dm_hadr_automatic_seeding and sys.dm_hadr_physical_seeding_stats.
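In practice, this is roughly what that looks like (the trace flag is enabled on the primary, which sends the seeding stream):

-- On the primary: enable compression for automatic seeding (global trace flag)
DBCC TRACEON (9567, -1);

-- Monitor the seeding progress and history
SELECT * FROM sys.dm_hadr_automatic_seeding;
SELECT * FROM sys.dm_hadr_physical_seeding_stats;

-- Optionally turn the trace flag back off once seeding has completed
DBCC TRACEOFF (9567, -1);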
We talked about AGs using Enterprise edition, but we haven’t covered AUTOMATIC failover yet. For that you need a Windows cluster (WSFC). No SQL Server cluster (FCI) is required; in fact, FCI does not support automatic failover by AGs.
Note that you can still do round-robin load balancing and hide the names of your instances using a Fake Listener with the help of DNS, without any clusters involved.
If you are deploying AGs on a WSFC, the dropdown for AUTOMATIC failover becomes available, and you can have HADRON without FCI, without Log Shipping and without the deprecated DBM. You can configure your LISTENER, and your application can connect to the listener oblivious to the location where SQL Server is running.
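Creating the listener is a single T-SQL statement (or a few clicks in SSMS); the name, IP address, subnet mask and port below are placeholders:

-- Run on the primary replica: add a listener to the availability group
ALTER AVAILABILITY GROUP [MyAG]
    ADD LISTENER N'MyAGListener'
    (WITH IP ((N'10.10.10.50', N'255.255.255.0')), PORT = 1433);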
There is much to say about failing your AGs over to another site/subnet automatically when your application servers are not configured to move along with them, but you can always rely on DNS changes to redirect your incoming connections to the right place. Remember that you can have a replica in any location your server can talk to: another site, another state, even Kalamazoo (asynchronously, please).
You can also choose to have your AGs span WSFCs without stretching subnets, VLANs, etc. You can send your AG from one WSFC cluster to another WSFC cluster in another building by choosing “External Cluster”. You will not be able to fail over automatically, but you can even choose to have a SYNCHRONOUS copy of your AGs spanning WSFCs. This is way cheaper than SAN replication.
Additional thoughts
If you used automatic seeding, make sure that the database has the same owner on the secondary replica as it does on the primary replica. Use ALTER AUTHORIZATION ON DATABASE::DBname TO [NewOwner];
You can right-click the Always On High Availability folder and choose Show Dashboard.
Send emails when an alert triggers for the errors listed in the section “error numbers to filter” and include errors 35264, 35273-35276, and 41418
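A sketch of wiring one of those errors to an e-mail notification with SQL Agent (the operator name is a placeholder and must already exist; repeat for the other error numbers):

-- Alert for error 35264 (AG data movement suspended), sent to an operator by e-mail
EXEC msdb.dbo.sp_add_alert
    @name = N'AG data movement suspended (35264)',
    @message_id = 35264,
    @severity = 0,
    @include_event_description_in = 1;

EXEC msdb.dbo.sp_add_notification
    @alert_name = N'AG data movement suspended (35264)',
    @operator_name = N'DBA Team',       -- placeholder operator
    @notification_method = 1;           -- 1 = e-mail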
Like any other issue, start with Error Logs & Event Logs from both Primary & Secondary replicas. There are extended events for AGs.
Look at msdb..suspect_pages and the DMV sys.dm_hadr_auto_page_repair.
You can monitor the performance and health of your AGs by using DMVs.
Look at sys.dm_hadr_availability_replica_states together with sys.dm_os_performance_counters, and also look in sys.dm_os_wait_stats for wait_type LIKE 'HADR%'.
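For example, two starting-point queries (adjust the columns and filters to taste):

-- Health and synchronization state of every replica, as seen from this instance
SELECT ar.replica_server_name,
       ars.role_desc,
       ars.synchronization_health_desc,
       ars.operational_state_desc,
       ars.connected_state_desc
FROM sys.dm_hadr_availability_replica_states ars
JOIN sys.availability_replicas ar ON ar.replica_id = ars.replica_id;

-- Waits related to Always On
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'HADR%'
ORDER BY wait_time_ms DESC;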
Bonus topic:
Who said READ ONLY for the Secondary?
Remember when, in SQL Replication, some subscribers needed to do an occasional update, and after thousands of customer requests Microsoft released “Immediate Updating Subscribers” and “Queued Subscribers”?
For an occasional update you can still configure a Linked Server with a few clicks (3 or 4 at most), and you can INSERT/UPDATE/DELETE data and run stored procedures on the primary from your SECONDARY by using a 4-part name (including the name of the primary server & database), without the mess of DTC, Message Queues, etc.
Look, I am on the secondary: I select the max value of my LOCAL table, insert a record into the primary table from my READ-ONLY replica, and the record travels back to me in sub-second fashion:
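A minimal sketch of that demo is below. The server, database, table and column names are all placeholders, and it assumes your login has the needed permissions on the primary (or that a mapping was added with sp_addlinkedsrvlogin).

-- One-time setup on the SECONDARY: a linked server pointing to the primary instance
EXEC master.dbo.sp_addlinkedserver
    @server = N'PRIMARYSRV\SQL2017',    -- placeholder name of the primary instance
    @srvproduct = N'SQL Server';

-- Running on the SECONDARY (read-only) replica:
-- read locally, then write to the primary through the 4-part name
SELECT MAX(OrderID) AS MaxLocalOrderID
FROM MyDatabase.dbo.Orders;             -- local, read-only copy

INSERT INTO [PRIMARYSRV\SQL2017].MyDatabase.dbo.Orders (CustomerName)
VALUES (N'New customer added from the secondary');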
Thanks for Reading!
Flavio Almeida
Appendix A
Script to execute a manual failover
-- RUN the following statements on the PRIMARY replica FIRST. This is a planned FAILOVER
-- Set BOTH replicas to synchronous commit (one statement per replica; adjust the instance names)
ALTER AVAILABILITY GROUP NOClusterAG MODIFY REPLICA ON 'TSQLXXXXX\SQL2017' WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);
ALTER AVAILABILITY GROUP NOClusterAG MODIFY REPLICA ON 'TSQLXXXXX\SQL2017' WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);
-- Now run the following on the SECONDARY replica that will take over
use master
go
declare @PrimaryReplica varchar(50)
declare @NewPrimary SYSNAME
declare @NewPrimaryState VARCHAR(50)
-- get name of the primary replica and store it in @PrimaryReplica -- Will return NULL when running on Secondary
select @PrimaryReplica = isnull(R.replica_server_name, 'Not Primary')
from sys.availability_replicas R
JOIN sys.dm_hadr_database_replica_states S ON R.replica_id = s.replica_id
where S.is_primary_replica = 1
-- get state of the new secondary and store it in @NewPrimaryState
select @NewPrimaryState = synchronization_state_desc
from sys.dm_hadr_database_replica_states S
JOIN sys.availability_replicas R ON S.replica_id = R.replica_id
AND replica_server_name = 'TSQLXXXX\SQL2017' -- this is the name of the Future Primary instance
-- This SCRIPT will fail over my availability group called NOClusterAG from the primary replica to a secondary called 'TSQLXXXX\SQL2017'
IF @PrimaryReplica = @@servername
-- Make sure I am running this on the Primary Replica
BEGIN
SELECT 'You are connected to ' + @@servername
SELECT 'Please commence the manual FAILOVER from your Secondary Replica '
END
ELSE
BEGIN
SELECT 'Commencing Failover . . . '
WHILE @NewPrimaryState <> 'SYNCHRONIZED'
BEGIN
WAITFOR DELAY '00:00:02' -- wait 2 seconds
--Retrieve the state of the soon to be new Primary
select @NewPrimaryState = synchronization_state_desc
from sys.dm_hadr_database_replica_states S
JOIN sys.availability_replicas R ON S.replica_id = R.replica_id
AND replica_server_name = 'TSQLXXXXX\SQL2017'
END
-- Now we are synchronized and ready to fail over without any data loss
ALTER AVAILABILITY GROUP NOClusterAG FAILOVER;
WAITFOR DELAY '00:00:02' -- Waiting 2 seconds just because...
-- RESUME synchronization in databases
ALTER DATABASE [AdventureWorks2017] SET HADR RESUME --
-- revert back to asynch (one statement per replica; adjust the instance names)
ALTER AVAILABILITY GROUP NOClusterAG MODIFY REPLICA ON 'TSQLXXXXXX\SQL2017' WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);
ALTER AVAILABILITY GROUP NOClusterAG MODIFY REPLICA ON 'TSQLXXXXXX\SQL2017' WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);
SELECT 'Failover Successful . . . '
END