How to Monitor Database Availability Groups?
Stellar Information Technology Pvt. Ltd.
A Database Availability Group (DAG) is the core component of the Microsoft Exchange Mailbox server that provides high availability and site resilience through continuous replication and failover clustering. A DAG can contain up to 16 Exchange servers in a cluster that host a set of database copies and provide automatic database-level recovery when disaster strikes.
Although DAG member servers continuously monitor each other for database, disk, server, and network failures, administrators should still keep a close eye on the health of the database copies and member servers.
Importance of Monitoring Database Availability Groups (DAG)
Monitoring the DAG member servers for replication health, database copy status, low disk space, and similar conditions ensures that the DAG keeps working and can fail over and activate a passive mailbox database copy without issues when a server or database fails.
Failing to monitor the DAG member servers can let failures go unnoticed and prevent the DAG from providing automatic database recovery, which disrupts services and causes downtime.
Steps to Monitor Database Availability Group (DAG)
The steps below show how to check and monitor Database Availability Group (DAG) health to maintain high availability and site resilience and to avoid DAG failure. They include using the CollectOverMetrics.ps1 PowerShell script to read the DAG members' event logs and gather information about database operations.
Step 1: Assign Required Roles and Permissions
Your account must hold the management roles required to run the PowerShell cmdlets that report DAG status.
To assign the roles required, use New-ManagementRoleAssignment cmdlet in EMS:
New-ManagementRoleAssignment -Role <role-name> -User <username>
For example,
New-ManagementRoleAssignment -Role "View-only Configuration" -User "administrator"
Once the roles are assigned, you can use the PowerShell cmdlets discussed in the next step to monitor the Database Availability Group status.
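Before moving on, it can help to confirm that the assignment took effect. The following is a minimal sketch, not an official procedure: Test-RoleAssigned is a hypothetical helper name (not an Exchange cmdlet), and "administrator" is the same example account used above. It relies only on the Get-ManagementRoleAssignment cmdlet with its -Role and -RoleAssignee parameters.

```powershell
# Hypothetical helper (not an Exchange cmdlet): returns $true when at least
# one assignment of the given role exists for the given user or group.
function Test-RoleAssigned {
    param(
        [Parameter(Mandatory = $true)] [string] $RoleAssignee,
        [Parameter(Mandatory = $true)] [string] $Role
    )
    # Ask Exchange for assignments that match both the role and the assignee.
    # Wrapping the call in @() yields an empty array when nothing is returned.
    $assignments = @(Get-ManagementRoleAssignment -Role $Role -RoleAssignee $RoleAssignee)
    return ($assignments.Count -gt 0)
}
```

Run it in EMS, for example:
Test-RoleAssigned -RoleAssignee "administrator" -Role "View-only Configuration"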
Step 2: Check DAG Status
To monitor Database Availability Group status, you can use PowerShell cmdlets such as Get-MailboxDatabaseCopyStatus and Test-ReplicationHealth in the Exchange Management Shell (EMS).
These cmdlets help you monitor database copy status for a particular database or all database copies on a specific server.
Use Get-MailboxDatabaseCopyStatus Cmdlet for Monitoring DAG Database Copy Status
Get-MailboxDatabaseCopyStatus -Identity MBXDB01 | Format-List
Get-MailboxDatabaseCopyStatus -Server EXCH01 | Format-List
Get-MailboxDatabaseCopyStatus -Local | Format-List
(Get-DatabaseAvailabilityGroup) | ForEach {$_.Servers | ForEach {Get-MailboxDatabaseCopyStatus -Server $_}}
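On servers with many database copies, the full Format-List output is long, so it may be easier to show only the copies that need attention. The sketch below is one possible filter, not part of Exchange: Select-UnhealthyCopy is a hypothetical helper name, and the queue-length thresholds are assumed values that should be tuned for your environment. It reads the Name, Status, CopyQueueLength, and ReplayQueueLength properties that Get-MailboxDatabaseCopyStatus returns.

```powershell
# Hypothetical helper: passes through only database copies that look unhealthy.
function Select-UnhealthyCopy {
    param(
        [Parameter(ValueFromPipeline = $true)] $Copy,
        [int] $MaxCopyQueue = 10,    # assumed threshold, adjust as needed
        [int] $MaxReplayQueue = 50   # assumed threshold, adjust as needed
    )
    process {
        # An active copy normally reports Mounted and a passive copy Healthy.
        $statusOk = 'Mounted', 'Healthy' -contains "$($Copy.Status)"
        $queuesOk = ($Copy.CopyQueueLength -le $MaxCopyQueue) -and
                    ($Copy.ReplayQueueLength -le $MaxReplayQueue)
        if (-not ($statusOk -and $queuesOk)) {
            $Copy | Select-Object Name, Status, CopyQueueLength, ReplayQueueLength
        }
    }
}
```

For example, to list only the problem copies on one server:
Get-MailboxDatabaseCopyStatus -Server EXCH01 | Select-UnhealthyCopy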
Use Test-ReplicationHealth Cmdlet for Monitoring DAG Continuous Replication Health Status
Test-ReplicationHealth helps administrators monitor the continuous replication health and replay status of all DAG member servers. It also helps them perform other tests to monitor the quorum, cluster service, and network components' health status.
Test-ReplicationHealth -Identity EXCH01
If everything in your DAG environment is working, every check in the output should show Passed.
(Get-DatabaseAvailabilityGroup) | ForEach {$_.Servers | ForEach {Test-ReplicationHealth -Identity $_}}
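When the command is run against every DAG member, the passed checks can bury the few that did not pass. The following sketch shows one way to keep only the failing rows. Select-FailedReplicationCheck is a hypothetical helper name, and the code assumes the Server, Check, Result, and Error properties that Test-ReplicationHealth normally returns.

```powershell
# Hypothetical helper: keeps only replication checks whose result is not Passed.
function Select-FailedReplicationCheck {
    param(
        [Parameter(ValueFromPipeline = $true)] $CheckResult
    )
    process {
        # Convert Result to a string so the comparison does not depend on
        # the underlying result type.
        if ("$($CheckResult.Result)" -ne 'Passed') {
            $CheckResult | Select-Object Server, Check, Result, Error
        }
    }
}
```

For example:
Test-ReplicationHealth -Identity EXCH01 | Select-FailedReplicationCheck
An empty result means that every check on that server passed.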
Step 3: Customize Low Disk Space Threshold
Starting with Exchange 2013 SP1, the DAG monitors only the volumes that store databases and logs. By default, the low disk space threshold for these volumes is 180 GB. You can raise or lower the threshold to suit your organization's needs by adding a DWORD registry value under the Exchange Server registry key.
The steps are as follows:
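As an illustration, the DWORD value can also be added from an elevated PowerShell session on each DAG member. This is a sketch under stated assumptions: the key path and the value name SpaceMonitorLowSpaceThresholdInMB are taken from Microsoft's DAG monitoring documentation as best recalled and are not given in this article, so verify them for your Exchange version before use. The 100 GB figure is only an example.

```powershell
# ASSUMPTION: key path and value name as described in Microsoft's DAG
# monitoring documentation; confirm both for your Exchange version.
$key = 'HKLM:\SOFTWARE\Microsoft\ExchangeServer\v15\Replay\Parameters'

# Example threshold only: 100 GB expressed in MB.
$thresholdMB = 100 * 1024

# Create or overwrite the DWORD value that holds the low-space threshold.
New-ItemProperty -Path $key -Name 'SpaceMonitorLowSpaceThresholdInMB' `
    -PropertyType DWord -Value $thresholdMB -Force | Out-Null

# Read the value back to confirm what was written.
(Get-ItemProperty -Path $key).SpaceMonitorLowSpaceThresholdInMB
```

A service restart may be needed before the new threshold is used. Check the documentation for your version.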
Step 4: Use CollectOverMetrics.ps1 Script
CollectOverMetrics.ps1 is a PowerShell script that collects metrics for the databases in a DAG. The script is located in the Exchange Scripts folder.
The script reads the DAG member servers' event logs to gather the information related to database operations, such as database failovers, mounts, and moves. It stores the information in a CSV file displaying one operation per row. A separate CSV file is created for each DAG member.
.\CollectOverMetrics.ps1 -DatabaseAvailabilityGroup DAG1
.\CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -Database MailboxDatabase123,MailboxDatabase456
If you get an error saying that the script is not digitally signed, you can temporarily bypass the execution policy for the current session by running the following command in PowerShell:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
Further, you may also download and run the Get-DAGHealth.ps1 script (for Exchange 2010/2013 DAG only). This is a script created by MVPs that runs a series of health checks on the DAG and generates a more detailed report. Use the following command to execute the script and collect data.
.\Get-DAGHealth.ps1 -Detailed
Step 5: Analyze Crimson Channel Log Events
Exchange Server writes these events to crimson channels, which appear under Applications and Services Logs in Event Viewer.
Review the crimson channel logs to check and monitor the status of the Microsoft Exchange Replication service and its components, such as Active Manager, the Volume Shadow Copy Service (VSS) writer, and the TCP listener.
The steps are as follows:
- High Availability: This channel contains events related to the startup and shutdown of the Microsoft Exchange Replication service and its components. It also records database mount operations and log truncation associated with the DAG.
- MailboxDatabaseFailureItems: This channel contains events for failures that affect a database copy.
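The same channels can be read from PowerShell instead of Event Viewer. The sketch below is one way to pull recent warnings and errors. Get-DagChannelEvent is a hypothetical helper name, and the channel names Microsoft-Exchange-HighAvailability/Operational and Microsoft-Exchange-MailboxDatabaseFailureItems/Operational are assumed to match the two channels described above on your server.

```powershell
# Hypothetical helper: returns recent Critical, Error, and Warning events
# from the two DAG-related crimson channels.
function Get-DagChannelEvent {
    param(
        [string[]] $LogName = @(
            'Microsoft-Exchange-HighAvailability/Operational',
            'Microsoft-Exchange-MailboxDatabaseFailureItems/Operational'
        ),
        [int] $Hours = 24   # how far back to look
    )
    $filter = @{
        LogName   = $LogName
        Level     = 1, 2, 3   # 1 = Critical, 2 = Error, 3 = Warning
        StartTime = (Get-Date).AddHours(-$Hours)
    }
    # Get-WinEvent raises an error when no events match, so that error is
    # silenced and an empty result is returned instead.
    Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, LogName, Id, LevelDisplayName, Message
}
```

For example, to review the last 48 hours on the local server:
Get-DagChannelEvent -Hours 48 | Format-List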
To Wrap Up
By following the steps discussed in this article, you can efficiently monitor your Database Availability Group (DAG) and keep a check on the member servers, database copies, storage space, and critical Exchange services. However, you should always maintain a regular VSS backup, even after deploying a DAG. A DAG is not an alternative to backup because it only protects against database-level failure. If disaster strikes, you can use the backup to restore mailboxes to a new server.
You may also use Stellar Repair for Exchange to recover mailboxes if a DAG member server crashes or the DAG stops working due to a critical failure and a backup isn't available or is obsolete. It can help you repair the database, recover mailboxes, and restore them to a live Exchange Server or an Office 365 tenant.