Since Microsoft first introduced the Always On Availability Groups (AGs) feature in SQL Server 2012, there’s been a lot of interest in using AGs both for high availability and disaster recovery (HADR) and for offloading read-only workloads. The combination of the best features of failover clustering, the simplicity of data movement and synchronization from database mirroring, and the ability to offload read-only workloads to secondaries has given businesses a compelling reason to upgrade and leverage AGs.
But, as the saying goes, there’s no such thing as a free lunch, and there are several performance implications and considerations you must be aware of to have a successful deployment using AGs. This blog post will explore some of the considerations and look at how to plan, architect, and implement an AG with minimal latency and performance impact on the production workload.
SQL Server Version Matters
In SQL Server 2012, AGs provided an easy-to-use interface for configuration that made them an instant success. SQL Server Management Studio’s tooling was built around checking everything that could go wrong and then performing the database backups needed to seed the databases onto the secondary replica servers via BACKUP and RESTORE operations. All the challenges we used to encounter with configuring database mirroring were, for the most part, handled by SQL Server.
However, one drawback in SQL Server 2012 was that data synchronization was built on top of the database mirroring implementation, so there were still limitations on throughput that impacted systems with a high volume of data change. On SQL Server 2012, the throughput drop for some workloads could be as high as 50% as soon as a synchronous-commit replica was enabled in an AG. With AGs providing the HADR implementation behind the scenes for Azure SQL Database, the impact on performance was apparent. The SQL Server product team did a lot of work in SQL Server 2016 to address throughput issues and reduce the “chatter” between replicas for critical tasks (like flow control) to improve performance.
SQL Server 2016 had a goal of retaining 95% of the throughput of a standalone SQL Server installation when a synchronous-commit secondary was enabled, which opened up the use of AGs for high-volume transactional workloads. This means if you’re running an earlier release of SQL Server with an AG, upgrading to SQL Server 2016 or later should be a priority.
Synchronous vs. Asynchronous Commit
One of the biggest performance considerations with AGs is whether the configuration uses synchronous-commit mode for one or more replicas. Each AG can have a primary (read-write) replica and up to two synchronous-commit secondary replicas, allowing for automatic failover and zero-data-loss configurations. SQL Server 2012 supports up to four secondary replicas, and SQL Server 2014 and higher support up to eight, but only two can be configured as synchronous-commit secondary replicas (SQL Server 2019 raises this to four synchronous-commit secondaries, for five synchronous replicas in total).
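For reference, the commit mode of an existing replica is changed with a single ALTER AVAILABILITY GROUP statement. Here’s a minimal sketch, using hypothetical AG and replica names:

```sql
-- Hypothetical names ([MyAG], SQLNODE2); run on the primary replica.
-- Switch a secondary to synchronous commit so it supports zero data loss...
ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON N'SQLNODE2'
WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);

-- ...and optionally allow automatic failover (requires synchronous commit).
ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON N'SQLNODE2'
WITH (FAILOVER_MODE = AUTOMATIC);
```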
When a secondary replica is configured for synchronous commit, the following happens before a transaction can commit on the primary replica:
- The log records for the transaction must be sent from the primary replica to the secondary replica
- The log records are hardened in the secondary replica’s transaction log
- An acknowledgement message is sent back to the primary allowing the transaction commit to complete
When the time to perform this operation is significant, the primary incurs a transaction delay, essentially slowing down writes to the system.
On the primary replica, the transaction delay shows up in two ways: as threads waiting on the HADR_SYNC_COMMIT wait type, and in the SQL Server:Database Replica\Transaction Delay performance counter for each of the AG databases that are primary replicas. The value for this counter is the cumulative delay across all the active transactions committing on the primary replica.
It’s important to note that all synchronous-commit configurations will have some transaction delay associated with hardening the log records on the synchronous-commit secondary replicas, unless there are zero data modifications occurring in the database. To calculate the average delay per transaction, divide the transaction delay by the SQL Server:Database Replica\Mirrored Write Transactions/sec counter value.
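Both metrics can be pulled from DMVs rather than PerfMon. Here’s a minimal sketch; note the values in sys.dm_os_wait_stats and sys.dm_os_performance_counters are cumulative since the instance started, so the ratio is a cumulative average rather than a point-in-time one:

```sql
-- Cumulative HADR_SYNC_COMMIT waits since instance startup
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'HADR_SYNC_COMMIT';

-- Average transaction delay (ms) per mirrored write transaction, per database
SELECT d.instance_name AS [database],
       d.cntr_value    AS transaction_delay_ms,
       w.cntr_value    AS mirrored_write_transactions,
       CASE WHEN w.cntr_value > 0
            THEN d.cntr_value * 1.0 / w.cntr_value
            ELSE 0 END AS avg_delay_ms_per_transaction
FROM sys.dm_os_performance_counters AS d
JOIN sys.dm_os_performance_counters AS w
  ON w.instance_name = d.instance_name
 AND w.object_name   = d.object_name
WHERE d.object_name LIKE '%Database Replica%'
  AND d.counter_name = N'Transaction Delay'
  AND w.counter_name = N'Mirrored Write Transactions/sec';
```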
Asynchronous-commit secondary replicas don’t impose transaction delays on the primary replica, but they have other potential effects. The transaction log records on the primary replica can’t be marked for reuse until all the secondary replicas have received them. One of the key metrics to track for any availability database is the Log Send Queue size, especially for asynchronous replicas that are remote on a slower connection: if log records queue up for a lagging secondary, the log space can’t be reused, and the transaction log grows on the primary and every replica.
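The log send queue (and the redo queue) can be tracked per availability database from the primary replica. A minimal sketch:

```sql
-- Run on the primary replica; queue sizes are reported in KB.
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id)        AS [database],
       drs.synchronization_state_desc,
       drs.log_send_queue_size         AS log_send_queue_kb,
       drs.redo_queue_size             AS redo_queue_kb
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id
ORDER BY drs.log_send_queue_size DESC;
```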
Troubleshooting Synchronous Commit Latency
If an AG is experiencing synchronous-commit latency, the problem is typically related to:
- The primary replica hardening the log blocks to disk (I/O latency)
- The secondary replica hardening the log blocks to disk (I/O latency)
- The network latency between the primary and secondary
- Flow control thresholds being exceeded at either the database level or transport level as a result of one of the above
Extended events fire for each of the steps of the commit process (shown in the table below), allowing you to diagnose exactly where the bottleneck is during synchronous commit; a sketch of an event session capturing these events follows the table.
| Event Order | Location | Extended Event Firing |
| --- | --- | --- |
| 1 | Primary | hadr_log_block_group_commit |
| 2 | Primary | log_block_pushed_to_logpool |
| 3 | Primary | log_flush_start |
| 4 | Primary | hadr_capture_log_block |
| 5 | Primary | hadr_log_block_compression (if enabled) |
| 6 | Primary | hadr_log_block_send_complete |
| 7 | Primary | file_write_completed |
| 8 | Primary | log_flush_complete |
| 9 | Secondary | hadr_transport_receive_log_block_message |
| 10 | Secondary | hadr_log_block_decompression (if enabled) |
| 11 | Secondary | hadr_apply_log_block |
| 12 | Secondary | log_block_pushed_to_logpool |
| 13 | Secondary | log_flush_start |
| 14 | Secondary | file_write_completed |
| 15 | Secondary | log_flush_complete |
| 16 | Secondary | hadr_lsn_send_complete |
| 17 | Primary | hadr_receive_harden_lsn_message |
| 18 | Primary | hadr_db_commit_mgr_update_harden |
| 19 | Primary | hadr_db_commit_mgr_harden |
| 20 | Primary | hadr_log_block_group_commit |
| 21 | Primary | recovery_unit_harden_log_timestamps |
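Here’s a minimal sketch of such a session, assuming all the events in the table are available in your build (several live in the Debug channel and fire at a very high rate, so create the session on both replicas, keep it short-lived, and drop it when you’re done). The session name and target file path are placeholders:

```sql
CREATE EVENT SESSION [ag_commit_latency] ON SERVER
ADD EVENT sqlserver.hadr_log_block_group_commit,
ADD EVENT sqlserver.log_block_pushed_to_logpool,
ADD EVENT sqlserver.log_flush_start,
ADD EVENT sqlserver.hadr_capture_log_block,
ADD EVENT sqlserver.hadr_log_block_compression,
ADD EVENT sqlserver.hadr_log_block_send_complete,
ADD EVENT sqlserver.file_write_completed,
ADD EVENT sqlserver.log_flush_complete,
ADD EVENT sqlserver.hadr_transport_receive_log_block_message,
ADD EVENT sqlserver.hadr_log_block_decompression,
ADD EVENT sqlserver.hadr_apply_log_block,
ADD EVENT sqlserver.hadr_lsn_send_complete,
ADD EVENT sqlserver.hadr_receive_harden_lsn_message,
ADD EVENT sqlserver.hadr_db_commit_mgr_update_harden,
ADD EVENT sqlserver.hadr_db_commit_mgr_harden,
ADD EVENT sqlserver.recovery_unit_harden_log_timestamps
ADD TARGET package0.event_file (SET filename = N'ag_commit_latency.xel');

ALTER EVENT SESSION [ag_commit_latency] ON SERVER STATE = START;
```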
The timing between the log_flush_start and log_flush_complete events for the same log_block_id value is the latency of the log flush operation. The file_write_completed event also carries the duration in milliseconds associated with the flush. This makes determining whether I/O latency is causing the bottleneck relatively simple for either replica, without needing access to PerfMon in Windows. Determining if the transport is the bottleneck is a bit more difficult.
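Here’s a minimal sketch of that first comparison, reading the event file written by the session above and pairing log_flush_start with log_flush_complete on the same log_block_id (the file path is a placeholder, and the field name assumes the log_block_id column these events expose):

```sql
;WITH ev AS (
    SELECT x.value('(event/@name)[1]', 'varchar(64)')       AS event_name,
           x.value('(event/@timestamp)[1]', 'datetime2(7)') AS event_time,
           x.value('(event/data[@name="log_block_id"]/value)[1]', 'bigint') AS log_block_id
    FROM (SELECT CAST(event_data AS xml) AS x
          FROM sys.fn_xe_file_target_read_file(N'ag_commit_latency*.xel', NULL, NULL, NULL)) AS f
)
SELECT s.log_block_id,
       DATEDIFF(MICROSECOND, s.event_time, c.event_time) AS flush_latency_us
FROM ev AS s
JOIN ev AS c
  ON c.log_block_id = s.log_block_id
 AND c.event_name   = N'log_flush_complete'
WHERE s.event_name = N'log_flush_start'
ORDER BY flush_latency_us DESC;
```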
The primary’s hadr_log_block_send_complete event has the log_block_id associated with the message, and it correlates to the log_block_id of the secondary replica’s hadr_transport_receive_log_block_message events, allowing the timestamps to be compared to get the primary-to-secondary transport duration. However, the hadr_send_harden_lsn_message acknowledgement event sent from the secondary to the primary doesn’t carry the same log_block_id; instead, it provides the ending log_block_id value that was hardened, which is always going to be higher than the starting log_block_id being hardened.
The hadr_db_commit_mgr_update_harden event contains the prior_hardened_log_block and new_hardened_log_block values, which let you correlate it to the secondary replica’s hadr_lsn_send_complete event and the primary replica’s hadr_receive_harden_lsn_message event. Comparing the timestamps between these events separates the send latency from the acknowledge latency over the wire between the replicas.
Leveraging Distributed AGs
One of the feature enhancements related to AGs in SQL Server 2016 was the introduction of distributed AGs. Although distributed AGs can’t fail over automatically and don’t have a shared listener for automatic redirection, they allow multiple AGs to participate in a distributed fan-out design, which increases the total number of replicas that can be configured. More importantly, distributed AGs decrease the network utilization between data centers. Additionally, distributed AGs don’t require all the AG replicas to be members of the same clustering configuration, or even on the same OS, allowing heterogeneous/hybrid environments that previously weren’t possible.
From a performance standpoint, distributed AGs begin to have the most impact when there are multiple replicas in geo-distributed configurations. In a traditional AG deployment, the primary replica has to send the log records to each of the connected secondary replicas. This consumes worker threads in the HADR thread pool and increases the amount of network activity.
If the configuration has two replicas in the primary data center and two additional replicas in another data center for disaster recovery purposes, then two copies of the log records have to be sent from the primary data center to the disaster recovery data center. In a distributed AG setup, the primary replica of the AG in the primary data center sends the log records to only the primary replica of the AG in the disaster recovery data center. The primary replica of the AG in the disaster recovery data center then acts as a forwarder and sends the log records to the secondary replica in the disaster recovery data center. In this example scenario, the network utilization between data centers is reduced by half.
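Configuring this is a matter of creating a distributed AG over the two existing AGs’ listeners. Here’s a minimal sketch, run on the primary replica of the primary data center’s AG, with all names and endpoints hypothetical:

```sql
-- [AG_Primary] and [AG_DR] are existing AGs; each LISTENER_URL points at that
-- AG's database mirroring endpoint (port 5022 here is the default convention).
CREATE AVAILABILITY GROUP [DistAG]
   WITH (DISTRIBUTED)
   AVAILABILITY GROUP ON
      'AG_Primary' WITH (
         LISTENER_URL = 'tcp://ag-primary-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      ),
      'AG_DR' WITH (
         LISTENER_URL = 'tcp://ag-dr-listener.contoso.com:5022',
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
         FAILOVER_MODE = MANUAL,
         SEEDING_MODE = AUTOMATIC
      );
```

A matching ALTER AVAILABILITY GROUP [DistAG] JOIN AVAILABILITY GROUP ON statement is then run on the primary replica of the DR data center’s AG, which becomes the forwarder.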
Compression Settings
By default, AGs don’t compress the log stream for synchronous-commit replicas, because compression would introduce CPU overhead and add latency to the remote hardening of the log blocks. Compression is used for asynchronous-commit replicas by default as a means of reducing network overhead for the AG transport, since the latency of transmitting the log blocks to an asynchronous secondary doesn’t affect transaction commit completion on the primary. The compression behavior of log blocks for replicas can be modified using the following trace flags in SQL Server:
- Trace flag 1462—Disable log stream compression for asynchronous replicas
- Trace flag 9592—Enable log stream compression for synchronous replicas
Why would you want to enable log stream compression for a synchronous-commit secondary replica with an AG? If you have a zero-data loss requirement between data centers and the throughput/round-trip latency between sites is introducing transaction delay on the primary replica, it might be worth the additional CPU overhead to compress the log records before pushing them over the wire. You might also want to disable the log compression for an asynchronous replica that goes over the WAN through an accelerator device that’s already providing compression of the packets between data centers. This is because these appliances might bundle packets together before compressing them.
In addition to the log stream compression trace flags, if you use automatic seeding for AG databases, the VDI backup stream sent from the primary replica to the secondary replica (to initialize the database for joining the AG) isn’t compressed by default. This can result in a large volume of data being transmitted between the servers, but it reduces the CPU overhead of performing the seeding operation. Trace flag 9567 (enable compression of the VDI backup stream for automatic seeding) allows the automatic seeding operation to use compression.
Any of these trace flags can be turned on and off dynamically with DBCC TRACEON and DBCC TRACEOFF, or enabled as startup trace flags for the instance in SQL Server Configuration Manager.
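For example, a minimal sketch of toggling one of the flags above at runtime (the -1 argument makes the change global to the instance):

```sql
DBCC TRACEON (9592, -1);   -- enable log stream compression for synchronous replicas
DBCC TRACESTATUS (9592);   -- confirm the flag is active globally
DBCC TRACEOFF (9592, -1);  -- revert to the default behavior
```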
Summary
As you can see, there are many considerations related to AGs that affect the performance of different parts of the architecture. Although the default behaviors work best for the majority of implementations, there are cases where additional troubleshooting could warrant changes to those behaviors. Understanding how transaction overhead is potentially introduced in a synchronous-commit AG is critical for a successful end-user experience, as is understanding the impact of having remote replicas in the configuration, even if they’re asynchronous.
Sometimes the “best” design for performance might not be the easiest design for management or implementation. But with the newer features and enhancements Microsoft has provided, it’s possible to get high throughput performance and minimize the impacts of having an AG implementation.
Managing the status and health of Always On AGs can be challenging. Find out how SolarWinds® SQL Sentry can help you manage and monitor your SQL Server AG environment.
Jonathan Kehayias is a Principal Consultant with SQLskills. He is the youngest person ever to achieve the Microsoft Certified Master – SQL Server certification and has been a SQL Server MVP for many years. Jonathan is heavily involved in the SQL community, including being a top answerer on the MSDN Database Engine forum, writing deep technical content on his blog, and answering questions as @SQLPoolBoy on Twitter.