Pythian Blog: Technical Track

Magic Knobs in Replication Agents

Have you ever wondered about tuning your replication configuration to achieve better performance or you got into a situation of huge latency and were looking for an easier way to tune it. In this blog post, I will explain the very useful switches which are not always that obvious. Snapshot Agent BCPBatchSize - is the number of rows to send in a bulk copy operation. When performing a bcp in operation, the batch size is the number of rows to send to the server as one transaction, and also the number of rows that must be sent before the Distribution Agent logs a bcp progress message. When performing a bcp out operation, a fixed batch size of 1000 is used. A value of 0 indicates no message logging. MaxBCPThreads - Specifies the number of bulk copy operations that can be performed in parallel. The maximum number of threads and ODBC connections that exist simultaneously is the lesser of MaxBcpThreads or the number of bulk copy requests that appear in the synchronization transaction in the distribution database. MaxBcpThreads must have a value greater than 0 and has no hard-coded upper limit. The default is 1. Log Reader Agent MaxCmdsInTran Specifies the maximum number of statements grouped into a transaction as the Log Reader writes commands to the distribution database. Using this parameter allows the Log Reader Agent and Distribution Agent to divide large transactions (consisting of many commands) at the Publisher into several smaller transactions when applying commands at the Subscriber. This is useful to decrease the latency caused by very large transactions. It would break the single transaction into smaller chunks. Distribution Agent CommitBatchSize is the number of transactions to be issued to the Subscriber before a COMMIT statement is issued. The default is 100. Committing a set of transactions has a fixed overhead; by committing a larger number of transactions less frequently, the overhead is spread across a larger volume of data. However, the benefit of increasing this parameter drops off as the cost of applying changes is gated by other factors, such as the maximum I/O of the disk that contains the log. Additionally, there is a trade off to be considered: any failure that causes the Distribution Agent to start over must rollback and reapply a larger number of transactions. For unreliable networks, a lower value can result in fewer failures and a smaller number of transactions to rollback and reapply if a failure occurs. CommitBatchThreshold is the number of replication commands to be issued to the Subscriber before a COMMIT statement is issued. The default is 1000 MaxBCPThreads - Specifies the number of bulk copy operations that can be performed in parallel. The maximum number of threads and ODBC connections that exist simultaneously is the lesser of MaxBcpThreads or the number of bulk copy requests that appear in the synchronization transaction in the distribution database. MaxBcpThreads must have a value greater than 0 and has no hard-coded upper limit. The default is 1. UseInprocLoader Improves the performance of the initial snapshot by causing the Distribution Agent to use the BULK INSERT command when applying snapshot files to the Subscriber. This parameter is deprecated because it is not compatible with the XML data type. If you are not replicating XML data, this parameter can be used. This parameter cannot be used with character mode snapshots or non-SQL Server Subscribers. If you use this parameter, the SQL Server service account at the Subscriber must have read permissions on the directory where the snapshot .bcp data files are located. When this parameter is not used, the agent (for non-SQL Server Subscribers) or the ODBC driver loaded by the agent (for SQL Server Subscribers) reads from the files, so the security context of the SQL Server service account is not used. Alternate Snapshot Folder When you apply the snapshot over the slow network it can take a significant amount of time to complete. It would probably be a good idea to copy all of the snapshot files over using another method, like transferring the content to a USB drive and sending it across to the destination and plug in there. Once you get the files locally, you can use this parameter in the distribution agent, which will pick the files from local disk and apply it. Subscription Streams can greatly improve aggregate replication throughput. It allows multiple connections to a Subscriber to apply batches of changes in parallel, while maintaining many of the transactional characteristics present when using a single thread. If one of the connections fails to execute or commit, all connections will abort the current batch, and the agent will use a single stream to retry the failed batches. Before this retry phase completes, there can be temporary transactional inconsistencies at the Subscriber. After the failed batches are successfully committed, the Subscriber is brought back to a state of transactional consistency. Thanks for reading! Stay tuned for more replication related blog posts soon.

No Comments Yet

Let us know what you think

Subscribe by email