Merge Replication Identity Range Management - An Identity Disorder? Part 1
Those who work with Merge replication know replicating articles with identity columns need some planning, especially if the identity column is part of the primary key or if there are many subscribers. Merge replication provides automatic identity range management across replication nodes that can offset the overhead of managing them manually. This blog will briefly shed some light on identity range management and how it works. In part 2, we will explore a specific case when batch inserts can affect the identity range management.
Why Identity Columns ? Identity columns are very popular (and even touted more by DBA's) to use as Primary Keys other than GUIDs for multiple reasons, including :
- They are smaller in size ; GUIDs are four times larger than INT
- Indexes over GUIDs tend to be larger. If the GUID is clustered index then it will affected every NON-Clustered index as well. Check out those two excellent blogs about GuIDs as PK and as Clustered indexes. https://www.sqlskills.com/blogs/kimberly/category/guids/
- Because they are random, indexes fragmentation are common on indexes over GUIDs. You can reduce this by using NEWSEQUENTIALID() function but it has a security disadvantage https://technet.microsoft.com/en-us/library/ms189786.aspx
How Merge Replication Manages Identity Ranges: So you have a few articles with identity columns that you want to replicate between a publisher and few subscriptions in a merge replication setup so the subscribers will also be able to update data back. Each “node” in the replication topology should be assigned a range which the user processes will use and then replicate to other nodes. It is important, especially if identity column is part of primary key, that no duplicates happen. Let’s say we want to publish one article with identity column as a primary key to two subscribers. If we do not assign the correct ranges to each “node” here, then it’s possible that user processes will create same identity values. Once the merge agent tries to synchronize, it will insert duplicate values to the tables from other nodes and will fail if identity column is part of a primary key. If we designate that each “node” gets isolated range of identity values, then user application will write values that cannot be used elsewhere. This is simply what Automatic Identity range does! - On the publisher , the identity pool is used to assign values when adding new articles. There are two kinds of pools (ranges)
- Pool assigned to publisher and subscribers that will be used directly to insert data in these subscribers.This is controlled through @identity_range parameter of sp_addmergearticle
- If any of the subscribers are SERVER subscriptions , which can possibly republish data , they are assigned a pool (range) so that they can use their own subscribers. Usually this pool is larger than the pool in point one because it can serve multiple subscribers. This is controlled through @pub_identity_range parameter of sp_addmergearticle
@pub_identity_range : has a role when the subscribers also re-publish data because those republishing subscribers will synchronize data with their own subscribers and eventually will synchronize with the original publisher. It’s not a common practice but can happen , we won’t talk about it here.Let’s create the subscriptions and run the snapshot: [sourcecode language="sql"] use [Pub] exec sp_addmergesubscription @publication = N'Pub1', @subscriber = N'MYINSTANCE', @subscriber_db = N'sub1', @subscription_type = N'Push', @sync_type = N'Automatic', @subscriber_type = N'Global', @subscription_priority = 75, @use_interactive_resolver = N'False' Declare @instance nvarchar(1000) Set @instance = @@servername; exec sp_addmergepushsubscription_agent @publication = N'Pub1', @subscriber =@instance, @subscriber_db = N'sub1', @subscriber_security_mode = 1, @publisher_security_mode = 1; GO exec sp_addmergesubscription @publication = N'Pub1', @subscriber = N'MYINSTANCE', @subscriber_db = N'sub2', @subscription_type = N'Push', @sync_type = N'Automatic', @subscriber_type = N'Global', @subscription_priority = 75, @use_interactive_resolver = N'False' Declare @instance nvarchar(1000) Set @instance = @@servername; exec sp_addmergepushsubscription_agent @publication = N'Pub1', @subscriber =@instance, @subscriber_db = N'sub2', @subscriber_security_mode = 1, @publisher_security_mode = 1; GO -- Start snapshot agent Exec pub..sp_startpublication_snapshot 'pub1' GO Waitfor delay '00:00:15' GO [/sourcecode] Get snapshot agent output: Exec sp_MSenum_replication_agents @type = 1 GO
dbname | name | status | publisher | publisher_db | publication | start_time | time | duration | comments |
distribution | MYINSTANCE-Pub-Pub1-2 | 2 | MYINSTANCE | Pub | Pub1 | 20140118 18:05:30.750 | 20140118 18:05:40.523 | 10 | [100%] A snapshot of 1 article(s) was generated. |
Session_id | Status | StartTime | EndTime | Duration | UploadedCommands | DownloadedCommands | ErrorMessages | PercentageDone | LastMessage |
3 | 2 | 1/18/2014 18:14:42 | 1/18/2014 18:14:54 | 13 | 0 | 0 | 0 | 100 | Applied the snapshot and merged 0 data change(s) (0 insert(s),0 update(s), 0 delete(s), 0 conflict(s)). |
publication | article | subscriber | subscriber_db | range_begin | range_end | is_pub_range | next_range_begin | next_range_end |
Pub1 | tbl_pub | MYINSTANCE | Pub | 1 | 101 | 0 | 101 | 201 |
Pub1 | tbl_pub | MYINSTANCE | sub2 | 201 | 301 | 0 | 301 | 401 |
Pub1 | tbl_pub | MYINSTANCE | sub1 | 401 | 501 | 0 | 501 | 601 |
Pub1 | tbl_pub | MYINSTANCE | sub2 | 601 | 1601 | 1 | 1601 | 2601 |
Pub1 | tbl_pub | MYINSTANCE | sub1 | 2601 | 3601 | 1 | 3601 | 4601 |
Node | name | definition |
Publisher | repl_identity_range_21A0E50F_D0F6_4E43_9FE1_349C1210247F | ([col1]>=(1) AND [col1](101) AND [col1]<=(201)) |
Sub1 | repl_identity_range_21A0E50F_D0F6_4E43_9FE1_349C1210247F | ([col1]>(401) AND [col1](501) AND [col1]<=(601)) |
Sub2 | repl_identity_range_21A0E50F_D0F6_4E43_9FE1_349C1210247F | ([col1]>(201) AND [col1](301) AND [col1]<=(401)) |
publication | article | subscriber | subscriber_db | range_begin | range_end | is_pub_range | next_range_begin | next_range_end |
Pub1 | tbl_pub | MYINSTANCE | Pub | 4601 | 4701 | 0 | 4701 | 4801 |
exec sys.sp_MSrefresh_publisher_idrange '[dbo].[tbl_pub]', '95208516-4B98-451D-B264-C1F27B20449A', '21A0E50F-D0F6-4E43-9FE1-349C1210247F', 2, 1Some people complained about this, but, no dice! https://connect.microsoft.com/SQLServer/feedback/details/330476/identity-range-not-working-for-merge-replication-in-sql-server-2005
Conclusion Here we learnt few points: - Merge replication can handle identity ranges automatically. - Starting with SQL 2005 , there are two range pools assigned to each publisher and subscriber. - A republishing subscriber will have a "republishing" range to use for its own subscribers. - The size of each pool can be configured. Some recommendations here as well:
- You need to choose the datatype of your identity column carefully because you do not want the subscribers to run out of range. SMALLINT, for example, allows -2^15 (-32,768) to 2^15-1 (32,767) which is quite low for any production system while BIGINT is generous here with range of -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807), it consumes more storage though.
- When adding articles , choose a range that can cover your operations without the need to frequently assign new ones. If you choose a datatype like BIGINT then you can be a bit generous as a safe factor.
- Plan the subscribers sync before the subscribers consume all range values so that new ranges are assigned. You can monitor the ranges consumption and kick off sync manually.
- On publisher , if processes inserting data are not part of a db_owner, then you will need to run agent or manually run sp_adjustpublisheridentityrange
Share this
Previous story
← Adaptive Log File Sync: Oracle, Please Don't Do That Again
You May Also Like
These Related Stories
Vanakkam from Chennai to Bangalore
Vanakkam from Chennai to Bangalore
Jun 3, 2022
5
min read
From Cambridge to West London
From Cambridge to West London
Feb 7, 2022
6
min read
First touch penalty on AWS cloud
First touch penalty on AWS cloud
Mar 26, 2018
6
min read
No Comments Yet
Let us know what you think