I've been following the development of Tungsten Replicator for quiet some time now, and recently was fortunate enough to find the time to take a look at the product in more detail. If you haven't heard of Tungsten Replicator yet, it's an open source database replication engine that can be used to complement or completely replace native MySQL Replication. In addition to providing standard replication functionality, Tungsten Replicator introduces exciting new features such as global transaction IDs, heterogeneous replication from MySQL to Oracle and Postgres, parallel replication, and the ability to replicate from multiple masters to a single slave.
Giuseppe Maxia and
Robert Hodges have been writing some excellent blog posts on Tungsten, providing great detail on both the architecture and functionality of the product. One of Giuseppe's recent posts detailed a new simplified installation procedure so this seemed like a natural place to begin... Just as an aside, I'm a big fan of VirtualBox when it comes to testing on my laptop. I've created several base images which I select from depending on the task, and simply create a new virtual machine by cloning one of these images. I used to do this using a combination of
VBoxManage clonehd
and the VirtualBox gui, but with the release of VirtualBox 4.1 this has been simplified with the introduction of the command
clonevm
, and it can now be achieved through a simple gui wizard or by executing the following command:
VBoxManage clonevm VirtualMachineName --name NewVMName --register
Additionally VirtualBox provides functionality to take a snapshot of a virtual machine, which makes testing different scenarios and rolling back changes very simple. Anyway back to installing Tungsten.. ..I looked at Giuseppe's instructions, and began working through the list of prerequisites. The first prerequisite was to select a Unix-like operating system, using VirtualBox, I chose a 64bit install of CentOS 5.5 as my base image, and cloned and fired up three vm's. The image already had MySQL 5.5 installed, and to simplify testing I set SELinux to
permissive and turned off iptables, also for ease, I assigned each vm a hostname (
tungsten1,
tungsten2, &
tungsten3) and added the references to the /etc/hosts file. The second and third prerequisites specify some required packages, I downloaded and installed the 64bit jre rpm from
www.java.com/en/download/, and installed ruby and ruby-libs using the CentOS updates repository. Next on the list was to create a user account to install and run Tungsten. The prereq's specify the user needs sudo access, ssh access to the other hosts involved, and read access to the MySQL binary logs. For the purposes of the test, as the root user I created an additional user on each machine with access to the required group as follows:
[root@tungsten1 ~]# useradd tungsten -Gmysql [root@tungsten1 ~]# passwd tungsten
I then edited /etc/sudoers and added:
tungsten ALL=(ALL) NOPASSWD: ALL
Then logging in as the user "tungsten", I setup the ssh access:
[tungsten@tungsten1 ~]$ ssh-keygen [tungsten@tungsten1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub tungsten2 [tungsten@tungsten1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub tungsten3
Finally I created the mysql user and ensured the binary log was enabled.
mysql> create user tungstenmysql identified by 'tungsten'; mysql> grant all privileges on *.* to tungstenmysql with grant option; mysql> flush privileges; mysql> show global variables like 'log_bin'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | log_bin | ON | +---------------+-------+ 1 row in set (0.00 sec)
With the prerequisites complete, I downloaded and extracted the Tungsten Replicator installation files.
[tungsten@tungsten1 ~]$ wget https://tungsten-replicator.googlecode.com/files/tungsten-replicator-2.0.3.tar.gz [tungsten@tungsten1 ~]$ tar -xzvf tungsten-replicator-2.0.3.tar.gz
I then tried to complete the installation using the command mentioned in Giuseppe's post:
[tungsten@tungsten1 ~]$ cd tungsten-replicator-2.0.3 [tungsten@tungsten1 tungsten-replicator-2.0.3]$ ./tools/tungsten-installer \ --master-slave \ --master-host=tungsten1 \ --datasource-user=tungstenmysql \ --datasource-password=tungsten \ --service-name=rep1 \ --home-directory=/opt/continuent/ \ --cluster-hosts=tungsten1,tungsten2,tungsten3 \ --start
Take 1: The moment of truth:
[tungsten@tungsten1 ~]$ cd /opt/continuent/ [tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services
NAME VALUE ---- ----- appliedLastSeqno: -1 appliedLatency : -1.0 role : master serviceName : rep1 serviceType : unknown started : true state : OFFLINE:ERROR Finished services command...
Hmmm time for some research.. ..the Tungsten Replicator project is hosted on Google code, and documentation can be found. By taking a look through the documentation and exploring the filesystem, I found there's a replicator log called trepsvc.log that can be found in the <home-directory>/tungsten/tungsten-replicator/log/ folder. At first glance the log appears a bit noisy, but a simple grep on ERROR and then WARN showed there was an issue reading the binary log files. It seemed to be looking for mysql-bin.index even though I knew from the my.cnf I'd used, the binary logs took the format of mysqld-bin. Taking a look at the tungsten-installer tool on the project wiki here, I found the extended list of options, and saw I could specify a --datasource-log-pattern option (the default for which was as expected, mysql-bin). Additionally I can see options available to specify the --master-log-file and --master-log-pos, which would obviously be useful if setting this up in an already running production environment. Take 2: I removed the current installation and reran the install with the additional --datasource-log-pattern option.
[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/replicator stop [tungsten@tungsten1 continuent]$ rm -Rf /opt/continuent/*
mysql> drop database tungsten_rep1; mysql> reset master;
[tungsten@tungsten1 continuent]$ ~/tungsten-replicator-2.0.3/tools/tungsten-installer \ --master-slave \ --master-host=tungsten1 \ --datasource-user=tungstenmysql \ --datasource-password=tungsten \ --datasource-log-pattern=mysqld-bin \ --service-name=rep1 \ --home-directory=/opt/continuent/ \ --cluster-hosts=tungsten1,tungsten2,tungsten3 \ --start
Checking the master:
[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services
NAME VALUE ---- ----- appliedLastSeqno: 1 appliedLatency : 0.901 role : master serviceName : rep1 serviceType : local started : true state : ONLINE Finished services command...
Success! Checking the slaves:
[tungsten@tungsten2 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services
NAME VALUE ---- ----- appliedLastSeqno: -1 appliedLatency : -1.0 role : slave serviceName : rep1 serviceType : unknown started : true state : OFFLINE:ERROR Finished services command...
Hmmm time for some more research.. ..checking the log I could see there was an issue with setting the NO_ENGINE_SUBSTITUTION sql_mode, it looked like it was actually trying to set it to MODE_NO_ENGINE_SUBSTITUTION, this sounded a little buggy. I decided to search all issues listed on the Google code project site, and sure enough I found issue number
112 which described the behavior exactly - it actually looks like there's a fix been submitted already. For now, just so I could proceed with the installation, I decided to remove the NO_ENGINE_SUBSTITION sql-mode option from the my.cnf. Take 3: I removed the current installation and reran the install with the additional --datasource-log-pattern option and without NO_ENGINE_SUBSTITUTION.
[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/replicator stop [tungsten@tungsten1 continuent]$ rm -Rf /opt/continuent/*
mysql> drop database tungsten_rep1; mysql> reset master;
[tungsten@tungsten1 continuent]$ ~/tungsten-replicator-2.0.3/tools/tungsten-installer \ --master-slave \ --master-host=tungsten1 \ --datasource-user=tungstenmysql \ --datasource-password=tungsten \ --datasource-log-pattern=mysqld-bin \ --service-name=rep1 \ --home-directory=/opt/continuent/ \ --cluster-hosts=tungsten1,tungsten2,tungsten3 \ --start
Checking the master:
[tungsten@tungsten1 continuent]$ ./tungsten/tungsten-replicator/bin/trepctl services
NAME VALUE ---- ----- appliedLastSeqno: 1 appliedLatency : 0.602 role : master serviceName : rep1 serviceType : local started : true state : ONLINE Finished services command...
Success! Checking the slaves:
NAME VALUE ---- ----- appliedLastSeqno: 1 appliedLatency : 0.0 role : slave serviceName : rep1 serviceType : local started : true state : ONLINE Finished services command...
Success! Whilst it wasn't totally seamless, the installation and troubleshooting were both pretty straightforward, especially considering this was literally the first time I've looked at Tungsten in any detail. I'm now able to create tables, and insert test data on the master, and see it replicate to the slaves. Watching replication in action and seeing data actually move from a to b, is truly one of those things that's a joy to see! Having explored the basics of setting up Tungsten Replicator, in the next post, I'll look to explore some of the commands are available to administer the engine.