Pythian Blog: Technical Track

Automating Tungsten upgrades using Ansible

Continuent Tungsten is one of the few all-in-one solutions for MySQL high availability. In this post I will show you how to automate the upgrade process using Ansible. I will walk you through the individual tasks and, finally, give you the complete playbook.

We will use a rolling approach, upgrading the slaves first, and finally upgrading the former master. There is no need for a master switch, as the process is transparent to the application.

I am assuming you are using the .ini-based installation method. If you are still using the staging host method, I suggest you update your setup.

Pre tasks

The first step is ensuring the cluster is healthy, because we don't want to start taking nodes offline unless we are sure the cluster is in good shape. One way of doing that is by using the built-in script tungsten_monitor. When we run the playbook, we only need to validate the cluster status on one node, so I am adding run_once: true to limit this task to a single node.

    - name: verify cluster is healthy
      shell: "/opt/continuent/tungsten/cluster-home/bin/tungsten_monitor"
      run_once: true
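
Because this check only reads state, it can also be marked as never reporting a change. The variant below is a minimal sketch: it assumes tungsten_monitor exits with a non-zero code when it finds a problem (worth confirming for your version), and monitor_result is just an illustrative variable name.

    - name: verify cluster is healthy (read-only check)
      shell: "/opt/continuent/tungsten/cluster-home/bin/tungsten_monitor"
      run_once: true
      # A health check does not modify the host, so never report "changed"
      changed_when: false
      # Hypothetical variable name, kept for later inspection with debug
      register: monitor_result

The shell module fails the task on a non-zero exit code by default, so under that assumption the play stops here if the cluster is unhealthy.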

Next, we may want to make some parameter changes as part of the upgrade. The following tasks take care of that. The first one adds a line to the end of the file, while the second looks for a line starting with skip-validation-check and adds the new line right after it.

Notice I am using become: yes so that the tasks run as root.

    - name: Disable bridge mode for connector
      become: yes
      lineinfile:
        path: /etc/tungsten/tungsten.ini
        line: connector-bridge-mode=false
        insertafter: EOF

    - name: Disable the check for modified files
      become: yes
      lineinfile:
        path: /etc/tungsten/tungsten.ini
        line: skip-validation-check=ModifiedConfigurationFilesCheck
        insertafter: '^skip-validation-check'
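
If the file might already contain the parameter with a different value, a regexp lets lineinfile replace the existing line instead of appending a second one. This is a sketch of that variant, not part of the original playbook. It assumes the parameter, when present, starts at the beginning of a line.

    - name: Disable bridge mode for connector (replace any existing setting)
      become: yes
      lineinfile:
        path: /etc/tungsten/tungsten.ini
        # If a line matches, it is replaced; otherwise insertafter applies
        regexp: '^connector-bridge-mode='
        line: connector-bridge-mode=false
        insertafter: EOF

Running the task a second time then makes no change, which keeps the playbook safe to re-run.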

Now we need to know the current master node for the purpose of the upgrade, so we get that from cctrl and store it as a fact so we can reference it further down the road.

    - name: Capture Master Node
      become: yes
      shell: "su - tungsten -c \"echo 'ls' | cctrl \" | grep -B 5 'REPLICATOR(role=master, state=ONLINE)' | grep progress | cut -d'(' -f1 | sed 's/|//'"
      changed_when: False
      run_once: true
      register: master_node_result
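
The registered result can then be turned into a fact and sanity-checked before anything is modified. The tasks below are a rough sketch: master_node is a fact name I picked for illustration, and the trim filter is there in case the pipeline leaves stray whitespace around the hostname.

    - name: Store the current master as a fact
      set_fact:
        # master_node is a hypothetical fact name
        master_node: "{{ master_node_result.stdout | trim }}"

    - name: Stop if no master could be determined
      assert:
        that:
          - master_node | length > 0
        fail_msg: "Could not determine the current master from cctrl output"

Failing early here avoids writing an empty value into tungsten.ini if the grep pipeline returns nothing.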

The cluster might have a different master now than when it was initially provisioned, so we need to update the tungsten.ini file accordingly.

    - name: Replace current master in tungsten.ini file
      become: yes
      lineinfile:
        path: /etc/tungsten/tungsten.ini
        regexp: '^master='
        line: 'master={{ master_node_result.stdout }}'

Now we need to set downtime on the monitoring system. The delegate_to keyword is useful when you want to run a task on a different host than the one Ansible is running against.

    - name: Downtime alerts
      shell: "/usr/local/bin/downtime --host=' --service='{{item}}' --comment='Downtime services for Tungsten Upgrade'"
      with_items: [ 'Tungsten policy mode', 'Tungsten Replicator Status', 'Tungsten THL Retention' ]
      run_once: true
      delegate_to: nagios_host.example.com

Next, we set the cluster policy to maintenance to prevent any automatic operations from interfering with our upgrade.

    - name: Set Policy to Maintenance
      become: yes
      shell: "su - tungsten -c \"echo 'set policy maintenance' | cctrl\""
      run_once: true
      register: maintenance_results

    - debug: var=maintenance_results.stdout_lines

Upgrade process

Now we are ready to start the upgrade on the slaves. The approach I am taking here is to copy the rpm file from my local machine to the servers and install using yum. The tungsten rpm package does everything automatically, provided the tungsten.ini file exists.

    - name: copy Tungsten packages
      copy:
        src: /vagrant/tungsten
        dest: /tmp

    - name: Install Continuent Tungsten rpm
      become: yes
      yum:
        name:
          - /tmp/tungsten/tungsten-clustering-5.3.5-623.x86_64.rpm
        state: present
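
To keep this step on the slaves only, the install task can be guarded with a condition on the master captured earlier. This is a sketch under one important assumption: the node name printed by cctrl matches inventory_hostname in your inventory. If it does not, compare against whichever variable holds the matching name.

    - name: Install Continuent Tungsten rpm on slaves only
      become: yes
      yum:
        name:
          - /tmp/tungsten/tungsten-clustering-5.3.5-623.x86_64.rpm
        state: present
      # Assumption: cctrl node names match inventory_hostname
      when: inventory_hostname != (master_node_result.stdout | trim)

The master tasks can use the opposite condition (== instead of !=) so that each host runs only the steps meant for its role.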

Once the slaves are done, we operate on the master and bring the replicator online on it.

    - name: Upgrade Continuent Tungsten rpm on master
      become: yes
      yum:
        name:
          - tungsten-clustering
        state: present

    - name: Online cluster replicator on master
      become: yes
      shell: "su - tungsten -c \"trepctl online\""

After that is done, we can set the policy to automatic again, which will bring the slave replicators online.

    - name: Set Policy to automatic
      become: yes
      shell: "su - tungsten -c \"echo 'set policy automatic' | /opt/continuent/tungsten/tungsten-manager/bin/cctrl\""
      register: maintenance_results

    - debug: var=maintenance_results.stdout_lines

Wrapping up

Finally, we check the cluster status and cancel the downtime we set on the monitoring system.

    - name: Check cluster status
      become: yes
      shell: "su - tungsten -c \"echo ls | cctrl\""
      register: maintenance_results
      run_once: true

    - debug: var=maintenance_results.stdout_lines
      run_once: true

    - name: Cancel downtime alerts
      shell: "/usr/local/bin/downtime --cancel --host=' --service='{{item}}' --comment='Downtime services for Tungsten Upgrade'"
      with_items: [ 'Tungsten policy mode', 'Tungsten Replicator Status', 'Tungsten THL Retention' ]
      run_once: true
      delegate_to: nagios_host.example.com

The full playbook is available here.

Closing thoughts

Ansible is a very powerful tool, and when you are dealing with hundreds or thousands of servers, you can save a lot of time by automating repetitive tasks. I hope you have learned something useful from this post. If you have any comments or Ansible tips you'd like to share, do let me know in the comments section below.

Happy automating!
