Pythian Blog: Technical Track

Graceful master switchover with ProxySQL and Orchestrator

Introduction

One of the things I like about Continuent Tungsten is how Tungsten Connector can hold traffic while a (graceful) master switch is taking place. This means the application may experience a brief spike in latency, but has not returned any errors from the database layer. René also described a similar process using ProxySQL and mysqlrpladmin here. In this post we will try to achieve the same feat, a graceful master switchover with ProxySQL and Orchestrator.

ProxySQL considerations

ProxySQL needs to isolate our application from the changes going on at the database layer. By design, if a query needs to be sent to a hostgroup that has no servers in ONLINE state, ProxySQL waits until either a server becomes available or a timeout expires (mysql-connect_timeout_server_max if I am not mistaken). Another important thing to keep in mind: when a server is put into OFFLINE_SOFT mode, new connections aren't accepted anymore. Existing connections with active queries or transactions are allowed to continue work until the current operation is completed. This sounds good in theory, but what happens if there is a long running transaction? We don't want that to delay our master switch for too long, as timeouts will start to be a problem. Due to this, it is usually a good idea to define some kind of grace period (couple of seconds at most), and then kill all outstanding transactions.

Orchestrator and graceful master switch

Orchestrator will drive topology changes, and we need to use the hooks it exposes to interact with ProxySQL. For a graceful master switch, as described above we need to set the master in OFFLINE_SOFT state in ProxySQL as the first step. I found that the available hook (PostFailoverProcesses) is executed too late for this, as it is triggered after the demoted master has already been placed in read_only mode by Orchestrator. This is evidenced by the following extract from the logs:
Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO topology_recovery: Running 1 PostFailoverProcesses hooks
 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO topology_recovery: Running PostFailoverProcesses hook 1 of 1: /root/postfailover.sh
 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO CommandRun(/root/postfailover.sh,[])
 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO CommandRun/running: bash /tmp/orchestrator-process-cmd-542186076
 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 DEBUG outdated keys: [mysql3:3306]
 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 DEBUG Discovered host: mysql3:3306, master: mysql2:3306, version: 5.6.39-log in 0.009s (Backend: 0.006s, Instance: 0.004s)
 ...
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: done running PostFailoverProcesses hooks
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: Waiting for 0 postponed functions
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG PostponedFunctionsContainer: waiting on 0 postponed functions
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG PostponedFunctionsContainer: done waiting
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: Executed 0 postponed functions
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG ChangeMasterTo: will attempt changing master on mysql1:3306 to mysql2:3306, mysql-bin.000013:976
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO ChangeMasterTo: Changed master on mysql1:3306 to: mysql2:3306, mysql-bin.000013:976. GTID: false
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG ChangeMasterTo: will attempt changing master credentials on mysql1:3306
 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO ChangeMasterTo: Changed master credentials on mysql1:3306
 Apr 04 11:30:55 mysql1 orchestrator[4453]: [martini] Completed 200 OK in 5.515153338s
 
I've opened an issue to suggest a new hook is added, and the fix is already in one of the dev branches of Orchestrator. It also makes sense to add the old master to the readers hostgroup after the promotion. The problem is there is currently no way to have Orchestrator start the slave threads on the old master. I opened another issue to explore options to deal with this. Finally, we need to take care of setting the old master back ONLINE in ProxySQL hosts table (remember we have to set it to OFFLINE_SOFT for the purpose of master switch).

Putting it all together

ProxySQL can detect and handle master changes based on read_only value using the mysql_replication_hostgroups table, but that approach has some known shortcomings. I won't be using that for this example (in fact that table is empty on my test environment), so Orchestrator will handle the needed changes in ProxySQL configuration. We will need a ProxySQL admin user for the hooks that can connect externally (called ivan on my case). Here is an example of ProxySQL initial status:
(admin@127.0.0.1) [(none)]>select * from runtime_mysql_servers;
 +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
 | hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
 +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
 | 0 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
 | 1 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
 | 1 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
 | 1 | mysql3 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
 +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
 
I have the master in both writer and reader hostgroups, so if you wanted to use read/write splitting the master is still able to service read traffic. We need to develop two scripts that will be run as Orchestrator hooks for pre and post failover tasks, and configure Orchestrator to invoke them at the appropriate time.

Orchestrator configuration

We need the following lines in Orchestrator config file e.g. /etc/orchestrator.conf.json
 "ApplyMySQLPromotionAfterMasterFailover": true,
  "PreGracefulTakeoverProcesses": [
  "/tmp/prefailover.sh"
  ],
  "PostMasterFailoverProcesses": [
  "/tmp/postfailover.sh"
 
Make sure to have Orchestrator refresh the configuration by either running `service orchestrator reload` or call /api/reload-configuration after you make the changes.

Prefailover hook

For the prefailover script, we stop accepting new connections to the old master (OFFLINE_SOFT state), and loop while connections are still active and we are in the grace period (1 second on the example). After that, we let Orchestrator proceed with failover steps, which means read_only will be set on the current master and the topology will be refactored.
#!/bin/bash
 
 # Variable exposed by Orchestrator
 OldMaster=$ORC_FAILED_HOST
 PROXYSQL_HOST="mysql3"
 
 # stop accepting connections to old master
 (
 echo 'UPDATE mysql_servers SET STATUS="OFFLINE_SOFT" WHERE hostname="'"$OldMaster"'";'
 echo "LOAD MYSQL SERVERS TO RUNTIME;"
 ) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
 
 # wait while connections are still active and we are in the grace period
 CONNUSED=`mysql -uivan -p**** -h ${PROXYSQL_HOST} -P6032 -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND srv_host="'"$OldMaster"'"' -B -N 2> /dev/null`
 TRIES=0
 while [ $CONNUSED -ne 0 -a $TRIES -ne 20 ]
 do
  CONNUSED=`mysql -uivan -p**** -h ${PROXYSQL_HOST} -P6032 -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND srv_host="'"$OldMaster"'"' -B -N 2> /dev/null`
  TRIES=$(($TRIES+1))
  if [ $CONNUSED -ne "0" ]; then
  sleep 0.05
  fi
 done
 

Postfailover hook

For the postfailover script, we need to tell ProxySQL who the new master is, and in the case of a graceful takeover, put the old master ONLINE in the readers hostgroup and start the slave threads on it. As mentioned in the previous section, in the current Orchestrator version the post failover hook is fired before the actual changes to the topology are performed, so we cannot issue start slave to the old master. Here is a sample script postfailover.sh:
#!/bin/bash
 
 OldMaster=$ORC_FAILED_HOST
 NewMaster=$ORC_SUCCESSOR_HOST
 PROXYSQL_HOST="mysql3"
 
 # remove old master from writers hostgroup
 (
 echo 'DELETE FROM mysql_servers WHERE hostgroup_id=0 AND hostname="'"$OldMaster"'";'
 echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
 ) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
 
 # promote the new master by adding to the writers hostgroup
 (
 echo 'INSERT INTO mysql_servers(hostgroup_id,hostname,port,status) values (0, "'"$NewMaster"'", 3306, "ONLINE");'
 echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
 ) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
 
 # if graceful then set old master ONLINE in read hostgroup and start replication
 if [ "$ORC_COMMAND" == "graceful-master-takeover" ]
 then
 (
 echo 'UPDATE mysql_servers SET status="ONLINE" WHERE hostgroup_id=1 AND hostname="'"$OldMaster"'";'
 echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
 ) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
 
 # start replication threads on the old master
 # not working for now as hook is fired too soon https://github.com/github/orchestrator/issues/453
 #mysql -vvv -h$OldMaster -uorchestrator -p**** -e'start slave;'
 fi
 

Testing

Let's run a quick test using sysbench in read/write mode and see what happens when I do a master switch in a simple topology (1 master, 2 slaves):
[root@mysql1 ~]# sysbench --report-interval=1 --db-driver=mysql --num-threads=4 --max-requests=0 --max-time=2000 --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-user=app --mysql-password=**** --mysql-host=mysql3 --mysql-port=6033 --mysql-db=test --oltp-read-only=off run
 WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
 WARNING: --num-threads is deprecated, use --threads instead
 WARNING: --max-time is deprecated, use --time instead
 sysbench 1.0.9 (using system LuaJIT 2.0.4)
 
 Running the test with following options:
 Number of threads: 4
 Report intermediate results every 1 second(s)
 Initializing random number generator from current time
 
 
 Initializing worker threads...
 
 Threads started!
 
 [ 1s ] thds: 4 tps: 116.62 qps: 2367.35 (r/w/o: 1661.63/467.49/238.23) lat (ms,95%): 57.87 err/s: 0.00 reconn/s: 0.00
 [ 2s ] thds: 4 tps: 78.99 qps: 1580.79 (r/w/o: 1105.85/313.96/160.98) lat (ms,95%): 80.03 err/s: 0.00 reconn/s: 0.00
 [ 3s ] thds: 4 tps: 78.01 qps: 1569.30 (r/w/o: 1097.21/312.06/160.03) lat (ms,95%): 81.48 err/s: 0.00 reconn/s: 0.00
 [ 4s ] thds: 4 tps: 118.03 qps: 2360.57 (r/w/o: 1656.40/465.11/239.06) lat (ms,95%): 42.61 err/s: 0.00 reconn/s: 0.00
 [ 5s ] thds: 4 tps: 111.96 qps: 2242.21 (r/w/o: 1573.45/439.85/228.92) lat (ms,95%): 43.39 err/s: 0.00 reconn/s: 0.00
 [ 6s ] thds: 4 tps: 83.92 qps: 1635.45 (r/w/o: 1133.92/334.68/166.84) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00
 [ 7s ] thds: 4 tps: 92.03 qps: 1878.63 (r/w/o: 1325.44/367.12/186.06) lat (ms,95%): 66.84 err/s: 0.00 reconn/s: 0.00
 [ 8s ] thds: 4 tps: 81.08 qps: 1604.53 (r/w/o: 1117.07/322.31/165.16) lat (ms,95%): 70.55 err/s: 0.00 reconn/s: 0.00
 [ 9s ] thds: 4 tps: 90.99 qps: 1822.79 (r/w/o: 1276.85/363.96/181.98) lat (ms,95%): 69.29 err/s: 0.00 reconn/s: 0.00
 [ 10s ] thds: 4 tps: 100.95 qps: 2018.09 (r/w/o: 1408.37/405.82/203.91) lat (ms,95%): 47.47 err/s: 0.00 reconn/s: 0.00
 [ 11s ] thds: 4 tps: 102.47 qps: 2059.27 (r/w/o: 1448.45/402.90/207.92) lat (ms,95%): 66.84 err/s: 0.00 reconn/s: 0.00
 [ 12s ] thds: 4 tps: 79.44 qps: 1579.83 (r/w/o: 1103.16/316.77/159.89) lat (ms,95%): 89.16 err/s: 0.00 reconn/s: 0.00
 [ 13s ] thds: 4 tps: 69.80 qps: 1419.90 (r/w/o: 993.13/285.18/141.59) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00
 [ 14s ] thds: 4 tps: 53.05 qps: 1067.93 (r/w/o: 756.66/204.18/107.09) lat (ms,95%): 127.81 err/s: 0.00 reconn/s: 0.00
 [ 15s ] thds: 4 tps: 77.99 qps: 1556.87 (r/w/o: 1081.91/317.97/156.99) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00
 [ 16s ] thds: 4 tps: 87.18 qps: 1704.50 (r/w/o: 1189.44/339.70/175.36) lat (ms,95%): 75.82 err/s: 0.00 reconn/s: 0.00
 [ 17s ] thds: 4 tps: 75.99 qps: 1559.79 (r/w/o: 1099.85/303.96/155.98) lat (ms,95%): 71.83 err/s: 0.00 reconn/s: 0.00
 [ 18s ] thds: 4 tps: 62.97 qps: 1255.37 (r/w/o: 878.56/247.88/128.93) lat (ms,95%): 63.32 err/s: 0.00 reconn/s: 0.00
 [ 19s ] thds: 4 tps: 77.03 qps: 1491.55 (r/w/o: 1034.38/305.11/152.06) lat (ms,95%): 73.13 err/s: 0.00 reconn/s: 0.00
 [ 20s ] thds: 4 tps: 82.90 qps: 1703.04 (r/w/o: 1195.62/335.61/171.80) lat (ms,95%): 68.05 err/s: 0.00 reconn/s: 0.00
 
Can you spot the switchover? I cannot! It was done around second 17 via drag and dropping one of the slaves to the master position in Orchestrator GUI.

Conclusion

We've seen how ProxySQL and Orchestrator can be integrated by leveraging the hooks functionality of Orchestrator to make changes in ProxySQL configuration. This is a better approach than having ProxySQL decide who the master is based on the value of read_only variable alone (by using mysql_replication_hostgroups table). The hooks will handle topology changes regardless of whether it's a response to a failure OR a graceful takeover. We also performed a graceful master switchover with ProxySQL and Orchestrator via the GUI that completed in under 1 second, without returning any errors to the application. The only missing piece to fully automate a graceful master switchover is starting the slave threads automatically after the old master is set as a slave, but I expect this will be solved very soon either via a new hook or a new Orchestrator parameter.

No Comments Yet

Let us know what you think

Subscribe by email