Pythian Blog: Technical Track

Rejoining a Node to MySQL's InnoDB Cluster

What is InnoDB Cluster?

Fresh from Oracle OpenWorld 2016 comes the lab release of MySQL's InnoDB Cluster. InnoDB Cluster uses the Group Replication plugin to provide virtually synchronous replication, along with a MySQL Router that is aware of the cluster state. If your application connects through the router, it can withstand the failover of any node participating in the cluster. InnoDB Cluster also provides a new MySQL Shell for running the cluster commands.

MySQL's server team offers a very good hands-on tutorial if you're interested in getting started with InnoDB Cluster. The tutorial includes steps to set up and deploy a local sandbox cluster. It also shows how to kill a node and how the MySQL Router handles the failure. One thing that was missing for me was how to rejoin the node to the cluster.

How to rejoin a node to the cluster?

If you followed the tutorial, you should have reached the point where you killed the master node. Your cluster status would look like this:

[code type="sql"]
mysql-js> cluster.status()
{
    "clusterName": "pythian",
    "defaultReplicaSet": {
        "status": "Cluster is NOT tolerant to any failures.",
        "topology": {
            "localhost:3320": {
                "address": "localhost:3320",
                "status": "ONLINE",
                "role": "HA",
                "mode": "R/W",
                "leaves": {
                    "localhost:3330": {
                        "address": "localhost:3330",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    },
                    "localhost:3310": {
                        "address": "localhost:3310",
                        "status": "OFFLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    }
                }
            }
        }
    }
}
[/code]

Keep in mind that I can see this status only because I'm connected through the MySQL Router, so the router's failover appears to be working. Let's bring the failed node back online:

[code type="sql"]
mysql-js> dba.startLocalInstance(3310)
The MySQL sandbox instance on this host in /home/vagrant/mysql-sandboxes/3310 will be started

Starting MySQL instance...

Instance localhost:3310 successfully started.
[/code]

Great! So has the node rejoined the cluster?

[code type="sql"]
mysql-js> cluster.status()
{
    "clusterName": "pythian",
    "defaultReplicaSet": {
        "status": "Cluster is NOT tolerant to any failures.",
        "topology": {
            "localhost:3320": {
                "address": "localhost:3320",
                "status": "ONLINE",
                "role": "HA",
                "mode": "R/W",
                "leaves": {
                    "localhost:3330": {
                        "address": "localhost:3330",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    },
                    "localhost:3310": {
                        "address": "localhost:3310",
                        "status": "OFFLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    }
                }
            }
        }
    }
}
[/code]

Nope. Starting the instance is not enough: localhost:3310 is still OFFLINE, and the cluster is still not tolerant to any failures. After perusing the commands available to dba and cluster (which is an alias to dba.getCluster()), I gather that you need two steps to rejoin the node to the cluster:
  • Validate that the instance can join with dba.validateInstance().
  • Issue a rejoin command with cluster.rejoinInstance().
[code type="sql"]
mysql-js> dba.validateInstance('root@localhost:3310')
Please provide a password for 'root@localhost:3310':
Validating instance...

Running check command.
Checking Group Replication prerequisites.
* Comparing options compatibility with Group Replication... PASS
Server configuration is compliant with the requirements.
* Checking server version... PASS
Server is 5.7.15
* Checking that server_id is unique... PASS
The server_id is valid.
* Checking compliance of existing tables... PASS

The instance: localhost:3310 is valid for Cluster usage

mysql-js> cluster.rejoinInstance('localhost:3310');
Please provide the password for 'localhost:3310':
The instance will try rejoining the InnoDB cluster. Depending on the original problem that made the instance unavailable the rejoin, operation might not be successful and further manual steps will be needed to fix the underlying problem.

Please monitor the output of the rejoin operation and take necessary action if the instance cannot rejoin.
[/code]

I issued this on a completely idle cluster, but the output above indicates that manual steps may be needed before a node can rejoin, most probably including rebuilding the node after a real failure to avoid data discrepancies. Regardless, once the node has rejoined, the cluster can tolerate another failure!

[code type="sql"]
mysql-js> cluster.status()
{
    "clusterName": "pythian",
    "defaultReplicaSet": {
        "status": "Cluster tolerant to up to ONE failure.",
        "topology": {
            "localhost:3320": {
                "address": "localhost:3320",
                "status": "ONLINE",
                "role": "HA",
                "mode": "R/W",
                "leaves": {
                    "localhost:3310": {
                        "address": "localhost:3310",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    },
                    "localhost:3330": {
                        "address": "localhost:3330",
                        "status": "ONLINE",
                        "role": "HA",
                        "mode": "R/O",
                        "leaves": {}
                    }
                }
            }
        }
    }
}
[/code]
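If you would rather not read the status JSON by eye, the checks above can be scripted. What follows is a rough sketch, not something taken from the tutorial. It assumes that cluster.status() returns the same object it prints, and that the topology always nests members under "leaves" as shown in the output above. The helper names (collectInstances, offlineInstances, failuresTolerated, rejoinOffline) and the user parameter are my own, made up for illustration. The tolerance figure is simply the majority rule that Group Replication relies on, applied to the ONLINE members; I am assuming, not asserting, that the shell derives its status message the same way.

```javascript
// Sketch only: plain ES5 so it can be pasted into the shell's JavaScript mode.
// All function names here are hypothetical helpers, not part of the dba/cluster API.

// Walk the nested "topology" / "leaves" maps and collect every instance.
function collectInstances(nodes, out) {
  out = out || [];
  Object.keys(nodes || {}).forEach(function (key) {
    var node = nodes[key];
    out.push({ address: node.address, status: node.status, mode: node.mode });
    collectInstances(node.leaves, out);
  });
  return out;
}

// Addresses of instances whose status is anything other than ONLINE.
function offlineInstances(status) {
  return collectInstances(status.defaultReplicaSet.topology)
    .filter(function (i) { return i.status !== 'ONLINE'; })
    .map(function (i) { return i.address; });
}

// Majority rule: n ONLINE members can lose floor((n - 1) / 2) of them
// and still keep a majority.
function failuresTolerated(status) {
  var online = collectInstances(status.defaultReplicaSet.topology)
    .filter(function (i) { return i.status === 'ONLINE'; }).length;
  return Math.max(0, Math.floor((online - 1) / 2));
}

// Run the two steps from this post (validate, then rejoin) for every
// instance that is not ONLINE. `dba` and `cluster` are passed in explicitly;
// `user` is an assumed account name such as 'root'.
function rejoinOffline(dba, cluster, user) {
  var rejoined = [];
  offlineInstances(cluster.status()).forEach(function (address) {
    dba.validateInstance(user + '@' + address);
    cluster.rejoinInstance(address);
    rejoined.push(address);
  });
  return rejoined;
}
```

In the sandbox from this post, you would still start the failed instance first with dba.startLocalInstance(3310), then call rejoinOffline(dba, cluster, 'root'). The shell should prompt for passwords just as it did in the manual session. Passing dba and cluster in as arguments keeps the helper easy to try against stub objects before pointing it at a real cluster.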

Conclusion

MySQL's InnoDB Cluster is an interesting new Lab Technology. It encompasses three new tools or technologies:
  • Group Replication, which was announced a couple of years ago and might challenge Galera in the coming years.
  • MySQL Router, which makes your application cluster-aware without requiring driver changes in your code.
  • The new MySQL Shell, used to run the cluster commands. It has two modes: JavaScript and SQL.
MySQL Shell and MySQL Router are very reminiscent of the tools to maintain sharded MongoDB clusters. Regardless, now that we know the steps to rejoin a node to the cluster in the simplest of cases, it will be time to test out more complex failure scenarios.
