Pythian Blog: Technical Track

Using ProxySQL to validate MySQL updates

A quick look at ProxySQL

There is a lot of buzz in the MySQL community around ProxySQL, an open-source SQL-aware proxy. I was lucky enough to give a ProxySQL Tutorial at PerconaLive Amsterdam 2016 with the creator of ProxySQL, René Cannaò. Some of ProxySQL's features include:
  • Query rules based on Google's RE2-style regex
  • Failover detection
  • Connection multiplexing
ProxySQL's approach to implementing regex-style query rules opens the door to some pretty fascinating possibilities. Just to name a few examples:
  • Read/write splitting between the write master and read slaves
  • Query firewall, for those times you wish you could prevent queries from hitting the database. Perhaps you are susceptible to SQL Injection, but don't have the ability to change the apps right away.
  • Sharding strategies based on query patterns. For example, adding comments to the SQL to indicate which schema or server the query should go to.
  • Validating query changes without impacting the production server
That last item is possible due to a feature in ProxySQL called mirroring.

Mirroring in ProxySQL

Like most of the features of ProxySQL, the implementation of mirroring in ProxySQL is meant to be very simplistic. If a query comes into ProxySQL and matches a query rule that enables the mirroring feature, the ProxySQL thread is cloned and inherits all attributes of the original thread. Then, the cloned thread is modified according to the rule or any following rules. Perhaps you rewrite a query or tell it to be sent to a different hostgroup. Hostgroups, by the way, are ProxySQL's method to define logical groupings of backend database servers. This can be 'prod', 'slaves', 'reporting'; anything at all. In this example, ProxySQL sends all production traffic from the Application to the prod server running MySQL 5.6, as well as to a test server running MySQL 5.7: [caption id="attachment_94395" align="alignnone" width="867"] ProxySQL Mirroring ProxySQL Mirroring[/caption] The green arrow represents live application traffic that is sent to the production database server, which sends all responses back to the application as if nothing changed. ProxySQL issues the mirrored query, represented by the blue arrow, independently without impact to the application, collecting metrics and errors for both queries. The query rule for the above example is simple:
mysql> SELECT username, destination_hostgroup, mirror_hostgroup FROM mysql_query_rules 
 WHERE username='app_user'\G
 ****************** 1. row **********************
  username: app_user
 destination_hostgroup: 3
  mirror_hostgroup: 4
 
All queries from app_user are executed against hostgroup=3 which is running MySQL 5.6. ProxySQL will also send queries from our app_user to the mirror_hostgroup=4, which is running MySQL 5.7.

Validate application queries for MySQL Upgrade

When upgrading your MySQL databases to a new major version, there are various things that you need to check:
  • All queries succeed. For example, new MySQL versions often introduce new RESERVED words that can break queries unless they are quoted.
  • Individual query performance.
  • Queries return the right results.
  • Overall database performance under real-life application traffic.
Each item is progressively harder to test adequately. Let's see how the previous mirror rule allows us to test query success. In MySQL 5.7, a new reserved word 'GENERATED' was added. If you are unlucky enough to have email leads that track when the lead was generated, your email_leads table might look like this:
CREATE TABLE email_lead (
  `id` INT unsigned PRIMARY KEY NOT NULL AUTO_INCREMENT,
  ....
  `generated` datetime NOT NULL
 )
 
And of course you might insert into that table like this:
INSERT INTO email_lead (generated) VALUES (NOW());
 
On MySQL 5.7, that insert would lead to the following error:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your
 MySQL server version for the right syntax to use near 'generated) VALUES (NOW())' at line 1
 
But let's assume we are routing queries through ProxySQL. On ProxySQL, we issue the insert:
mysql>INSERT INTO email_lead (generated) VALUES (NOW());
 Query OK, 1 row affected (0.02 sec)
 
On Prod, we validate the insert succeeded
mysq> SELECT * FROM email_lead\G
 *************************** 1. row ***************************
  id: 1
 generated: 2016-10-19 20:23:41
 1 row in set (0.00 sec)
 
But on the test 5.7 instance, there is no entry
mysql> SELECT * FROM email_lead\G
 Empty set (0.00 sec)
 
We can check ProxySQL admin interface for the rewrite statistics
mysql> SELECT hostgroup, username, digest_text, count_star, sum_time FROM stats_mysql_query_digest 
 WHERE digest_text LIKE 'INSERT%'\G
 *************************** 1. row ***************************
  hostgroup: 3
  username: app_user
 digest_text: INSERT INTO email_lead (generated) VALUES (NOW())
  count_star: 1
  sum_time: 11412
 *************************** 2. row ***************************
  hostgroup: 4
  username: app_user
 digest_text: INSERT INTO email_lead (generated) VALUES (NOW())
  count_star: 1
  sum_time: 1130
 2 rows in set (0.00 sec)
 
The query written to hostgroup 4, the 5.7 instance, is much quicker because it failed! We can find the error written in ProxySQL's error log:
MySQL_Session.cpp:1207:handler(): [WARNING] Error during query on (4,mysqlb,3306): 1064, You have an error
 in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax 
 to use near 'generated) VALUES (NOW())' at line 1
 

Conclusion

Using ProxySQL's mirror feature allows us to send real-life application workload to a completely separate server without touching the application. You can plug in your upgrade server into your historical trending tool of choice, and have it ingest ProxySQL query metrics and error logs. This will allow you to validate upgrades using real-life traffic for query success, query performance, and overall server performance. Imagine no longer having to write custom application-specific benchmarking tools to validate your upgrades! One caveat is that currently, ProxySQL does not support validating query results. I have highlighted upgrade validation in this post, but mirroring can also be used for several other use-cases: testing query rewrites, event logging, and even slave prefetch. The obvious question needing an answer is: What is the performance impact to the production application by enabling mirroring in ProxySQL? I will leave this for a separate blog post.

No Comments Yet

Let us know what you think

Subscribe by email