Pythian Blog: Technical Track

Benchmark: TokuDB vs. MariaDB / MySQL InnoDB compression

As the amount of data that companies want to collect grows, life becomes all the more difficult for IT staff at every level of an organization. SAS enterprise storage devices that were once considered giants are now being phased out in favor of SSD arrays with features such as de-duplication, tape storage has largely been abandoned, and database engines are going through a similar shift.

For many customers, simply storing data is not enough because of the CAPEX and OPEX involved. Smarter ways of storing the same data are required, particularly since databases generally account for the greatest portion of storage requirements across an application stack and are now often used to store logs as well as data. IT managers, developers and system administrators very often turn to the DBA and pose the age-old question: “Is there a way we can cut down on the space the database is taking up?” The question seems to be asked all the more frequently as time goes by.

This is a dilemma that a MySQL DBA cannot easily solve. What would be the best way to resolve this issue? Should I cut down on binary logging? Hmm… I need the binary logs in case I need to track down the transactions that have been executed and perform point-in-time recovery. Perhaps I should look at archiving data to disk and compressing it with tar and gzip? Heck, if I do that I’ll have to manage and track multiple files and perform countless imports to regenerate the dataset whenever a report is needed from historical data. Maybe, just maybe, I should look into compressing the data files? This seems like a good idea… that way I can keep all my data and spend just a few extra CPU cycles to keep it at a reasonable size – or can I?

Inspired by this age-old dilemma, I decided to take the latest version of TokuDB for a test run and compare it to InnoDB compression, which has been around for a while. Both technologies promise a great reduction in disk usage and even performance benefits: naturally, if data resides on a smaller portion of the disk, access time and seek time will decrease, although this does not apply to the SSDs that are generally used in the industry today. I put together a test system using an HP ProLiant Gen8 server with 4x Intel® Xeon® E3 processors, 4GB ECC RAM and a Samsung EVO SATA III SSD rated at 6Gb/s, and installed the latest version of Ubuntu 14.04 to run some benchmarks. I used the standard innodb-heavy configuration from the support-files directory with one change: innodb_file_per_table = ON. The reason for this is that TokuDB will not compress the shared tablespace, which would affect the results of the benchmarks. To be objective, I ran the benchmarks on both MySQL and MariaDB using 5.5.38, which is the latest version bundled with TokuDB.
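The post does not show how the tpcc-mysql tables were switched between storage configurations, so the following is only a minimal sketch of one way it could be done. It builds ALTER TABLE statements without connecting to a server. The helper name and the KEY_BLOCK_SIZE value are assumptions, not the settings used in this benchmark.

```python
# Hypothetical helper: builds the ALTER TABLE statements for each storage
# configuration compared in this post. It only produces SQL text.

# Tables created by the tpcc-mysql schema scripts.
TPCC_TABLES = [
    "warehouse", "district", "customer", "history", "new_orders",
    "orders", "order_line", "item", "stock",
]

# Table options per configuration. KEY_BLOCK_SIZE=8 is an assumed value;
# the post does not say which block size was used. In MySQL 5.5, compressed
# InnoDB tables also need innodb_file_per_table=ON and
# innodb_file_format=Barracuda on the server.
TABLE_OPTIONS = {
    "innodb_regular": "ENGINE=InnoDB",
    "innodb_compressed": "ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8",
    "tokudb": "ENGINE=TokuDB",
}


def conversion_statements(configuration):
    """Return one ALTER TABLE statement per tpcc-mysql table."""
    options = TABLE_OPTIONS[configuration]
    return [f"ALTER TABLE {table} {options};" for table in TPCC_TABLES]


for statement in conversion_statements("innodb_compressed"):
    print(statement)
```

Passing "tokudb" or "innodb_regular" instead yields the statements for the other two configurations, which could then be fed to the mysql client before loading or re-running the benchmark.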

The databases were benchmarked for speed and also for the space consumed by the tpcc-mysql dataset generated with 20 warehouses. So let's first have a look at how much space was needed by TokuDB vs. InnoDB (using both compressed and uncompressed tables):

Configuration            | Size (GB)
-------------------------|----------
TokuDB                   | 2.7
InnoDB Compressed Tables | 4.2
InnoDB Regular Tables    | 4.8
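To put these sizes in relative terms, the short sketch below computes how much space each configuration saves against regular (uncompressed) InnoDB tables. The sizes are the ones reported above; the function and variable names are illustrative only.

```python
# Dataset sizes in GB, as reported in the table above.
SIZES_GB = {
    "TokuDB": 2.7,
    "InnoDB Compressed Tables": 4.2,
    "InnoDB Regular Tables": 4.8,
}


def savings_percent(size_gb, baseline_gb):
    """Return the space saved relative to the baseline, as a percentage."""
    return (baseline_gb - size_gb) / baseline_gb * 100


baseline = SIZES_GB["InnoDB Regular Tables"]
for name, size in SIZES_GB.items():
    saved = savings_percent(size, baseline)
    print(f"{name}: {saved:.1f}% smaller than regular InnoDB tables")
```

On these numbers TokuDB saves roughly 44% of the space, against about 12.5% for InnoDB compression.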


TokuDB was the clear winner here. Of course, the space savings depend on the type of data stored in the database, but with the same dataset TokuDB is well in the lead. Seeing such a reduction in storage requirements will make you wonder how much overhead is incurred in reading and writing this data, so let's have a look at the “tpm-C” to understand how many orders can be processed per minute on each. Here I have also included results for MariaDB vs. MySQL. The first graph shows the number of orders processed per 10-second interval, and the second graph shows the total “tpm-C” after the tests were run for 120 seconds:

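Figure 1 – Orders processed per 10-second interval

Interval (s) | MariaDB 5.5.38 | MariaDB 5.5.38 InnoDB Compressed | TokuDB on MariaDB 5.5.38 | MySQL 5.5.38 | MySQL 5.5.38 InnoDB Compressed | TokuDB on MySQL 5.5.38
-------------|----------------|----------------------------------|--------------------------|--------------|--------------------------------|-----------------------
10           | 5300           | 529                              | 5140                     | 5667         | 83                             | 5477
20           | 5743           | 590                              | 5112                     | 5513         | 767                            | 5935
30           | 5322           | 596                              | 4784                     | 5267         | 792                            | 5931
40           | 4536           | 616                              | 4215                     | 5627         | 774                            | 6107
50           | 5206           | 724                              | 5472                     | 5770         | 489                            | 6020
60           | 5827           | 584                              | 5527                     | 5956         | 402                            | 6211
70           | 5588           | 464                              | 5450                     | 6061         | 761                            | 5999
80           | 5679           | 424                              | 5474                     | 5775         | 789                            | 6029
90           | 5759           | 649                              | 5490                     | 6258         | 788                            | 5998
100          | 5288           | 611                              | 5584                     | 6044         | 765                            | 6026
110          | 4637           | 575                              | 4948                     | 5753         | 720                            | 5314
120          | 3696           | 512                              | 4459                     | 930          | 472                            | 292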

Figure 2 – “tpm-C” for the 120-second test run

MySQL Edition                            | “tpm-C”
-----------------------------------------|--------
TokuDB on MySQL 5.5.38                   | 32669.5
MySQL 5.5.38                             | 32310.5
MariaDB 5.5.38                           | 31290.5
TokuDB on MariaDB 5.5.38                 | 30827.5
MySQL 5.5.38 InnoDB Compressed Tables    | 4151
MariaDB 5.5.38 InnoDB Compressed Tables  | 3437
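The “tpm-C” totals appear to be the orders summed over the twelve 10-second intervals and divided by the two minutes of run time. The sketch below reproduces that arithmetic for two of the series from the interval table; the function name and data layout are illustrative assumptions, not part of tpcc-mysql.

```python
# Orders processed per 10-second interval, copied from the interval table.
ORDERS_PER_INTERVAL = {
    "TokuDB on MySQL 5.5.38": [
        5477, 5935, 5931, 6107, 6020, 6211, 5999, 6029, 5998, 6026, 5314, 292,
    ],
    "MariaDB 5.5.38 InnoDB Compressed": [
        529, 590, 596, 616, 724, 584, 464, 424, 649, 611, 575, 512,
    ],
}


def tpm_c(orders, interval_seconds=10):
    """Return orders per minute averaged over the whole run."""
    run_minutes = len(orders) * interval_seconds / 60
    return sum(orders) / run_minutes


for name, orders in ORDERS_PER_INTERVAL.items():
    print(f"{name}: {tpm_c(orders)} tpm-C")
```

These two series give 32669.5 and 3437, matching the reported totals. Applying the same formula to the MySQL 5.5.38 InnoDB Compressed column as listed gives 3801 rather than 4151, so one of those figures may contain a transcription error.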


Surprisingly, the InnoDB table compression results were very low: going by the reported totals, compressed tables reached only about 11–13% of the “tpm-C” of their uncompressed counterparts. Perhaps compression would have shown better results on traditional rotating SAS / SATA disks. The impact on performance was very high and the savings on disk space were marginal compared to those of TokuDB, so once again it seems we have a clear winner! TokuDB on MySQL outperformed both MySQL and MariaDB with uncompressed tables. The findings are interesting because in previous benchmarks of older versions of MariaDB and MySQL, MariaDB would generally outperform MySQL; however, there are many factors that should be considered.

These tests were performed on Ubuntu 14.04, while the previous tests I mentioned were performed on CentOS 6.5, and the hardware was also slightly different (Corsair SSD 128GB vs. Samsung EVO 256GB). Please keep in mind that these benchmarks reflect the performance of one specific configuration, and there are many factors that should be considered when choosing the MySQL / MariaDB edition to use in production.

As per this benchmark, the results for TokuDB were nothing less than impressive and it will be very interesting to see the results on the newer versions of MySQL (5.6) and MariaDB (10)!
