Pythian Blog: Technical Track

Rounding Errors in MYSQL with DECIMAL Type

It seems to be assumed that fixed-point values, DECIMAL and NUMERIC, in MySQL are not susceptible to rounding errors because they are exact numeric data types. It must be kept it mind, that there are limits in precision, and the maximum number of digits for DECIMAL is 65. Additionally, a DECIMAL column can be assigned a precision or scale that could have the potential affect of truncation to the allowed number of digits. What I am discussing here is rounding error.

Take the salary data from the example employees database. Run two queries against the same data, one taking a SUM of the data in a subquery, another using GROUP BY WITH ROLLUP to calculate the same SUM.

mysql> select sum(hourly) from (
    -> select emp_no,year(from_date),
    -> sum(salary/(52*40)) as hourly
    -> from salaries-> group by emp_no,year(from_date)
    -> ) t;
+---------------+
| sum(hourly)   |
+---------------+
| 87250372.2511 |
+---------------+
1 row in set (47.34 sec)

mysql> select emp_no,year(from_date),sum(salary/(52*40))
    -> from salaries
    -> group by emp_no,year(from_date) with rollup;
+--------+-----------------+---------------------+
| emp_no | year(from_date) | sum(salary/(52*40)) |
+--------+-----------------+---------------------+
...
| 240754 | 1994            | 22.6274             |
| 240754 | 1995            | 23.9077             |
| 240754 | 1996            | 24.3221             |
| 240754 | 1997            | 26.2399             |
...
| 499998 | 1999            | 24.0245             |
| 499998 | 2000            | 24.6067             |
| 499998 | 2001            | 26.4438             |
| 499998 | NULL            | 201.9183            |
| 499999 | 1997            | 30.6284             |
| 499999 | 1998            | 32.2322             |
| 499999 | 1999            | 34.0120             |
| 499999 | 2000            | 35.7341             |
| 499999 | 2001            | 37.1649             |
| 499999 | NULL            | 169.7716            |
| NULL   | NULL            | 87250364.1424       |
+--------+-----------------+---------------------+
3142278 rows in set (8.76 sec)

According to the MySQL documentation, the calculations in the above queries are done using fixed-point exact value arithmetic. How is it that the SUM using a subquery and the GROUP BY WITH ROLLUP differ so much?
Can't Explain That

It’s certainly worth discussing MySQL floating-point approximate value types vs. the fixed-point exact value types and how conversions and calculations on each are handled. The floating-point (approximate value) types are FLOAT, REAL, and DOUBLE, and the fixed-point (exact value) types are INTEGER, SMALLINT, DECIMAL, and NUMERIC. Floating-point means the decimal point can be placed anywhere relative to the significant digits of the number with the actual position being indicated separately. A fixed-point value is an integer scaled by a specific factor.

MySQL 5.5 added support for precision math including the MySQL library for fixed-point arithmetic that allows operations to be handled in the same manner across different platforms and not dependent on the underlying C library. With this update, if no approximate values or strings are being used in a calculation, expressions are evaluated using DECIMAL exact value arithmetic with precision to 65 digits. For GROUP BY functions, STDDEV() and VARIANCE() return DOUBLE, an approximate floating-point type, and SUM() and AVG() return a DECIMAL for exact-value arguments and a DOUBLE for approximate value.

Type conversion in MySQL is handled using floating-point values. The results of type conversion may vary and can be affected by factors such as computer architecture or the compiler version or optimization level. One way to avoid these problems is to use an explicit CAST() rather than the implicit conversion.

So what does this mean and where will we encounter it in real world calculations?

There are certain naturally occurring ratios, such as the length of any circle’s diameter compared to its circumference, i.e. ?, that compute to an irrational number. An irrational number, as opposed to a rational number, is any real number that cannot be expressed as a fraction a/b, where a is an integer and b is a non-zero integer. Irrational numbers cannot be represented as terminating or repeating decimals. True irrational numbers will not be encountered in computing, only approximations thereof. An infinite continued fraction, such as 1/3, is another example in which the decimal value is approximated in computing. In 1949, John von Neumann et al. used ENIAC to compute 2037 digits of pi; a calculation that took 70 hours. Over the course of the 20th century, additional thousands of decimal places were obtained with the million-digit milestone in 1973. On August 6, 2010, Alexander J. Yee and Shigeru Kondo announced that they calculated pi to 5 trillion decimal places on a personal computer, double the previous record.

So back to the initial example, which value is more accurate and which is the correct way to obtain the SUM? The truth is, neither is exactly accurate. By using a little algebra, the query can be simplified. The idea being to work with whole numbers in as many initial steps as possible.


mysql> select sum(salary)/(52*40) from salaries;
+---------------------+
| sum(salary)/(52*40) |
+---------------------+
|       87250364.1438 |
+---------------------+
1 row in set (2.27 sec)

“The moon makes the tides change, so the oceans are werewolves. That makes sense.”

No Comments Yet

Let us know what you think

Subscribe by email