
Once again about innodb-concurrency-tickets

I had to refresh my knowledge of how the InnoDB thread queue works the other day while debugging activity spikes on one of our customers' production systems. While I had a general idea about the InnoDB kernel and queue, thread concurrency, and queue join delays, I didn't have a complete model of how InnoDB concurrency control works. This is what I knew before I started the investigation:

  1. You can limit the number of threads allowed to execute inside the InnoDB kernel with innodb-thread-concurrency
  2. If all slots in the kernel are occupied, other threads have to wait in the queue
  3. A thread will sleep for the time specified by innodb-thread-sleep-delay before entering the queue (a quick way to check these settings is shown right after this list)
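For reference, here is a minimal way to check the first and third of these settings on a running server. The SET statements below are only an illustration of the fact that both variables are dynamic; the values are arbitrary examples, not recommendations.

SHOW GLOBAL VARIABLES LIKE 'innodb_thread_concurrency';
SHOW GLOBAL VARIABLES LIKE 'innodb_thread_sleep_delay';

-- Both are dynamic, so they can be adjusted without a restart;
-- the values below are arbitrary examples, not recommendations
SET GLOBAL innodb_thread_concurrency = 8;      -- 0 would mean "no limit"
SET GLOBAL innodb_thread_sleep_delay = 10000;  -- in microseconds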

I also knew that there is a variable called innodb-concurrency-tickets and that it allows a thread to re-enter the InnoDB kernel several times before it has to wait in the queue again. But how does that actually happen? Why would a thread need to leave the kernel at all? Is it related to transactions that run multiple queries? This I didn't know.

So I started with the manual:

“When there is a limit on the number of threads, InnoDB reduces context switching overhead by permitting multiple requests made during the execution of a single SQL statement to enter InnoDB without observing the limit set by innodb_thread_concurrency. Since an SQL statement (such as a join) may comprise multiple row operations within InnoDB, InnoDB assigns “tickets” that allow a thread to be scheduled repeatedly with minimal overhead.”

This statement leads to an interesting conclusion: even a single query may have to enter InnoDB multiple times before it completes. As it turned out, this happens much more often than one might imagine.

My further investigation brought me to an old post by Vadim Tkachenko and a more detailed post by Ryan Lowe.
Both posts suggest that when a query enters the InnoDB queue it doesn't necessarily complete in one shot. There is continuous communication between the MySQL server layer and the InnoDB storage engine during the execution of a query: the server layer sends a request like "give me the next row" to the storage engine, receives a row, sends the next request, and so on. Every time the thread returns a row to the server it has to leave the InnoDB kernel, and it has to give up one of its innodb_concurrency_tickets to re-enter the kernel without having to wait in the queue. No tickets left? Sorry, you have to go back to the queue.

I suspected that this jumping in and out of the InnoDB kernel might cause noticeable overhead in some cases, so I decided to run a simple test.

I used MySQL 5.5.18 on CentOS and ran several queries against a sysbench sbtest database with ~9 million rows. Here is a single query result:

mysql [localhost] {msandbox} (test)  SET GLOBAL innodb_thread_concurrency=1;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (test)  SHOW GLOBAL VARIABLES LIKE '%ticket%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| innodb_concurrency_tickets | 500   |
+----------------------------+-------+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (test)  SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  9028285 |
+----------+
1 row in set (16.23 sec)

Note that I limited concurrency to 1, allowing only one thread to run inside the InnoDB kernel at any given point in time. This makes a manual test easier. innodb_concurrency_tickets is set to its default value of 500. This means that my query has to leave the InnoDB kernel and go back to the queue roughly 9028285/500 ≈ 18056 times. In this case nothing else is running, so the query re-enters the kernel with no delay.
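As a quick sanity check, the same arithmetic can be run straight from the mysql client; the numbers are simply the row count and the ticket value used in this test.

SELECT FLOOR(9028285 / 500) AS trips_back_to_queue,  -- times the thread must re-queue
       CEIL(9028285 / 500)  AS kernel_entries;       -- total times it enters the kernel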

Now let’s try to run 2 similar queries in parallel:
SESSION 1

mysql [localhost] {msandbox} (test) SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  9028285 |
+----------+
1 row in set (1 min 35.50 sec)

A couple of seconds later I start SESSION 2:

mysql [localhost] {msandbox} (test) SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  9028285 |
+----------+
1 row in set (1 min 32.86 sec)

Note how execution time changed from 16 seconds to 1.5 minutes!
Why is this happening? Let’s take a look at a sample of SHOW ENGINE INNODB STATUS:
SAMPLE 1:

---TRANSACTION 90A, ACTIVE 2 sec fetching rows, thread declared inside InnoDB 383
mysql tables in use 1, locked 0
MySQL thread id 4, OS thread handle 0x499f5940, query id 52 localhost msandbox Sending data
SELECT COUNT(*) FROM sbtest
Trx read view will not see trx with id = 90B, sees  909
---TRANSACTION 909, ACTIVE 4 sec waiting in InnoDB queue
mysql tables in use 1, locked 0
MySQL thread id 3, OS thread handle 0x49a36940, query id 51 localhost msandbox Sending data
SELECT COUNT(*) FROM sbtest
Trx read view will not see trx with id = 90A, sees  90A

SAMPLE 2:

---TRANSACTION 90A, ACTIVE 5 sec sleeping before joining InnoDB queue
mysql tables in use 1, locked 0
MySQL thread id 4, OS thread handle 0x499f5940, query id 52 localhost msandbox Sending data
SELECT COUNT(*) FROM sbtest
Trx read view will not see trx with id = 90B, sees  909
---TRANSACTION 909, ACTIVE 7 sec fetching rows, thread declared inside InnoDB 61
mysql tables in use 1, locked 0
MySQL thread id 3, OS thread handle 0x49a36940, query id 51 localhost msandbox Sending data
SELECT COUNT(*) FROM sbtest
Trx read view will not see trx with id = 90A, sees  90A

The most interesting thing in these samples is that you can actually see the two transactions jump in and out of the InnoDB kernel. In SAMPLE 1 transaction 90A is fetching rows, while in the next sample it is already outside the InnoDB kernel and waiting to join the queue. The same happens with transaction 909. In this case one query enters the kernel, scans 500 rows, exits the kernel (no "free" tickets left!) and is placed into the queue, and my understanding is that it is placed at the end of the queue. Then the second query enters and the whole thing repeats all over again.
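If you prefer not to eyeball SHOW ENGINE INNODB STATUS, a similar picture can be pulled from information_schema.INNODB_TRX, which (in MySQL 5.6 and later, if I recall correctly) exposes the number of tickets a transaction has left; the query below is just a sketch of that approach.

SELECT trx_mysql_thread_id,
       trx_state,
       trx_operation_state,      -- e.g. "fetching rows"
       trx_concurrency_tickets   -- tickets remaining before the thread must re-queue
FROM   information_schema.INNODB_TRX;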

Let’s see what happens if we give each query enough “free” tickets to complete without needing to go back to the queue:
SESSION 1

SET GLOBAL innodb_concurrency_tickets=10000000;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (test)  SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  9028285 |
+----------+
1 row in set (16.14 sec)

SESSION 2

mysql [localhost] {msandbox} (test)  SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  9028285 |
+----------+
1 row in set (30.57 sec)

In this case everything is straightforward: the query in session 1 enters the kernel, executes completely in 16 seconds, and goes away. The query in session 2 waits in the queue for about 16 seconds, then enters the kernel, executes, and completes. And this sequential approach is much faster.

The same applies to range UPDATEs as well, but the UPDATE case is even worse, because it has to take exclusive locks, and if the thread is kicked out of the InnoDB kernel those locks are obviously not released.

But don’t rush to increase innodb_concurrency_tickets. The default value of 500 makes sense if you think about it this way: InnoDB allows fast queries (like unique key lookups) to enter the InnoDB kernel and complete even when several long-running queries are being executed at the same time. And this is a very typical workload pattern.

Tuning this variable may be reasonable when you have a very limited number of similar queries that scan or update significant portions of data (thousands of rows) and many of them run concurrently. Obviously, if you set innodb_thread_concurrency=0 the number of tickets doesn’t play any role, because all threads are allowed to enter the InnoDB kernel.
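For example, before a maintenance window consisting mostly of wide range scans, one could temporarily raise the ticket count (or lift the kernel limit altogether) and restore the usual values afterwards. The numbers below are illustrative assumptions, not recommendations.

-- Before the batch window (illustrative values)
SET GLOBAL innodb_concurrency_tickets = 100000;
-- or simply remove the kernel limit for the duration of the window
SET GLOBAL innodb_thread_concurrency = 0;

-- After the window, put back whatever your normal settings are,
-- e.g. the 500-ticket default discussed above
SET GLOBAL innodb_concurrency_tickets = 500;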
