Pythian Blog: Technical Track

InnoDB error: Total number of locks exceeds the lock table size

Recently, one of our customers had a problem with one of their replication slaves where a sql statement gave an error while executing on the slave which was executed successfully on the Master server. The slave stopped with the error as below.
 mysql> show slave status\G
 *************************** 1. row ***************************
 Slave_IO_State: Waiting for master to send event
 Master_Host: reporting.com
 Master_User: replication
 Master_Port: 3306
 Connect_Retry: 10
 Master_Log_File: mysql3308-bin-log.000488
 Read_Master_Log_Pos: 340658086
 Relay_Log_File: mysql3319-relay-log.000156
 Relay_Log_Pos: 4983506
 Relay_Master_Log_File: mysql3308-bin-log.000462
 Slave_IO_Running: Yes
 Slave_SQL_Running: No
 Replicate_Do_DB:
 Replicate_Ignore_DB:
 Replicate_Do_Table:
 Replicate_Ignore_Table:
 Replicate_Wild_Do_Table:
 Replicate_Wild_Ignore_Table: 
 Last_Errno: 1206
 Last_Error: Error 'The total number of locks exceeds the lock table size' on query. Default database: 'production'. Query: 'create table Summary SELECT Dimension.calendarYearMonth, count(*) as EmailsSent, count(distinct Email_Sent.UserID) as "Unique Recipients"
 FROM Dimension, Email_Sent
 WHERE
 Email_Sent.DateID = Dimension.id
 AND Dimension.date >= '2010-11-01'
 GROUP BY 1'
 Skip_Counter: 0
 Exec_Master_Log_Pos: 4983353
 Relay_Log_Space: 14212286599
 Until_Condition: None
 Until_Log_File:
 Until_Log_Pos: 0
 Master_SSL_Allowed: No
 Master_SSL_CA_File:
 Master_SSL_CA_Path:
 Master_SSL_Cert:
 Master_SSL_Cipher:
 Master_SSL_Key:
 Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
 Last_IO_Errno: 0
 Last_IO_Error:
 Last_SQL_Errno: 1206
 Last_SQL_Error: Error 'The total number of locks exceeds the lock table size' on query. Default database: 'production'. Query: 'create table Summary SELECT Dimension.calendarYearMonth, count(*) as EmailsSent, count(distinct Email_Sent.UserID) as "Unique Recipients"
 FROM Dimension, Email_Sent
 WHERE
 Email_Sent.DateID = Dimension.id
 AND Dimension.date >= '2010-11-01'
 GROUP BY 1'
 1 row in set (0.00 sec)
I started the slave and monitored the query running status using "show engine innodb status\G". According to the EXPLAIN plan of the query, it is estimated to scan and lock about 910 million rows. The InnoDB status below showed that it already locked 677 million rows and the slave stopped again with the same error. This is an error specific to InnoDB tables. The query mentioned above attempts to set locks on all records scanned, to ensure the data returned doesn’t change before creation of the new table. These locks are put into an area of memory allocated with in innodb_buffer_pool_size variable.  
 ---TRANSACTION 5722EE5D, ACTIVE 18100 sec, process no 13052, OS thread id 1222195536 fetching rows
 mysql tables in use 2, locked 2
 2653776 lock struct(s), heap size 327759856, 677089582 row lock(s)
 MySQL thread id 2860, query id 1863366 Copying to tmp table on disk
 create table Summary SELECT Dimension.calendarYearMonth, count(*) as EmailsSent, count(distinct Email_Sent.UserID) as "Unique Recipients"
 FROM Dimension, Email_Sent
 WHERE
 Email_Sent.DateID = Dimension.id
 AND Dimension.date >= '2010-11-01'
 GROUP BY 1
 TABLE LOCK table `production`.`Email_Sent` trx id 5722EE5D lock mode IS
 RECORD LOCKS space id 5623 page no 11 n bits 440 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 TABLE LOCK table `production`.`Date_Dimension` trx id 5722EE5D lock mode IS
 RECORD LOCKS space id 5611 page no 34 n bits 240 index `PRIMARY` of table `production`.`Date_Dimension` trx id 5722EE5D lock mode S locks rec but not gap
 RECORD LOCKS space id 5623 page no 12 n bits 440 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 RECORD LOCKS space id 5623 page no 15 n bits 272 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 RECORD LOCKS space id 5623 page no 18 n bits 440 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 RECORD LOCKS space id 5623 page no 21 n bits 224 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 RECORD LOCKS space id 5623 page no 29 n bits 440 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 RECORD LOCKS space id 5623 page no 31 n bits 440 index `ClientID` of table `production`.`Email_Sent` trx id 5722EE5D lock mode S
 TOO LOCKS PRINTED FOR THIS TRX: SUPPRESSING FURTHER PRINTS
We found the main difference in MySQL configuration between Master and Slave was amount of memory allocated to innodb_buffer_pool_size because Slave was running several MySQL instances that were used for backup purposes. We need to increase the memory of innodb_buffer_pool_size variable where the locks table can fit in the configuration file and restart the database. This should fix the problem. InnoDB uses buffer pool to cache data and indexes of its tables, but it also uses it to store system information, like the locks table. The query set locks on all the records it scans. If we have a small buffer pool size, then the InnoDB lock table may indeed grow so big that it does not fit in the buffer pool. The locks table size for the 677 million rows was 327M according to the InnoDB status. So, we have increased the innodb_buffer_pool_size from 500MB to 1GB which was enough to hold all the locks and then restarted the database server. Now the SQL query completed successfully on the slave. More details on the InnoDB row level locks and the size it takes can be found here.

No Comments Yet

Let us know what you think

Subscribe by email