Pythian Blog: Technical Track

Find and remove unused & duplicate indexes in MySQL

Overview:

It is a general rule of thumb, that the more indexes you have on a table, the slower the INSERTs, UPDATEs, and DELETEs operations become.

Indexes in MySQL—or for that matter, any database—are never static. Each time you modify a table, be it an update, delete or insert operation, the relevant indexes are updated by MySQL. If they are not, they will be useless in the next search query that would need them. This operation causes an extra overhead on the system, especially when those indexes are not used by the tables, but are still being updated every time.

Therefore, adding indexes shouldn’t be taken lightly, as it is a performance trade-off that must be handled very carefully. The more indexes you add to a table, the slower INSERT, UPDATE and DELETE statements become, but on the other hand, SELECT operations are optimized using those indexes. So this needs to be balanced very carefully.

This blog aims to help you find and remove unused, redundant or duplicate indexes from your MySQL database and enhance the schema and overall performance of the system.

 

Unused Indexes

MySQL keeps track of every index usage as part of its PERFORMANCE SCHEMA. This data can be fetched using the schema_unused_indexes view, which will provide a list of those indexes for which there have been no events, meaning these indexes might be redundant and unused for a while, to be specific, unused since the last restart of the MySQL database server. For e.g check the last MySQL server restart using ‘\s’, and if that period is long, the indexes you get using the below query have been unused for that many days. The longer the time since the restart, the more reliable these results are.

There are different ways to determine unused indexes. 

  1. The first is to use the sys.schema_unused_indexes table.

This is a view that displays indexes for which there are no events, which indicates that they are not being used since the last server restart. 

The schema_unused_indexes view has these columns:

  • object_schema – The schema name
  • object_name – The table name
  • index_name – The unused index name
select * from sys.schema_unused_indexes where index_name not like 'fk_%'  and object_schema not in ( 'performance_schema', 'mysql' , 'information_schema');
+---------------+-------------+------------+

| object_schema | object_name | index_name | 

+---------------+-------------+------------+

| mydb          | age         | age_idx    |

| mydb          | name        | name_idx   |

| test          | p_id        | p_idx      |

+---------------+-------------+------------+

    2. Another approach for finding unused indexes would be to use Percona tool pt-index-usage

Percona Toolkit has a tool that checks the usage of our index from the slow query log. Pt-index-usage reads the slow query log and executes every query with EXPLAIN to ask MySQL which indexes it would use. At the end of the process, it will provide you with an output with a list of unused indexes. Please note that the slow query log needs to be enabled on your server, before using this command.

pt-index-usage /path/to/slow.log --host localhost

In the case of unique indexes listed in this output, some precautions and confirmation need to be taken, because those indexes may not be used for index lookups. Still, your application may need them to avoid duplicates in the column. Always make sure before dropping such unique indexes.

 

Duplicate Indexes

Another factor related to indexes that hampers database performance is duplicate indexes.

Having duplicate keys in your schemas can hurt the performance of the database, as they make the optimizer slower because MySQL needs to examine more query plans. Also, the storage engine needs to maintain, calculate and update more index statistics as mentioned earlier in the case of unused indexes.

  1. The first approach to finding duplicate indexes is using INFORMATION_SCHEMA.STATISTICS table as below:
SELECT s.INDEXED_COL,GROUP_CONCAT(INDEX_NAME) FROM (SELECT INDEX_NAME, GROUP_CONCAT (CONCAT (TABLE_NAME,'.',COLUMN_NAME) ORDER BY CONCAT (SEQ_IN_INDEX,'COLUMN_NAME')) 'INDEXED_COL' FROM INFORMATION_SCHEMA.STATISTICS GROUP BY INDEX_NAME)as s GROUP BY INDEXED_COL HAVING COUNT(1)>1;

     2. Another method is using Percona tool pt-duplicate-key-checker

This tool examines MySQL tables for duplicate or redundant indexes and foreign keys. Connection options are read from MySQL option files.

pt-duplicate-key-checker --host hostname

This tool gives us the reason why the keys are duplicated and provides us with the SQL command to resolve the issue, e.g as below:

ALTER TABLE db.table DROP INDEX index_name;

How to remove or drop these indexes:

Once the list of unused/duplicate indexes has been received, and it is confirmed that those are good to be dropped, they can be dropped using the below command:

DROP INDEX ‘index_name’ on table;

Please note that the drop index statement is mapped to an ALTER TABLE statement to drop the index. The same can be done using the below command as well:

ALTER TABLE table_name DROP INDEX index_name;

Starting with MySQL 5.6, dropping an index can be done using online DDL operations.

Of course, you will not need to drop all such indexes, but these results and approaches always give you a fair idea of which indexes need evaluation.

Conclusion

In a gist, indexes are good for our queries but too many indexes or duplicate or redundant indexes can surely affect the performance of a system. It is always advisable to check and review the schemas periodically to catch such duplicate/unused indexes, which will help to maintain better overall system performance. 

Also before dropping any indexes on the production environment, it is highly recommended to test them on a testing environment and measure the performance of your queries.

No Comments Yet

Let us know what you think

Subscribe by email