Pythian Blog: Technical Track

How to find unused indexes and drop them safely in MongoDB

Indexes increase read performance and allow queries to access data faster. But they also need to be updated when the indexed keys are created or modified, which consumes additional time to complete C*UD operations. In this blog post, we will discuss how to find unused indexes and safely remove them.

Querying index usage stats

In version 3.2, MongoDB introduced new instrumentation that tracks index usage. $indexStat is the new aggregation stage which gives information on index usage stats. Let's take a look at the sample output below.

MongoDB Enterprise > db.idxsts.aggregate({$indexStats:{}}).pretty()
{
  "name" : "name_1_grade_1",
  "key" : {
  "name" : 1,
  "grade" : 1
  },
  "host" : "DarshanMBP:30008",
  "accesses" : {
  "ops" : NumberLong(1),
  "since" : ISODate("2018-07-25T10:08:06.109Z")
  }
 }
 {
  "name" : "name_1_slno_1",
  "key" : {
  "name" : 1,
  "slno" : 1
  },
  "host" : "DarshanMBP:30008",
  "accesses" : {
  "ops" : NumberLong(4),
  "since" : ISODate("2018-07-25T10:12:59.106Z")
  }
 }
 {
  "name" : "_id_",
  "key" : {
  "_id" : 1
  },
  "host" : "DarshanMBP:30008",
  "accesses" : {
  "ops" : NumberLong(0),
  "since" : ISODate("2018-07-25T09:29:00.245Z")
  }
 }
 MongoDB Enterprise > 

We can see that there are three indexes for the idxsts collection, Let's dive deeper into the output. Consider name_1_slno_1 Index

{  "name" : "name_1_slno_1",  "key" : {  "name" : 1,  "slno" : 1  },  "host" : "DarshanMBP:30008",  "accesses" : {  "ops" : NumberLong(4),  "since" : ISODate("2018-07-25T10:12:59.106Z")  } }

The first part of the document shows basic information about the index and the MongoDB server. The key we are interested in is accesses, which shows how many times the index was used since the server was started. The since field tells you when MongoDB started gathering statistics about this index. If accesses.ops is 0, it means that the index was never used since MongoDB was restarted or since the index was created.

Source: https://docs.mongodb.com/manual/reference/operator/aggregation/indexStats/

Retrieving unused indexes for all databases

As $indexStats is an aggregation stage, it needs to be executed individually for every collection. The script below will reduce the tediousness.

db.getMongo().getDBNames().forEach(function (dbname) {  if (dbname != "admin") {  db.getSiblingDB(dbname).getCollectionNames().forEach(function (cname) {  output = db.getSiblingDB(dbname)[cname].aggregate({$indexStats:{} });  output.forEach(function(findUnused) {  if (findUnused.accesses.ops == 0 && findUnused.name != "_id_") {  print(dbname + " \t" + cname + " \t" + JSON.stringify(findUnused) );  }  })  }) }}) 

The following output is returned:

mjour  marks  {"name":"grade_1_name_-1","key":{"grade":1,"name":-1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:25:59.949Z"}}
 mjour  marks  {"name":"grade_1","key":{"grade":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:25:53.414Z"}}
 mjour  marks  {"name":"std_id_1_name_-1","key":{"std_id":1,"name":-1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:26:09.201Z"}}
 mjour  petrol  {"name":"shop_id_1","key":{"shop_id":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:26:35.958Z"}}
 njour  idxsts  {"name":"grade_1","key":{"grade":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:33:45.822Z"}}
 

The script will query the index stats for each collection in each database and report those which were never used (accesses.ops = 0). Keep in mind that the uptime should be long enough to have all different queries executed at least once. Now that we have all unused indexes, we can look at dropping them.

Dropping unused indexes

Before you drop the indexes make sure that:

1. The instance was running long enough for the applications to run all possible queries and the query pattern no longer requires an index. Uptime can be easily retrieved using db.serverStatus().uptime.

2. You have a backup of all affected collections.

3. Indexes are also not being used on SECONDARY nodes.

4. A write lock will be acquired for the database the index belongs to, blocking all operations.

Dropping an index in a production environment can be done using one of the 2 methods below:

a. Drop the index on the primary, which will replicate overall secondaries. This is only recommended for very small indexes and non-busy databases.

b. Drop

No Comments Yet

Let us know what you think

Subscribe by email