Pythian Blog: Technical Track

Orientation to Cassandra Nodetool

Nodetool is a broadly useful tool for managing Cassandra clusters. Many questions about a Cassandra cluster can be answered with a single nodetool command. Because nodetool was developed over time by a diverse open source community, its commands can seem at first glance to follow only a minimally consistent syntax. On closer inspection, the individual commands can be organized into several overlapping buckets.

The first grouping consists of commands to view (get) or change (set) configuration variables. An example pair is getlogginglevels and setlogginglevel. By default, logging is set to INFO, midway in the available range of ALL, TRACE, DEBUG, INFO, WARN, ERROR and OFF. Running nodetool getlogginglevels displays the currently set levels.

Some settings with get/set commands (sometimes named enable/disable instead) can be changed either at startup or while Cassandra is running. For example, incremental backups can be enabled in the startup configuration file cassandra.yaml by setting incremental_backups: true. Alternatively, they can be started or stopped on a running node with the commands nodetool enablebackup and nodetool disablebackup. In general, though, most configuration values are set either in startup configuration files or dynamically using nodetool; there is little overlap.

Several nodetool commands give insight into the status of the Cassandra node, the cluster, or even the data. Two very basic informational commands are nodetool status and nodetool info. Nodetool status provides a brief summary of each node's state (up, down, joining the cluster, etc.), IP address and datacenter. Nodetool info provides a longer listing of key status variables for a single node; it is a convenient way to see memory utilization, for example.

Although the tool is named nodetool, not all commands apply to individual nodes. For example, nodetool describecluster provides information about the cluster: its name, snitch and partitioner type, and schema versions.
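
The get/set and enable/disable pairs described above, along with the node state shown by nodetool status, can be illustrated with a minimal Python sketch. It only builds command lines and decodes the two-letter code (for example UN) that begins each node row of nodetool status. The helper names are hypothetical, the logger name passed in is whatever class or package you want to adjust, and the state letters are an assumption based on the legend nodetool status prints in commonly deployed versions; check your own version's output before relying on them.

```python
import subprocess

# Log levels as listed in the text, most verbose first.
LOG_LEVELS = ["ALL", "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "OFF"]

# Letters of the two-letter code that begins each node row of `nodetool status`
# (assumed from the legend that the command prints above the table).
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def set_logging_level_cmd(logger_name, level):
    """Build the argv for `nodetool setlogginglevel <logger> <level>`."""
    level = level.upper()
    if level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level}")
    return ["nodetool", "setlogginglevel", logger_name, level]


def backup_cmd(enabled):
    """Pick the enable or disable half of the incremental backup pair."""
    return ["nodetool", "enablebackup" if enabled else "disablebackup"]


def decode_status_code(code):
    """Translate a code such as 'UN' into ('Up', 'Normal')."""
    if len(code) != 2 or code[0] not in STATUS or code[1] not in STATE:
        raise ValueError(f"unrecognized status code: {code}")
    return STATUS[code[0]], STATE[code[1]]


def run(cmd):
    """Run a command on a machine where nodetool is on PATH; return stdout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

On a Cassandra node, something like run(["nodetool", "getlogginglevels"]) would return the current levels as text, and run(backup_cmd(True)) would turn incremental backups on for the running process.
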
As another example, nodetool netstats provides information about communication among nodes.

Nodetool is not limited to basic configuration and information; it is also a powerful tool for cluster operations and data management. Operational tasks such as shutting down a node within a cluster or doing maintenance on a live node are made easier with commands like nodetool drain (flushes writes from memory to disk and stops accepting connections, so that the commitlog does not need to be replayed on restart) and nodetool disablegossip (stops the node from gossiping, so the rest of the cluster treats it as down). Data management tasks are made easier with commands like nodetool repair, which syncs data among nodes (perhaps after missed writes across the cluster), and nodetool garbagecollect, which removes deleted data.

Now that I have provided an orientation to nodetool, in future posts I will describe how to combine various information, set/get and management commands to do common tasks such as backups, performance tuning and upgrades.
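
As a rough illustration of the operations and data management commands mentioned above, the following Python sketch assembles nodetool command lines for one node and prints them, running them only when asked. The function names, the host address and the keyspace name are hypothetical, and the order shown (disablegossip, then drain) is an assumption about one reasonable maintenance sequence, not a prescribed procedure.

```python
import shlex
import subprocess


def nodetool(host, *args):
    """Build a nodetool argv aimed at one node (-h selects the host)."""
    return ["nodetool", "-h", host, *args]


def maintenance_plan(host):
    """Commands to quiesce a node before maintenance (assumed order)."""
    return [
        nodetool(host, "disablegossip"),  # rest of the cluster sees the node as down
        nodetool(host, "drain"),          # flush memtables, stop accepting connections
    ]


def repair_cmd(host, keyspace):
    """Sync one keyspace's data on this node with its replicas."""
    return nodetool(host, "repair", keyspace)


def garbagecollect_cmd(host, keyspace):
    """Remove deleted data for one keyspace from this node's SSTables."""
    return nodetool(host, "garbagecollect", keyspace)


def execute(plan, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return len(plan)
```

Calling execute(maintenance_plan("10.0.0.5")) prints the two commands without touching the cluster; passing dry_run=False would run them, so the default is deliberately the safe one.
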
