Pythian Blog: Technical Track

Expanding the Couchbase Collector for Diamond

The code

For the impatient, the Couchbase collector can be found on GitHub: Couchbase Collector. Follow the instructions in the README file to install it under your Diamond setup.

Intro

If you have been involved with metrics collection at any point, you may have heard of Brightcove's Diamond. Diamond is a real gem for metrics collection. Thanks to its modular architecture, it can collect metrics from a wide range of operating system and software components. It can also ship those metrics to a diverse range of trending software, including Graphite, RRD, or anything that supports StatsD.

While recently working with Couchbase, I needed to collect and ship metrics using Diamond, and a GitHub project doing exactly that was brought to my attention. Unfortunately the author, zooldk, has only one entry in the commit history, listed as "Initial skeleton of collector", and the only statistic currently collected is itemCount from basicStats. Luckily the Python code is simple and straightforward, so I went ahead and extended it. First, let's have an overview of the metrics you can expect to see in Graphite after installing the collector.

What did we achieve?

The goal is to graph as many of the useful Couchbase metrics as possible. After installing the extended Couchbase Collector, this is what we can expect to see in Graphite:

[Image: Graphite_Couchbase_Tree]

Here is a plot of the memory used by Couchbase on my (memory-starved) VM:

[Image: Graphite_basicstats_memused]

A bit of theory: (Data) Buckets and Couchbase cluster metrics

Couchbase is a NoSQL database that stores documents as JSON. It is highly scalable, and creating a cluster is very easy. To work on extending the metrics collector mentioned above, I installed Couchbase Server, Community Edition, on two VMs with the IP addresses 192.168.60.100 and 192.168.60.101. I mostly used the default parameters in the setup and installed both demo databases, "beer-sample" and "gamesim-sample". My Couchbase user interface now looks like this:

[Image: couchbase_basic_installation]

Metrics in Couchbase

Collecting metrics from Couchbase buckets is as simple as executing a GET request against the bucket's REST endpoint, for example https://192.168.60.100:8091/pools/default/buckets/beer-sample:
$ curl -s https://192.168.60.100:8091/pools/default/buckets/beer-sample
 {"name":"beer-sample","bucketType":"membase","authType":"sasl","saslPassword":"","proxyPort":0,"replicaIndex":false,"uri":"/pools/default/buckets/beer-sample?bucket_uuid=3a088dd60672ce16aea01c738ec96928","streamingUri":"/pools/default/bucketsStreaming/beer-sample?bucket_uuid=3a088dd60672ce16aea01c738ec96928","localRandomKeyUri":"/pools/default/buckets/beer-sample/localRandomKey","controllers":{"compactAll":"/pools/default/buckets/beer-sample/controller/compactBucket","compactDB":"/pools/default/buckets/default/controller/compactDatabases","purgeDeletes":"/pools/default/buckets/beer-sample/controller/unsafePurgeBucket","startRecovery":"/pools/default/buckets/beer-sample/controller/startRecovery"},"nodes":[{"couchApiBase":"https://192.168.60.100:8092/beer-sample","systemStats":{"cpu_utilization_rate":16.831683168316832,"swap_total":855629824,"swap_used":112218112,"mem_total":1968685056,"mem_free":934641664},"interestingStats":{"cmd_get":0.0,"couch_docs_actual_disk_size":138325417,"couch_docs_data_size":137479323,"couch_views_actual_disk_size":637700,"couch_views_data_size":616830,"curr_items":7888,"curr_items_tot":7889,"ep_bg_fetched":0.0,"get_hits":0.0,"mem_used":99496472,"ops":0.0,"vb_replica_curr_items":1},"uptime":"352954","memoryTotal":1968685056,"memoryFree":934641664,"mcdMemoryReserved":1501,"mcdMemoryAllocated":1501,"replication":0.0,"clusterMembership":"active","status":"healthy","otpNode":"ns_1@192.168.60.100","thisNode":true,"hostname":"192.168.60.100:8091","clusterCompatibility":131072,"version":"2.2.0-837-rel-community","os":"x86_64-unknown-linux-gnu","ports":{"proxy":11211,"direct":11210}}],"stats":{"uri":"/pools/default/buckets/beer-sample/stats","directoryURI":"/pools/default/buckets/beer-sample/statsDirectory","nodeStatsListURI":"/pools/default/buckets/beer-sample/nodes"},"ddocs":{"uri":"/pools/default/buckets/beer-sample/ddocs"},"nodeLocator":"vbucket","fastWarmupSettings":false,"autoCompactionSettings":false,"uuid":"3a088dd60672ce16aea01c738ec96928","vBucketServerMap":{"hashAlgorithm":"CRC","numReplicas":1,"serverList":["192.168.60.100:11210"],"vBucketMap":...},"replicaNumber":1,"threadsNumber":3,"quota":{"ram":104857600,"rawRAM":104857600},"basicStats":{"quotaPercentUsed":33.76667785644531,"opsPerSec":0.0,"diskFetches":0.0,"itemCount":7303,"diskUsed":50731634,"dataUsed":49454080,"memUsed":35406928},"bucketCapabilitiesVer":"","bucketCapabilities":["touch","couchapi"]}
Now this is not very readable, so let's reformat it with Python's json.tool module. I am only pasting the output that is useful for metric collection.
$ curl -s https://192.168.60.100:8091/pools/default/buckets/beer-sample | python -mjson.tool
{
    ...
    "basicStats": {
        "dataUsed": 49454080,
        "diskFetches": 0.0,
        "diskUsed": 50731634,
        "itemCount": 7303,
        "memUsed": 35406928,
        "opsPerSec": 0.0,
        "quotaPercentUsed": 33.76667785644531
    },
    "name": "beer-sample",
    "nodes": [
        {
            "clusterCompatibility": 131072,
            "clusterMembership": "active",
            "couchApiBase": "https://192.168.60.100:8092/beer-sample",
            "hostname": "192.168.60.100:8091",
            "interestingStats": {
                "cmd_get": 0.0,
                "couch_docs_actual_disk_size": 138325417,
                "couch_docs_data_size": 137479323,
                "couch_views_actual_disk_size": 637700,
                "couch_views_data_size": 616830,
                "curr_items": 7888,
                "curr_items_tot": 7889,
                "ep_bg_fetched": 0.0,
                "get_hits": 0.0,
                "mem_used": 99496472,
                "ops": 0.0,
                "vb_replica_curr_items": 1
            },
            "mcdMemoryAllocated": 1501,
            "mcdMemoryReserved": 1501,
            "memoryFree": 932651008,
            "memoryTotal": 1968685056,
            "os": "x86_64-unknown-linux-gnu",
            "otpNode": "ns_1@192.168.60.100",
            "ports": {
                "direct": 11210,
                "proxy": 11211
            },
            "replication": 0.0,
            "status": "healthy",
            "systemStats": {
                "cpu_utilization_rate": 18.0,
                "mem_free": 932651008,
                "mem_total": 1968685056,
                "swap_total": 855629824,
                "swap_used": 112218112
            },
            "thisNode": true,
            "uptime": "353144",
            "version": "2.2.0-837-rel-community"
        }
    ],
    "quota": {
        "ram": 104857600,
        "rawRAM": 104857600
    },
    ...
}
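The same request can also be made from Python instead of curl. The following is a minimal sketch in Python 3 using only the standard library. The helper names build_bucket_url, fetch_bucket and pretty are made up for illustration, and sending HTTP Basic credentials is an assumption about how a password-protected setup would be queried (the curl call above sends none).

```python
import base64
import json
import urllib.request


def build_bucket_url(host, port, bucket, scheme="https"):
    """Build the bucket REST endpoint used in the curl example above."""
    return "%s://%s:%d/pools/default/buckets/%s" % (scheme, host, port, bucket)


def fetch_bucket(url, username=None, password=None, timeout=10):
    """GET the bucket document and return it as a parsed dict.

    Credentials are optional. When given, they are sent as HTTP Basic
    auth (an assumption; the curl call in the article sends none).
    """
    request = urllib.request.Request(url)
    if username is not None:
        raw = "%s:%s" % (username, password or "")
        token = base64.b64encode(raw.encode("utf-8")).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pretty(document):
    """Render a parsed document with sorted keys and 4-space indentation,
    similar to what `python -mjson.tool` printed above."""
    return json.dumps(document, indent=4, sort_keys=True)
```

With these helpers, print(pretty(fetch_bucket(build_bucket_url("192.168.60.100", 8091, "beer-sample")))) should produce a listing much like the one above, assuming the node is reachable from where the script runs.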
So which statistics are interesting to collect? The basicStats object sounds like a good candidate, as it contains the keys 'diskUsed', 'memUsed', 'diskFetches', 'quotaPercentUsed', 'opsPerSec', 'dataUsed' and 'itemCount'. All of those are good values to graph, so we will collect them. Then there is the quota object, whose ram value is useful to graph as well, so we keep that too.

Finally there is nodes, which is an array because it holds statistics for each node of the cluster that serves the bucket. If the bucket does not span more than one node, the array has a single entry. In my setup, the gamesim-sample bucket spans two virtual machines, so 'nodes' contains two items, one per VM. The keys of each member of the nodes array are shown side by side below (note that this is for the gamesim-sample bucket):
nodes[0]              nodes[1]
====================  ====================
clusterCompatibility  clusterCompatibility
clusterMembership     clusterMembership
couchApiBase          couchApiBase
hostname              hostname
interestingStats      interestingStats
mcdMemoryAllocated    mcdMemoryAllocated
mcdMemoryReserved     mcdMemoryReserved
memoryFree            memoryFree
memoryTotal           memoryTotal
os                    os
otpNode               otpNode
ports                 ports
replication           replication
status                status
systemStats           systemStats
                      thisNode
uptime                uptime
version               version
thisNode is a boolean that tells us which array member corresponds to the machine we are querying. In this case I got the stats from https://192.168.60.100:8091/pools/default/buckets/gamesim-sample:
>>> data['nodes'][1]['thisNode']
True
To determine exactly which stats refer to which node, the couchApiBase key gives more detail:
>>> data['nodes'][1]['couchApiBase']
u'https://192.168.60.100:8092/gamesim-sample'
>>> data['nodes'][0]['couchApiBase']
u'https://192.168.60.101:8092/gamesim-sample'
This further confirms that nodes[0] refers to my second VM (192.168.60.101) and nodes[1] to the first.
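Putting the pieces together, the selection logic described above can be sketched as follows. This is an illustration in Python 3, not the collector's actual code: the function names and the dotted metric naming are assumptions, and so is the choice to include systemStats. The sample document is hand-built; its values are copied from the beer-sample output shown earlier, and the two-node layout only mimics the gamesim-sample case.

```python
def local_node(bucket):
    """Return the entry of bucket['nodes'] flagged with thisNode, or None."""
    for node in bucket.get("nodes", []):
        # Entries for other cluster members do not carry the thisNode key.
        if node.get("thisNode"):
            return node
    return None


def flatten_metrics(bucket):
    """Collect the values worth graphing into a {dotted.name: value} dict."""
    metrics = {}
    for key, value in bucket.get("basicStats", {}).items():
        metrics["basicStats." + key] = value
    if "ram" in bucket.get("quota", {}):
        metrics["quota.ram"] = bucket["quota"]["ram"]
    node = local_node(bucket)
    if node is not None:
        # Only the node being polled contributes node-level statistics.
        for section in ("interestingStats", "systemStats"):
            for key, value in node.get(section, {}).items():
                metrics["nodes.%s.%s" % (section, key)] = value
    return metrics


# Hand-built sample (illustrative): first entry is the remote node,
# second entry is the node answering the request.
sample = {
    "basicStats": {"itemCount": 7303, "memUsed": 35406928, "opsPerSec": 0.0},
    "quota": {"ram": 104857600, "rawRAM": 104857600},
    "nodes": [
        {"hostname": "192.168.60.101:8091"},
        {
            "hostname": "192.168.60.100:8091",
            "thisNode": True,
            "interestingStats": {"curr_items": 7888, "mem_used": 99496472},
            "systemStats": {"mem_total": 1968685056, "swap_used": 112218112},
        },
    ],
}

metrics = flatten_metrics(sample)
for name in sorted(metrics):
    print(name, metrics[name])
```

Keeping only the thisNode entry means each node reports its own statistics once, which fits a setup where every cluster member runs its own Diamond collector.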

Installing/Configuring the Couchbase collector on Diamond

Get the Couchbase Collector and copy it to:
/usr/share/diamond/collectors/couchbase_collector/couchbase_collector.py
Edit the Python file couchbase_collector.py and enter your IP, port, name of the data bucket (the 'path' key), username and password; mine looks like this:
...
class CouchBaseCollector(diamond.collector.Collector):

    def get_default_config(self):
        config = super(CouchBaseCollector, self).get_default_config()
        config.update({
            'host': 'localhost',
            'port': 8091,
            'path': 'beer-sample',
            'username': 'Administrator',
            'password': 'obfuscated'
        })
        return config
...
You will also need to create a config file at:
/etc/diamond/collectors/CouchBaseCollector.conf
with the contents:
$ cat CouchBaseCollector.conf
enabled = True
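Diamond normally merges the settings from a collector's .conf file over the values returned by get_default_config, so it may be possible to leave the Python file untouched and put the connection details in the config file instead. The fragment below is an untested sketch under that assumption. The key names simply mirror the defaults above, and the host value is the first VM from this setup.

```
# /etc/diamond/collectors/CouchBaseCollector.conf
# Assumption: the collector reads these keys from self.config
enabled = True
host = 192.168.60.100
port = 8091
path = beer-sample
username = Administrator
password = obfuscated
```

If the collector ignores a key set this way, editing the defaults in couchbase_collector.py as described above remains the safe route.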

Cluster Metrics

The collector is smart enough to present only the node statistics that apply to the node it polls. In a clustered Couchbase environment, every node runs a Diamond collector of its own. This is how Graphite presents the two nodes of the cluster, corresponding to my two VMs:

[Image: Graphite_Cluster_stats]
