Any experienced sysadmin would tell you that configuring and deploying servers is simple compared to Ops. Thankfully, with a Monitoring System, you can let the servers do the hard work for you. Without one, you would be sailing your infra blindly, wondering when the next iceberg will show up.

Nagios is the grandfather of monitoring solutions. There is no doubt that Nagios has transformed the monitoring landscape since its inception in 1999. But! (Here comes the twist in any good tale) Rumour has it that using it results in massive hair loss.

Disclaimer: I have never operated Nagios, which explains my healthy hair. I did, however, attempt to set it up using the provided script and it was a royal PITA. Today, we have a good number of alternatives to Nagios that fulfil your monitoring needs without causing sysadmin depression.

In this walkthrough, we explore Sensu, the monitoring server, paired with the metrics collectors Graphite and Carbon. On the frontend, we have Grafana to plot performance charts and Uchiwa to report on Sensu’s activity.

Sensu can operate in 2 ways - Subscription or Standalone - or both. In Subscription mode, the Sensu server publishes check requests to subscribed clients at regular intervals, enabling the chronological correlation of checks and metrics. This is likely what you want.

In the alternative Standalone mode, clients schedule their own checks and push the results to the server. This would be useful when an app sends event-based ad-hoc metrics to the monitoring server.
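
To make the difference concrete, here is a minimal Python sketch that builds both kinds of check definition. It assumes Sensu’s subscribers and standalone check attributes; the helper make_check is made up for this example and is not part of Sensu.

```python
import json


def make_check(command, interval, subscribers=None):
    """Build a Sensu check definition as a plain dict.

    With subscribers, the server schedules the check (Subscription mode).
    Without them, the check is marked standalone and the client is
    expected to schedule it on its own.
    """
    check = {"command": command, "interval": interval}
    if subscribers:
        check["subscribers"] = list(subscribers)
    else:
        check["standalone"] = True
    return check


# Same check, once per mode.
subscription = {"checks": {"check-cpu": make_check("check-cpu.rb", 5, ["dev"])}}
standalone = {"checks": {"check-cpu": make_check("check-cpu.rb", 5)}}

print(json.dumps(subscription, indent=2))
print(json.dumps(standalone, indent=2))
```

A subscription check lives in the server’s config, while a standalone check would go into the config of the client that runs it.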

In this article, we describe the process to set up Subscription-based checks and metrics. Take note that the terms Server and VM will be used interchangeably to refer to both physical and virtual infra.

Architecture

The Sensu dev team has described Sensu’s architecture in detail. With that as the basis, I add Uchiwa, Graphite, Carbon, and Grafana to the picture.

First off, Uchiwa. I’m reminded of Hachikō, but it’s actually the name of a Japanese fan. Sensu is a Japanese folding fan. But I digress… Uchiwa is Sensu’s snazzy dashboard that polls sensu-api’s REST interface.

Uchiwa’s dashboard
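
To get a feel for what Uchiwa does behind the scenes, the sketch below queries sensu-api’s /clients endpoint with Python’s standard library. The address 127.0.0.1:4567 matches the API config used later in this walkthrough. The helper names are made up for this example, and it assumes the API is reachable without authentication.

```python
import json
from urllib.request import urlopen


def client_names(clients_json):
    """Return the sorted client names from a /clients response body."""
    return sorted(client["name"] for client in json.loads(clients_json))


def fetch_client_names(base_url="http://127.0.0.1:4567", timeout=5):
    """Ask sensu-api for its registered clients, much like a dashboard would."""
    with urlopen(base_url + "/clients", timeout=timeout) as response:
        return client_names(response.read().decode("utf-8"))
```

Once the server and a client are up, calling fetch_client_names() on the monitoring server should return the names set in each client.json.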

Graphite, Carbon, and Whisper are 3 sisters that work hand-in-hand. In most writeups, Carbon refers to the metrics-collector daemon carbon-cache (there’s also carbon-aggregator and carbon-relay). Carbon stores the metrics in a fixed-size database called Whisper. Graphite is the bare-bones frontend that charts the metrics. It is OK, but the interface is dated. Grafana is the modern and responsive frontend that brings back the joy in charting. It does this by polling Graphite’s API at intervals.

Grafana’s dashboard
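
As a rough idea of what such a poll looks like, this Python sketch builds a URL for Graphite’s render API that asks for the last hour of a metric as JSON. The metric path sensu-client.cpu.user is hypothetical, since the real path depends on the client name and the metrics plugin, and render_url is a helper made up for this example.

```python
from urllib.parse import urlencode


def render_url(base_url, target, since="-1h"):
    """Build a Graphite render API URL that returns datapoints as JSON."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return base_url.rstrip("/") + "/render?" + query


# Hypothetical metric path, for illustration only.
print(render_url("http://127.0.0.1", "sensu-client.cpu.user"))
```

Opening the printed URL on the monitoring server, once metrics are flowing, is a quick way to check Graphite without going through Grafana.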

With all the parts in place, the network topology would look like this:

  1. sensu-server -> carbon:2003
  2. Internet -> apache2:443 -> grafana:3000 -> apache2:80 -> graphite-web
  3. Internet -> apache2:443 -> uchiwa:3001 -> sensu-api:4567

The installation steps have been tested on Ubuntu 14.04. They also assume that all of the monitoring server’s dependencies - Redis, RabbitMQ, Sensu Server, Uchiwa, Graphite/Carbon, Grafana - reside on a single host.

The Server is the host that collects the data, while the Client is the host that reports its checks and metrics.

Putting the Pieces Together

The guide was pieced together with help from Sensu’s guide, DigitalOcean, and Monitor Everything.

The installation and configuration walkthrough is organised as follows:

  1. Graphite and Carbon: The Metrics Collector
  2. Redis: Sensu’s Storage Engine
  3. RabbitMQ: Sensu’s Transport
  4. Sensu Server: The Monitoring Solution
  5. Sensu Client
  6. Grafana: Chart the Performance Metrics
  7. Uchiwa: A Dashboard for Sensu
  8. Explore Uchiwa and Grafana
  9. Troubleshooting

Graphite and Carbon: The Metrics Collector

  1. Install Graphite and Carbon. PostgreSQL is the recommended database for Graphite but since we’re using Grafana as the frontend, the default SQLite database will do.

     sudo apt update
     sudo apt install -y graphite-carbon graphite-web apache2 libapache2-mod-wsgi
    
  2. Edit /etc/default/graphite-carbon to enable Carbon at startup.

     # Change to true, to enable carbon-cache on boot
     CARBON_CACHE_ENABLED=true
    
  3. Start carbon-cache.

     sudo service graphite-carbon start
    
  4. Configure Graphite’s basic parameters in /etc/graphite/local_settings.py.

     SECRET_KEY = 'salted_peanuts_hashed_browns'
     TIME_ZONE = 'Asia/Singapore'
    
  5. Prepare Graphite’s SQLite database.

     sudo -u _graphite graphite-manage syncdb --noinput
    
  6. Set up a VirtualHost in Apache to front Graphite.

     sudo cp /usr/share/graphite-web/apache2-graphite.conf /etc/apache2/sites-available/graphite.conf
     sudo a2dissite 000-default
     sudo a2ensite graphite
     sudo service apache2 reload
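
With carbon-cache running, a quick smoke test is to push a metric by hand. The Python sketch below uses Carbon’s plaintext protocol, where each line is a metric path, a value, and a Unix timestamp. The port 2003 comes from the topology above; the metric name in the usage note and both helper names are made up for this example.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one metric as a Carbon plaintext line: '<path> <value> <timestamp>'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(path, value, host="127.0.0.1", port=2003, timeout=5):
    """Send a single metric to carbon-cache over TCP."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("ascii"))
```

Calling send_metric("test.walkthrough.value", 42) on the monitoring server should make the metric show up in Graphite’s tree shortly after.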
    

Redis: Sensu’s Storage Engine

  1. Install Redis Server.

     sudo apt update
     sudo apt install -y redis-server
    
  2. Verify that Redis works using redis-cli ping. If Redis replies PONG, you’re good to go. Otherwise, rinse, wash, and repeat.

  3. Uncomment ULIMIT in /etc/default/redis-server.

     ULIMIT=65536
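
If you would rather script the PONG check from step 2, here is a minimal Python sketch that does roughly what redis-cli ping does, using a raw socket. It assumes Redis listens on 127.0.0.1:6379 without a password; the helper names are made up for this example.

```python
import socket


def is_pong(reply):
    """Return True if the raw reply is the simple-string PONG."""
    return reply.strip() == b"+PONG"


def redis_ping(host="127.0.0.1", port=6379, timeout=3):
    """Send an inline PING command to Redis and check for PONG."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"PING\r\n")
        return is_pong(conn.recv(64))
```

redis_ping() returning True means Redis is up and answering on the default port.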
    

RabbitMQ: Sensu’s Transport

  1. Install Erlang.

     wget http://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb
     sudo dpkg -i erlang-solutions_1.0_all.deb
     sudo apt update
     sudo apt install -y erlang-nox=1:18.2
    
  2. Install RabbitMQ.

     wget http://www.rabbitmq.com/releases/rabbitmq-server/v3.6.0/rabbitmq-server_3.6.0-1_all.deb
     sudo dpkg -i rabbitmq-server_3.6.0-1_all.deb
    
  3. Create a RabbitMQ vhost and user for the clients to authenticate to the RabbitMQ server.

     sudo rabbitmqctl add_vhost /sensu
     sudo rabbitmqctl add_user sensu aGreatPassword
     sudo rabbitmqctl set_permissions -p /sensu sensu ".*" ".*" ".*"
    
  4. Increase the number of file handles in /etc/default/rabbitmq-server.

     ulimit -n 65536
    
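  5. Make sure RabbitMQ listens on all interfaces in /etc/rabbitmq/rabbitmq.config so that remote Sensu clients can reach it. RabbitMQ normally does this by default already, so the entry mainly makes the intent explicit.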

     [
       {rabbit,
         [
           {tcp_listeners, [{"0.0.0.0", 5672}]}
         ]
       }
     ].
    
  6. RabbitMQ starts immediately after installation, so restart it for the changes to take effect.

     sudo service rabbitmq-server restart
    

Sensu Server: The Monitoring Solution

  1. Install Sensu Core.

     wget -q http://sensu.global.ssl.fastly.net/apt/pubkey.gpg -O- | sudo apt-key add -
     echo "deb http://sensu.global.ssl.fastly.net/apt sensu main" | sudo tee /etc/apt/sources.list.d/sensu.list
     sudo apt update
     sudo apt install -y sensu
     sudo update-rc.d sensu-server defaults
     sudo update-rc.d sensu-api defaults
     sudo update-rc.d sensu-client defaults
    
  2. In /etc/sensu/conf.d/redis.json, add the config for Redis.

     {
       "redis": {
         "host": "127.0.0.1",
         "port": 6379
       }
     }
    
  3. In /etc/sensu/conf.d/rabbitmq.json, add the config for RabbitMQ.

     {
       "rabbitmq": {
         "host": "127.0.0.1",
         "port": 5672,
         "vhost": "/sensu",
         "user": "sensu",
         "password": "aGreatPassword"
       }
     }
    
  4. In /etc/sensu/conf.d/transport.json, specify RabbitMQ as the default transport.

     {
       "transport": {
         "name": "rabbitmq",
         "reconnect_on_error": true
       }
     }
    
  5. In /etc/sensu/conf.d/api.json, add the config for Sensu API.

     {
       "api": {
         "host": "localhost",
         "bind": "127.0.0.1",
         "port": 4567
       }
     }
    
  6. In /etc/sensu/conf.d/client.json, add the config for Sensu Client to monitor the monitoring server for monitor-inception. For more options, read this page.

     {
       "client": {
         "name": "sensu-server",
         "address": "sensu-server.com",
         "environment": "development",
         "subscriptions": [
           "dev",
           "ubuntu"
         ],
         "socket": {
           "bind": "127.0.0.1",
           "port": 3030
         }
       }
     }
    
  7. In /etc/sensu/conf.d/checks.json, configure the checks to conduct on subscribed hosts. This example enables CPU, memory, and disk usage checks on hosts subscribed to dev. Output of these checks is not relayed to Graphite because they are only for resource monitoring.

     {
       "checks": {
         "check-cpu": {
           "command": "check-cpu.rb",
           "interval": 5,
           "subscribers": ["dev"]
         },
         "check-memory": {
           "command": "check-memory.rb",
           "interval": 5,
           "subscribers": ["dev"]
         },
         "check-disk-usage": {
           "command": "check-disk-usage.rb",
           "interval": 60,
           "subscribers": ["dev"]
         }
       }
     }
    
  8. In /etc/sensu/conf.d/relay.json, set up the Graphite relay. This sends metrics to the carbon-cache listening at localhost:2003. only_check_output is Sensu’s built-in mutator that takes Sensu’s JSON event as input and outputs only the value of the output key. Without "mutator": "only_check_output", the entire JSON message would be sent.

     {
       "handlers": {
         "graphite_tcp": {
           "type": "tcp",
           "socket": {
             "host": "127.0.0.1",
             "port": 2003
           },
           "mutator": "only_check_output"
         }
       }
     }
    
  9. With the relay configured, edit /etc/sensu/conf.d/metrics.json for metrics collection. "type": "metric" is an important parameter that tells Sensu to pass every result to the handlers, even when the check status is OK. handlers points to graphite_tcp, which was configured in the previous step.

     {
       "checks": {
         "metrics-cpu": {
           "type": "metric",
           "command": "metrics-cpu-pcnt-usage.rb",
           "interval": 5,
           "subscribers": ["dev"],
           "handlers": ["graphite_tcp"]
         },
         "metrics-memory": {
           "type": "metric",
           "command": "metrics-memory-percent.rb",
           "interval": 5,
           "subscribers": ["dev"],
           "handlers": ["graphite_tcp"]
         }
       }
     }
    
  10. Install checks for the Sensu Client to work.

     sudo sensu-install -P cpu-checks,memory-checks,disk-checks
    
  11. Start Sensu Core.

     sudo service sensu-server start
     sudo service sensu-api start
     sudo service sensu-client start
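
To see why the mutator in steps 8 and 9 matters, here is a Python imitation of what only_check_output does to an event before the TCP handler ships it to Carbon. The real mutator is built into Sensu; the function below only mimics it, and the sample event is trimmed down and made up for this example.

```python
import json


def only_check_output(event_json):
    """Mimic the mutator: keep only the check's output from a Sensu event."""
    event = json.loads(event_json)
    return event["check"]["output"]


# A trimmed-down, made-up event; real events carry many more fields.
sample_event = json.dumps({
    "client": {"name": "sensu-client"},
    "check": {
        "name": "metrics-cpu",
        "status": 0,
        "output": "sensu-client.cpu.user 3.5 1450000000\n",
    },
})

print(only_check_output(sample_event), end="")
```

What comes out is a bare "path value timestamp" line, which is the format carbon-cache accepts on port 2003. The full JSON event would not be.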
    

Sensu Client

Sensu Client is the host that executes the checks issued by Sensu Server. For this exercise, this should be done in another VM.

  1. Install Sensu Core.

     wget -q http://sensu.global.ssl.fastly.net/apt/pubkey.gpg -O- | sudo apt-key add -
     echo "deb http://sensu.global.ssl.fastly.net/apt sensu main" | sudo tee /etc/apt/sources.list.d/sensu.list
     sudo apt update
     sudo apt install -y sensu
     sudo update-rc.d sensu-client defaults
    
  2. In /etc/sensu/conf.d/rabbitmq.json, add the config for the RabbitMQ transport.

     {
       "rabbitmq": {
         "host": "<Sensu_Server_IP>",
         "port": 5672,
         "vhost": "/sensu",
         "user": "sensu",
         "password": "aGreatPassword"
       }
     }
    
  3. In /etc/sensu/conf.d/transport.json, set RabbitMQ as the default transport.

     {
       "transport": {
         "name": "rabbitmq",
         "reconnect_on_error": true
       }
     }
    
  4. In /etc/sensu/conf.d/client.json, configure sensu-client.

     {
       "client": {
         "name": "sensu-client",
         "address": "sensu-client.mon",
         "environment": "development",
         "subscriptions": [
           "dev",
           "ubuntu"
         ],
         "socket": {
           "bind": "127.0.0.1",
           "port": 3030
         }
       }
     }
    
  5. Install checks to be executed by the host. A list of free Sensu plugins can be found here.

     sudo sensu-install -P cpu-checks,memory-checks,disk-checks
    
  6. If Sensu does not provide the checks plugin you need, you could write a full-blown Sensu plugin to perform checks but there’s a simpler solution. Simply wrap the message in JSON and write the message directly to the sensu-client socket. Here’s an example.

     echo '{"name": "app_01", "output": "could not connect to mysql", "status": 1}' > /dev/tcp/localhost/3030
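
The same trick works from inside an app. This minimal Python sketch builds the JSON result and writes it to the client socket configured in client.json (127.0.0.1:3030). The helper names are made up for this example, and limiting status to 0-3 (OK, WARNING, CRITICAL, UNKNOWN) is a choice made here to catch typos, not something the socket enforces.

```python
import json
import socket


def build_result(name, output, status):
    """Serialise a check result for the sensu-client socket."""
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    return json.dumps({"name": name, "output": output, "status": status})


def send_result(name, output, status, host="127.0.0.1", port=3030, timeout=3):
    """Write one check result to the local sensu-client socket."""
    payload = build_result(name, output, status)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload.encode("utf-8"))
```

For example, send_result("app_01", "could not connect to mysql", 1) sends the same warning as the echo command above.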
    

Grafana: Chart the Performance Metrics

Grafana polls Graphite for metrics and graphs them on a beautifully-designed dashboard.

  1. Install Elasticsearch.

     wget -q http://packages.elasticsearch.org/GPG-KEY-elasticsearch -O- | sudo apt-key add -
     echo "deb http://packages.elasticsearch.org/elasticsearch/1.0/debian stable main" | sudo tee /etc/apt/sources.list.d/elasticsearch.list
    
     sudo apt update
     sudo apt install -y elasticsearch openjdk-7-jre-headless
     sudo update-rc.d elasticsearch defaults
    
     sudo service elasticsearch start
    
  2. Install Grafana.

     wget -q https://packagecloud.io/gpg.key -O- | sudo apt-key add -
     echo "deb https://packagecloud.io/grafana/stable/debian/ wheezy main" | sudo tee /etc/apt/sources.list.d/grafana.list
    
     sudo apt update
     sudo apt install -y grafana
     sudo update-rc.d grafana-server defaults 95 10
     sudo service grafana-server start
    
  3. In /etc/apache2/sites-available/grafana.conf, configure Apache2 as a reverse proxy for Grafana. <grafana_fqdn> is the externally-resolvable Fully-Qualified Domain Name (FQDN) of Grafana’s URL. For example, grafana.mydomain.com.

     <VirtualHost *:443>
             ServerName <grafana_fqdn>
    
             SSLEngine On
             SSLCertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem
             SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key
    
             ProxyPass / http://127.0.0.1:3000/
             ProxyPassReverse / http://127.0.0.1:3000/
     </VirtualHost>
    
  4. Enable Apache’s proxy modules and restart it.

     sudo a2enmod proxy proxy_http ssl
     sudo a2ensite grafana
     sudo service apache2 restart
    

Uchiwa: A Dashboard for Sensu

Uchiwa is Sensu’s dashboard, available as a package from Sensu’s servers. If you have followed the steps above, the Sensu repository is already set up, so you can install Uchiwa without adding another one.

  1. Install Uchiwa.

     sudo apt install -y uchiwa
     sudo update-rc.d uchiwa defaults
    
  2. In /etc/sensu/uchiwa.json, configure Uchiwa to listen on port 3001 because Grafana is using port 3000.

     {
       "sensu": [
         {
           "name": "Site 1",
           "host": "127.0.0.1",
           "port": 4567
         }
       ],
       "uchiwa": {
         "host": "127.0.0.1",
         "port": 3001
       }
     }
    
  3. Start Uchiwa

     sudo service uchiwa start
    
  4. In /etc/apache2/sites-available/uchiwa.conf, configure Apache2 as a reverse proxy for Uchiwa. Similar to Grafana, fill in <uchiwa_fqdn> with Uchiwa’s FQDN.

     <VirtualHost *:443>
             ServerName <uchiwa_fqdn>
    
             SSLEngine On
             SSLCertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem
             SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key
    
             ProxyPass / http://127.0.0.1:3001/
             ProxyPassReverse / http://127.0.0.1:3001/
     </VirtualHost>
    
  5. Enable the Uchiwa site and reload Apache2.

     sudo a2ensite uchiwa
     sudo service apache2 reload
    

Explore Uchiwa and Grafana

With all of the services up and running, try accessing Uchiwa and Grafana at their respective URLs. Uchiwa is not password-protected by default, while Grafana’s default login is admin/admin.

Use this great guide to configure Grafana’s dashboard.

Troubleshooting

You probably brainfarted and missed a few steps, and that’s why you’re here. If not, it’s nice of you to want to read the entire walkthrough :)

Do not fear! You’ll find yourself back on track in no time. Before you begin troubleshooting, it’s important to read the logs of Redis, RabbitMQ, Sensu, and Uchiwa. Here are some errors that you may encounter along the way.

  1. Sensu Client cannot contact the RabbitMQ transport
    • RabbitMQ isn’t started
    • RabbitMQ credentials are incorrect
    • The server’s hostname has changed. RabbitMQ’s node name, and with it the stored vhosts and users, is tied to the name of the host, so recreate the RabbitMQ vhost and user if this is the case.
  2. Uchiwa cannot contact the Sensu API
    • Sensu API isn’t started
    • RabbitMQ isn’t started
    • RabbitMQ credentials are incorrect
    • The server’s hostname has changed. RabbitMQ’s node name, and with it the stored vhosts and users, is tied to the name of the host, so recreate the RabbitMQ vhost and user if this is the case.
  3. Can’t access Uchiwa or Grafana
    • Uchiwa and Grafana would be running at ports 3001 and 3000 respectively. Try accessing them directly with Apache2 out of the way. If that doesn’t work, it’s likely that they are not running.
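
To check all the moving parts in one go, here is a small Python sketch that tries a TCP connection to each service. The ports are the ones used in this walkthrough’s single-host setup; the helper names are made up for this example. Run it on the monitoring server itself.

```python
import socket

# Ports taken from this walkthrough's single-host setup.
SERVICES = {
    "carbon-cache": 2003,
    "grafana": 3000,
    "uchiwa": 3001,
    "sensu-api": 4567,
    "rabbitmq": 5672,
    "redis": 6379,
}


def is_listening(port, host="127.0.0.1", timeout=2):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(services=SERVICES, host="127.0.0.1"):
    """Map each service name to 'up' or 'down'."""
    return {name: ("up" if is_listening(port, host) else "down")
            for name, port in services.items()}
```

Printing report() gives a quick up/down overview. A service marked down is a good hint about which log to read first.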

Gimme More!

Kyle Anderson has a free Introductory course on Sensu and an Intermediate course at an all-time low price of S$18. Otherwise, you can also swallow the course notes.

In Andy Sykes’ colourful talk Stop using Nagios (so it can die peacefully), he explains why Nagios should fade into oblivion and hails Sensu as its replacement.

Wrap Up

I certainly hope that setting up your monitoring solution was a walk in the park. Stay tuned for more technical articles!