Monitoring Linux performance with Grafana

We spent some time setting up Linux (in our case, CentOS) as a home router, out of frustration with the home routers available on the market. It was both a good exercise and a bit of nostalgia for our early days with Linux. Once we had the basic customization in place, we wanted a way to keep track of statistics such as network traffic and disk space usage. The venerable Cacti is certainly an option, but it feels a bit dated these days, and we preferred a newer tool with a more modern look. That is what brought us to Grafana. This is a basic step-by-step guide to how we installed it, with collectd, InfluxDB, and Grafana all on the same host.

Collectd

What, you thought we’d launch Grafana right away? We need to collect the data first, and the best way to do that on CentOS is through collectd.

The easiest way to get collectd on CentOS is through the EPEL repository. If you are new to CentOS or unfamiliar with the Fedora EPEL repo, this command is all you need to get started:

yum install epel-release

Now that the EPEL repo is enabled, it’s easy enough to install collectd in the same way:

yum install collectd

There are other collectd plugins available in EPEL, but the base package is sufficient for our purposes. We encourage you to look into the available plugins if the base package does not meet your needs.

Now that collectd is installed, we need to configure it to send data. Collectd generates statistics, but we have to store them somewhere so Grafana can use them.

There are several items in /etc/collectd.conf that we need to tweak. In the global section, uncomment the lines for Hostname, BaseDir, PIDFile, PluginDir, and TypesDB. You need to change the Hostname, but the defaults should be fine for everything else. It should look something like this:

Hostname    "YourHostNameHere"
#FQDNLookup   true
BaseDir     "/var/lib/collectd"
PIDFile     "/var/run/collectd.pid"
PluginDir   "/usr/lib64/collectd"
TypesDB     "/usr/share/collectd/types.db"
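
If you would rather script this step than edit the file by hand, the uncommenting can be sketched with sed. This is only a sketch: the function name uncomment_keys is our own invention, and it assumes each setting sits on its own line starting with #. Try it on a copy of the file first.

```shell
# Uncomment the given top-level settings in a collectd-style config file.
# Usage: uncomment_keys FILE KEY...
# uncomment_keys is a hypothetical helper, not part of collectd.
uncomment_keys() {
    conf="$1"
    shift
    for key in "$@"; do
        # Strip a leading "#" only when the next word on the line is the key name.
        sed -E "s/^#[[:space:]]*(${key}[[:space:]])/\1/" "$conf" > "$conf.tmp" &&
            mv "$conf.tmp" "$conf"
    done
}

# Example (on a copy, so the original stays untouched):
#   cp /etc/collectd.conf /tmp/collectd.conf
#   uncomment_keys /tmp/collectd.conf Hostname BaseDir PIDFile PluginDir TypesDB
```

The Hostname value still has to be edited by hand afterwards, since the helper only removes the comment marker.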

Now that the global settings are in place, we need to enable the plugins we want to use. For this example that means syslog, cpu, disk, interface, load, memory, and network, so uncomment them. Their defaults are fine for everything except network. The network plugin is used to send data to our data collector, which in this case is InfluxDB, so it needs to point to your InfluxDB server. Because we are doing everything locally in this example, we are pointing to localhost. It should look like this:

<Plugin network>
  Server "127.0.0.1" "8096"
</Plugin>
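
To double-check which plugins ended up enabled, the uncommented LoadPlugin lines can be listed. A minimal sketch follows; enabled_plugins is a hypothetical helper name, and it only looks at the simple one-line LoadPlugin form, not the block form.

```shell
# List the plugins that are active (uncommented LoadPlugin lines) in a collectd config.
# Usage: enabled_plugins [FILE]   (defaults to /etc/collectd.conf)
# enabled_plugins is a hypothetical helper, not part of collectd.
enabled_plugins() {
    conf="${1:-/etc/collectd.conf}"
    # Print only the plugin name from lines that start with LoadPlugin (no leading "#").
    sed -n -E 's/^[[:space:]]*LoadPlugin[[:space:]]+"?([A-Za-z0-9_]+)"?.*$/\1/p' "$conf"
}

# Example:
#   enabled_plugins /etc/collectd.conf
```

collectd can also test the configuration itself with collectd -t -C /etc/collectd.conf, which parses the file and exits without starting the daemon.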

InfluxDB

Now that we’re done with collectd, we need to configure InfluxDB to receive the data that collectd generates. Since InfluxDB is not in EPEL, we have to install it from its own repository. The command below makes that easy:

cat <<'EOF' > /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL $releasever
baseurl = https://repos.influxdata.com/centos/$releasever/$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF
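
The repo file relies on the yum variables $releasever and $basearch, so they need to arrive in the file literally rather than being expanded (to nothing) by the shell. A quick check can be sketched like this; has_literal_yum_vars is a hypothetical helper name.

```shell
# Succeed only if the repo file still contains the literal yum variables.
# Usage: has_literal_yum_vars FILE
# has_literal_yum_vars is a hypothetical helper, not part of yum.
has_literal_yum_vars() {
    # -F treats the pattern as a fixed string, so "$" is not a regex anchor here.
    grep -qF '$releasever' "$1" && grep -qF '$basearch' "$1"
}

# Example:
#   has_literal_yum_vars /etc/yum.repos.d/influxdb.repo || echo "variables were expanded by the shell"
```

If the check fails, the heredoc delimiter was most likely unquoted when the file was written; quoting it (<<'EOF') or escaping each $ keeps the variables intact.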

Once that is done, install the package with yum install influxdb, and it is ready to be configured. There are a few details that we need to change in the /etc/influxdb/influxdb.conf configuration file.

In the [http] section of /etc/influxdb/influxdb.conf, set enabled = true and bind-address = ":8086". It should look like this:

[http]
  # Determines whether HTTP endpoint is enabled.
  enabled = true

  # The bind address used by the HTTP service.
  bind-address = ":8086"

Then go to the [[collectd]] section and configure it as follows:

[[collectd]]
  enabled = true
  bind-address = ":8096"
  database = "collectd"
  typesdb = "/usr/share/collectd"
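
Two ports are in play here and they are easy to mix up: 8086 is the InfluxDB HTTP API, while 8096 is where the collectd listener receives data, and it has to match the port in the collectd network plugin. The check below is a rough sketch; both function names are hypothetical, and the parsing assumes the simple layouts shown in this guide.

```shell
# Port collectd sends to: the second quoted value on the first uncommented Server line.
# collectd_target_port is a hypothetical helper.
collectd_target_port() {
    sed -n -E 's/^[[:space:]]*Server[[:space:]]+"[^"]*"[[:space:]]+"([0-9]+)".*$/\1/p' "$1" |
        head -n 1
}

# Port the InfluxDB collectd listener binds to: bind-address inside [[collectd]].
# influx_collectd_port is a hypothetical helper.
influx_collectd_port() {
    awk '
        # Any section header switches the flag; it is on only inside [[collectd]].
        /^[ \t]*\[/ { in_section = ($0 ~ /^[ \t]*\[\[collectd\]\]/) }
        in_section && /^[ \t]*bind-address/ {
            if (match($0, /:[0-9]+/)) { print substr($0, RSTART + 1, RLENGTH - 1); exit }
        }
    ' "$1"
}

# Example:
#   [ "$(collectd_target_port /etc/collectd.conf)" = \
#     "$(influx_collectd_port /etc/influxdb/influxdb.conf)" ] && echo "ports match"
```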

At this point, we can go ahead and start both services to make sure they are working as expected. First, let’s turn on collectd and make sure it sends data. As with other services, we will use systemd for this. In the example below, you will see the commands used and the output of the running collectd daemon.

$ sudo systemctl enable collectd
$ sudo systemctl start collectd
$ sudo systemctl status collectd
● collectd.service - Collectd statistics daemon
   Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2017-08-12 11:22:18 PDT; 6min ago
     Docs: man:collectd(1)
           man:collectd.conf(5)
 Main PID: 18366 (collectd)
   CGroup: /system.slice/collectd.service
           └─18366 /usr/sbin/collectd

Aug 12 11:22:18 monitor collectd[18366]: plugin_load: plugin "disk" successfully loaded.
Aug 12 11:22:18 monitor collectd[18366]: plugin_load: plugin "interface" successfully loaded.
Aug 12 11:22:18 monitor collectd[18366]: plugin_load: plugin "load" successfully loaded.
Aug 12 11:22:18 monitor collectd[18366]: plugin_load: plugin "memory" successfully loaded.
Aug 12 11:22:18 monitor collectd[18366]: plugin_load: plugin "network" successfully loaded.
Aug 12 11:22:18 monitor collectd[18366]: Systemd detected, trying to signal readyness.
Aug 12 11:22:18 monitor collectd[18366]: Initialization complete, entering read-loop.
Aug 12 11:22:18 monitor systemd[1]: Started Collectd statistics daemon.

Now that collectd is running, start InfluxDB and make sure it can collect data from collectd.

$ sudo systemctl enable influxdb
$ sudo systemctl start influxdb
$ sudo systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2017-07-29 18:28:20 PDT; 1 weeks 6 days ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 23459 (influxd)
   CGroup: /system.slice/influxdb.service
           └─23459 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Aug 12 10:35:10 monitor influxd[23459]: [I] 2017-08-12T17:35:10Z SELECT mean(value) FROM collectd.autogen.cpu_value WHERE host =~ /^monitor$/ AND type_instance="interrupt" AND time > 417367h GR...) service=query
Aug 12 10:35:10 monitor influxd[23459]: [httpd] 172.20.1.40, 172.20.1.40,::1 - - [12/Aug/2017:10:35:10 -0700] "GET /query?db=collectd&epoch=ms&q=SELECT+mean%28%22value%22%29+FROM+%22load_shortte...ean%28%22value%
Aug 12 10:35:10 monitor influxd[23459]: [I] 2017-08-12T17:35:10Z SELECT mean(value) FROM collectd.autogen.cpu_value WHERE host =~ /^monitor$/ AND type_instance="nice" AND time > 417367h GROUP B...) service=query

As we can see in the output above, the service is running and data is being collected. From here, the only thing left to do is present it through Grafana.
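
If the log output is not conclusive, InfluxDB can be asked directly which measurements it holds, using its HTTP /query endpoint. The following is a sketch under the assumptions of this guide (database collectd, HTTP API on localhost:8086); show_measurements and the CURL override are our own additions.

```shell
# Ask InfluxDB which measurements exist in a database, via the HTTP /query endpoint.
# Usage: show_measurements [DATABASE] [BASE_URL]
# show_measurements is a hypothetical helper; set CURL to swap in a dry-run stub.
show_measurements() {
    db="${1:-collectd}"
    base_url="${2:-http://localhost:8086}"
    # -G sends the url-encoded parameters as a GET query string.
    "${CURL:-curl}" -sG "$base_url/query" \
        --data-urlencode "db=$db" \
        --data-urlencode "q=SHOW MEASUREMENTS"
}

# Example:
#   show_measurements collectd
```

If collectd data is arriving, the JSON response should include measurement names such as cpu_value, which also appears in the query log above.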

Grafana

To install Grafana, we will create another repository file, as we did with InfluxDB. Unfortunately, the Grafana folks do not split the repo by release version, so it looks like we are using the EL6 repo even though we are doing this work on EL7.

cat <<'EOF' > /etc/yum.repos.d/grafana.repo
[grafana]
name=grafana
baseurl=https://packagecloud.io/grafana/stable/el/6/$basearch
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packagecloud.io/gpg.key https://grafanarel.s3.amazonaws.com/RPM-GPG-KEY-grafana
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

Now that the repository is in place and enabled, we can install Grafana just as we did with the others: yum install grafana. Once this is done, we can start working on the configuration. For this tutorial we are only going to set an admin username and password, since this is a single-user instance. We would absolutely recommend reading the docs if you want to do more with Grafana.

A basic configuration is enough to accomplish this: just uncomment the admin_user and admin_password lines in the [security] section of /etc/grafana/grafana.ini and set your own values. In this case, we are using admin / admin because that’s what you do in examples, right?

[security]
# default admin user, created on startup
admin_user = admin

# default admin password, can be changed before first start of grafana,  or in profile settings
admin_password = admin

Now you can start Grafana with systemctl start grafana-server and configure it through the web interface. After you log in for the first time, you will be prompted to set up a few things, including the data source and dashboard. Because we are doing all this on localhost, you will be able to cheat and use the datasource settings in the screenshot. Don’t worry, we’re almost there and there is only a little left to do.
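
The data source can also be created without the web form, through Grafana's HTTP API. The sketch below only builds the request body; datasource_json and the data source name collectd are hypothetical choices of ours, while the URL and database follow the InfluxDB settings used earlier in this guide.

```shell
# Print a JSON body describing an InfluxDB data source for Grafana's HTTP API.
# Usage: datasource_json NAME INFLUX_URL DATABASE
# datasource_json is a hypothetical helper; arguments must not contain quotes or backslashes.
datasource_json() {
    printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s"}\n' \
        "$1" "$2" "$3"
}

# Example, assuming Grafana on its default port 3000 and the admin / admin login from above:
#   datasource_json collectd http://localhost:8086 collectd |
#       curl -s -u admin:admin -H 'Content-Type: application/json' \
#            -X POST --data @- http://localhost:3000/api/datasources
```

Keeping the payload in its own function makes it easy to inspect before anything is sent to Grafana.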

Once you have a data source configured, you will be prompted to create your first dashboard. You can of course build one from scratch, but that can be a little intimidating the first time you launch Grafana. A simple solution is to import one of the dashboard templates from the Grafana site. We used https://grafana.com/dashboards/554. It provides a good set of metrics and graphs as a base to build on.

Once you have everything set up, it’s time to start adjusting things to your personal preferences and tinkering further. Once again, we recommend reading the documentation, because there are tons of options and settings not described here.
