Every organization that relies on technology wants to be able to determine the health of its computer systems and the applications running on them. Knowing when things aren’t working as expected improves performance and shortens the time it takes to troubleshoot problems. To achieve this, you need good tools that help you investigate carefully. Many tools are available to collect and process what is happening on network devices and servers (physical or virtual).
We will explore the best open source monitoring tools you can use to get a complete picture of the state of your infrastructure.
1. LibreNMS
LibreNMS is an auto-discovering network monitoring system based on PHP, MySQL, and SNMP. It supports a wide range of network hardware and operating systems, including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP, and more. It is best suited to network equipment and servers.
The advantage of LibreNMS is automatic discovery. You don’t have to tell it whether a device is Cisco, Juniper, Windows, or Linux. It uses CDP, FDP, LLDP, OSPF, BGP, SNMP, and ARP to collect this information automatically, and it works like a charm.
It goes further and discovers the interfaces on a router or switch, which is impressive. It also tries to map the connections in your network, although it needs your help for that.
Like most monitoring tools, LibreNMS also offers highly customizable monitoring capabilities.
It scales
As your network grows, distributed polling lets you scale the system horizontally.
LibreNMS has a billing system. Yes, this tool has one. It generates bandwidth bills for ports on the network based on usage or transfer.
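To make the two billing bases concrete, here is a minimal sketch in Python. It assumes that "usage" means a 95th-percentile rate and that "transfer" means the total volume moved; both interpretations, the function names, and the sample data are assumptions for illustration and are not taken from LibreNMS itself.

```python
def total_transfer_bytes(samples_bps, interval_s):
    """Total bytes moved, where each sample is an average rate in bits/s
    measured over one polling interval of interval_s seconds."""
    return sum(samples_bps) * interval_s / 8


def percentile_95(samples_bps):
    """95th-percentile rate using the nearest-rank method:
    sort the samples and ignore the top 5%."""
    if not samples_bps:
        raise ValueError("no samples")
    ordered = sorted(samples_bps)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


# Hypothetical 5-minute samples for one port, in bits per second:
# mostly 10 Mbit/s with a single 90 Mbit/s burst.
samples = [10_000_000] * 19 + [90_000_000]
print("95th percentile (bit/s):", percentile_95(samples))
print("total transfer (bytes):", total_transfer_bytes(samples, 300))
```

The percentile basis ignores short bursts, while the transfer basis charges for every byte, which is why the two can produce very different bills for the same port.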
LibreNMS has Android and iOS apps that you can use to view and manage your network, which is a breath of fresh air. It also supports various authentication mechanisms, such as RADIUS, LDAP, and Active Directory.
You can integrate it with other systems through its API. This tool is a beast, so we recommend that you learn what happens under the hood. Securing it is well beyond the scope of this article.
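As a rough illustration of API access, the sketch below builds a request for the device list. It assumes the LibreNMS v0 REST API, with the `/api/v0/devices` route and token authentication through the `X-Auth-Token` header; the host name and token are placeholders, and the shape of the JSON response should be checked against your own installation.

```python
import json
import urllib.request


def build_devices_request(base_url, token):
    """Build (but do not send) a GET request for the device list."""
    url = base_url.rstrip("/") + "/api/v0/devices"
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


def list_devices(base_url, token):
    """Send the request and return the 'devices' array from the JSON reply.
    Assumption: the reply is a JSON object with a 'devices' key."""
    request = build_devices_request(base_url, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("devices", [])
```

With a real server you would call something like `list_devices("https://librenms.example.com", "YOUR_TOKEN")`, where both values are placeholders.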
How to install LibreNMS on CentOS 8 / RHEL 8
How to install and configure LibreNMS on Ubuntu 18.04 LTS using Nginx
Install LibreNMS monitoring tool on CentOS 7 using Letsencrypt and Nginx
2. Nagios
From nagios.org: “Nagios monitors your entire IT infrastructure to ensure systems, applications, services, and business processes are functioning properly. In the event of a failure, Nagios can alert technical staff of the problem, allowing them to begin remediation processes before outages affect business processes, end-users, or customers.”
The tool dates back to 1999 and has since grown to include other products, all focused on monitoring. Let’s see what it offers.
Monitor a large number of devices
Nagios can monitor applications, services, operating systems, network protocols, system metrics, and infrastructure components with a single tool. If you want one tool that covers a wide range of services and equipment, it is a solid choice for any industry.
Because stakeholders can see the status of the infrastructure in real time, letting many users log in to the interface at the same time can improve efficiency and even help your business. Nagios can also restrict views per user, so several teams can share one platform while each user sees only what belongs to them.
Nagios helps ensure that service level agreements are met by generating reports, and it can be extended with plug-ins from third parties. This makes it highly flexible and customizable.
With a centralized web interface, you can see everything so that failures can be easily detected.
Nagios has an alert function. Alerts can be sent via SMS and email, simplifying infrastructure management.
An interesting feature of Nagios is its event handlers, which can automatically restart failed applications and services.
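The sketch below shows the kind of decision an event handler script might make. It assumes Nagios passes the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$SERVICEATTEMPT$` macros as arguments; the restart policy, the `httpd` service name, and the use of `systemctl` are illustrative assumptions, not a recommended configuration.

```python
import subprocess


def should_restart(state, state_type, attempt, soft_attempt=3):
    """Restart when the service is CRITICAL and either the problem is
    confirmed (HARD) or we have reached the chosen SOFT retry."""
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state, state_type, attempt, service="httpd", run=subprocess.run):
    """Restart the (placeholder) service if the policy says so.
    Returns True when a restart was attempted."""
    if should_restart(state, state_type, int(attempt)):
        run(["systemctl", "restart", service], check=False)
        return True
    return False
```

In a real handler you would call `handle_event(*sys.argv[1:4])` from a small script and reference that script in the service definition. Acting on an early SOFT retry gives the service a chance to recover before anyone is notified.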
Install and configure Nagios 4 on RHEL 8 / CentOS
3. Zabbix
According to its site, “Zabbix is the ultimate enterprise-grade software designed to monitor millions of metrics collected from thousands of servers, virtual machines, and network devices in real-time.” It can monitor Windows, Solaris, and IBM AIX, among other platforms, and it has functions to monitor applications, services, databases, and more.
Zabbix contains many features and we will briefly introduce them.
Solutions for any IT infrastructure, services, applications, and resources
Next generation Zabbix agent
Zabbix 4.4 introduces a new type of agent, zabbix_agent2, which provides a wide range of new and advanced monitoring capabilities.
It has multiple methods to collect the required metrics, including:
- Multi-platform Zabbix agents (Zabbix agents run on a variety of supported platforms, including Linux, UNIX, and Windows, and collect data such as CPU, memory, disk, and network interface usage.)
- SNMP and IPMI agents
- Agentless monitoring of user services
- Custom methods
- Calculated and aggregated items, and end-user web monitoring
Detect problem states
Zabbix can automatically detect problem states in the incoming metric stream using smart thresholds that you define.
Better visual presentation
According to the Zabbix developers, the interface gives users multiple ways to present a visual overview of the infrastructure and environment. These include widget-based dashboards, graphs, network maps, and slide shows.
The server can send messages or emails. As far as alerts go, more can be done. For example, you can customize messages based on the role of the recipient or runtime and inventory information. In addition, Zabbix event correlation mechanisms can be used to configure messages to focus on the root cause of the problem.
Use of templates: This feature lets you use out-of-the-box templates for most popular platforms and monitor thousands of similar devices with configuration templates.
Zabbix uses agents to send the information they collect to a central Zabbix server. Using Zabbix agents can greatly simplify the maintenance of the environment monitored by Zabbix and improve the performance of the central Zabbix server. This shows how the monitoring system can be extended in a distributed manner. Zabbix also has an API, so you can integrate it with other systems in your infrastructure.
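As a hedged sketch of what an integration could look like, the code below builds a JSON-RPC 2.0 call for the Zabbix API. It assumes the API endpoint is `api_jsonrpc.php` under the web frontend and uses the `apiinfo.version` method, which needs no authentication; the base URL is a placeholder, and authenticated methods differ between Zabbix versions.

```python
import json
import urllib.request


def build_rpc_request(base_url, method, params, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api_jsonrpc.php",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
    )


def call(base_url, method, params):
    """Send the request and return the 'result' member of the reply."""
    request = build_rpc_request(base_url, method, params)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["result"]
```

For example, `call("http://zabbix.example.com/zabbix", "apiinfo.version", [])` would return the API version string of a reachable server; the host name here is hypothetical.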
Official support for TimescaleDB
How to install Zabbix Server 4 on Debian 10 Buster
How to install Zabbix server on CentOS 8 / RHEL 8
How to install Zabbix Server 4.0 on CentOS 7
How to install Zabbix Server 4.0 on Ubuntu 18.04 and Ubuntu 16.04 LTS
4. Prometheus
According to its GitHub page, Prometheus is a Cloud Native Computing Foundation project for monitoring systems and services. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when specified conditions are observed. It is suitable for both machine-centric monitoring and monitoring of highly dynamic service-oriented architectures. For visualization, Prometheus works with data visualization and export tools such as Grafana.
Key features of Prometheus
- A multi-dimensional data model (time series defined by a metric name and a set of key/value dimensions)
- PromQL, a flexible query language that takes advantage of this dimensionality
- No dependency on distributed storage; single server nodes are autonomous
- Time series collection happens via a pull model over HTTP
- Pushing time series is supported through an intermediary gateway
- Targets are discovered through service discovery or static configuration
- Multiple modes of graphing and dashboarding support
- Support for hierarchical and horizontal federation
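To illustrate the data model and the pull model from the list above, here is a minimal sketch of what a target could serve over HTTP when Prometheus scrapes it. It renders one gauge in the Prometheus text exposition format; the metric name, labels, and values are made up, and label-value escaping is deliberately left out to keep the example short.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge metric in the text exposition format.

    samples is a list of (labels, value) pairs, where labels is a dict
    of key/value dimensions attached to that time series.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        selector = "{" + label_str + "}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric with one dimension ("host").
print(render_gauge(
    "demo_queue_length",
    "Number of jobs waiting in the queue.",
    [({"host": "web1"}, 3), ({"host": "web2"}, 7)],
))
```

Each distinct combination of metric name and label values is its own time series, which is what the query language later slices and aggregates.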
Install Prometheus Server on Debian 10/9 and Ubuntu 20.04 / 18.04
Install Prometheus Server on CentOS 7 / Ubuntu 18.04
How to install Prometheus and node_exporter on Debian 10 (Buster)
5. Netdata
According to its GitHub page, Netdata is distributed, real-time performance and health monitoring for systems and applications. It is a highly optimized monitoring agent that you install on all your systems and containers. It uses highly interactive web dashboards to provide unparalleled insight into everything that happens on the systems it runs on (including web servers, databases, and applications). Another cool feature of Netdata is that it can run autonomously without any third-party components, or it can be integrated into existing monitoring tool chains such as Prometheus, Graphite, OpenTSDB, Kafka, and Grafana.
Netdata is a monitoring agent that you install on all your systems. It is:
- Metrics collector - for system and application metrics (including web servers, databases, containers, etc.)
- Time series database - everything is stored in memory (it does not touch the disk while it runs)
- Metrics visualizer - super fast, interactive, modern, and optimized for anomaly detection
- Alert notification engine - an advanced watchdog for detecting performance and availability issues
- 1s granularity - the highest possible resolution for all metrics
- Unlimited metrics - collects all available metrics, the more the better
- 1% CPU utilization of a single core - it is super fast and highly optimized
- A few MB of RAM - by default it uses 25MB of RAM, and you can size it to your needs
- Zero disk I/O - while it runs, it does not load or save anything (except error and access logs)
- Zero configuration - auto-detects everything out of the box and can collect up to 10,000 metrics per server
- Zero maintenance - you just run it and it does the rest
- Zero dependencies - it is even its own web server for its static web files and web API
- Scales to infinity - you can install it on all your servers, containers, VMs, and IoT devices
- Several operating modes - autonomous host monitoring (the default), headless data collector, forwarding proxy, store and forward proxy, and central multi-host monitoring
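The in-memory, one-sample-per-second storage described in the list above can be pictured as a fixed-size ring buffer. The sketch below shows that general idea only; it is not Netdata's storage engine, and the class and method names are invented for illustration.

```python
from collections import deque


class RingBufferSeries:
    """Keep only the most recent retention_seconds samples in memory."""

    def __init__(self, retention_seconds):
        # With one sample per second, the buffer length equals the
        # retention window; old samples fall off the front automatically.
        self._points = deque(maxlen=retention_seconds)

    def append(self, timestamp, value):
        self._points.append((timestamp, value))

    def last(self, n):
        """Return up to n of the most recent (timestamp, value) pairs."""
        if n <= 0:
            return []
        return list(self._points)[-n:]

    def __len__(self):
        return len(self._points)
```

Because the buffer size is fixed up front, memory use stays constant no matter how long the agent runs, at the cost of losing history beyond the retention window.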
Health monitoring and alerting
- Sophisticated alerting - hundreds of alarms included out of the box. Notifications work whether you use Telegram, Twilio, email, kavenegar, messagebird, or something else.
- Stunning interactive dashboards - mouse, touchpad, and touchscreen friendly, with dark and white themes
- Amazingly fast visualizations - every query on any metric is answered in less than 1 ms, even on low-end hardware
- Embeddable - its charts can be embedded in your own web pages, wikis, and blogs
What does it monitor
Netdata’s data collection is extensible: you can monitor anything you can get a metric for. Examples include APM (application performance monitoring), system resources, disks, file systems, networking, DNS servers, VPNs, proxies, load balancers, and accelerators.
How to install Netdata on RHEL 8 / CentOS 8
How to install Netdata on FreeBSD 12
Install netdata on CentOS 7
6. Icinga 2
Icinga is a monitoring system that checks the availability of network resources, notifies users of interruptions and generates performance data for reporting. It is scalable and extensible to monitor large and complex environments in multiple locations.
Features of Icinga 2
The Icinga reporting module is the framework and foundation created by Icinga to process data collected by Icinga 2 and other data providers. It can display data directly in the Icinga web interface or export it to PDF, JSON or CSV format. With scheduled reports, you can regularly receive prepared data via email.
Graphs and metrics
Icinga uses Graphite for graphing and metrics. Graphite is a time series database that stores the collected metrics and makes them available through RESTful APIs and web interfaces.
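For a feel of how metrics reach Graphite, the sketch below formats and sends a single data point using Graphite's plaintext protocol (`<path> <value> <timestamp>` followed by a newline, by default on TCP port 2003). The metric path and host name are hypothetical, and this is not how Icinga itself is configured to write to Graphite.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one data point for Graphite's plaintext protocol."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


def send_metric(host, line, port=2003):
    """Send one already formatted line to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

A call such as `send_metric("graphite.example.com", format_metric("demo.web1.load1", 0.42))` would store one point, assuming a Carbon listener is reachable at that placeholder address.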
You will get maps, business processes, certificate monitoring and dashboards.
You can integrate Icinga with Logstash or Graylog in your infrastructure.
Notification scripts and interfaces
Many resources are available, including notification scripts for:
- Pager (XMPP, etc.)
- Ticketing system
Install and configure Icinga 2 and Icinga Web 2 on CentOS 8
How to install Icinga2 monitoring tool on Ubuntu 18.04 LTS
7. Cacti
According to the Cacti website, this tool “is a complete network graphing solution designed to harness the power of RRDTool’s data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with thousands of devices.” (Cacti.net, 2020).
Cacti builds on RRDtool, the open source industry-standard data logging and graphing system for time series data. RRDtool is a high-performance tool that integrates easily with shell scripts and with Perl, Python, Ruby, Lua, or Tcl applications.
The main features of Cacti include the following:
Graph templates let you group common graphs together. Each field of a normal graph can be templated or specified on a per-graph basis.
Cacti has a data input mechanism that gives users the freedom to develop custom scripts to collect data from target devices. It also comes with built-in support for SNMP, an industry-standard data collection protocol. In addition, Cacti ships with a PHP-based poller that executes scripts, retrieves SNMP data, and updates RRD files.
Cacti lets you set up multiple users, each with their own account. Administrators can assign each user a specific set of privileges.
There are three ways to view graphs: tree view, list view, and preview view. Each has its advantages. The tree view lets users create a hierarchy and place graphs on the tree, which makes a large number of graphs manageable. The list view, as the name suggests, is simply a list of the available graphs, each of which links to the actual graph when clicked. The preview view displays all the graphs as one large list that you can scan quickly.
There are three types of templates: data templates, graph templates, and host templates. They spare you from defining every data source and graph by hand. Data templates provide a skeleton for the actual data sources. A host template groups all the graph templates and data queries for a given device type. Better still, you don’t need to create all the templates yourself. Templates are straightforward to use, and importing them into your Cacti platform is very simple.
Cacti can be configured to send email alerts when a predefined threshold is exceeded or not reached. This is a great help, because you don’t have to go hunting for problems when these alerts come in. The alert tells you that a service is down or hitting a particular error.
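The threshold logic behind such alerts can be sketched in a few lines. The example below only shows the general idea of a high/low check that produces an email message; it is not Cacti code, and the function names, addresses, and subject format are assumptions.

```python
from email.message import EmailMessage


def check_threshold(value, low=None, high=None):
    """Return a human-readable reason when value breaches a threshold,
    or None when it is within bounds."""
    if high is not None and value > high:
        return f"value {value} is above the high threshold {high}"
    if low is not None and value < low:
        return f"value {value} is below the low threshold {low}"
    return None


def build_alert_email(sender, recipient, host, reason):
    """Build (but do not send) an alert email for a breached threshold."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"[ALERT] {host}: {reason}"
    message.set_content(f"Threshold breached on {host}: {reason}.")
    return message
```

Sending the message is left out on purpose; with a reachable mail server it could be handed to `smtplib.SMTP.send_message`.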
Cacti can generate reports based on your configuration.
8. Grafana
Grafana is a tool that lets you query, visualize, alert on, and understand your metrics no matter where they are stored. You can create, explore, and share dashboards with your team and foster a data-driven culture. In short, Grafana is an open source analytics and monitoring solution for every database.
Fast and flexible client-side graphs with a multitude of options. Panel plugins offer many different ways to visualize metrics and logs.
Create dynamic and reusable dashboards with template variables, which appear as dropdowns at the top of the dashboard.
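The idea behind template variables is that one query definition is reused with different values. The sketch below imitates this with Python's `string.Template`, which happens to use a similar `$name` and `${name}` placeholder syntax; it is only an analogy, not Grafana's interpolation engine, and the query and variable names are made up.

```python
from string import Template


def render_query(query_template, variables):
    """Fill $name / ${name} placeholders; unknown placeholders are left as-is."""
    return Template(query_template).safe_substitute(variables)


# One hypothetical query, reused for whichever instance is selected.
query = 'rate(http_requests_total{instance="$instance"}[${interval}])'
for instance in ("web1:9100", "web2:9100"):
    print(render_query(query, {"instance": instance, "interval": "5m"}))
```

Changing the selected value re-renders every panel that uses the variable, which is what makes a single dashboard reusable across hosts or environments.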
Explore your data through ad hoc queries and dynamic drilldown. Split view lets you compare different time ranges, queries, and data sources side by side.
Experience the magic of switching from metrics to logs with preserved label filters. Quickly search through all your logs or stream them live.
Visually define alert rules for your most important metrics. Grafana will continuously evaluate them and send notifications to systems such as Slack, PagerDuty, VictorOps, and OpsGenie.
Mixed data sources
Mix different data sources in the same graph! You can specify a data source on a per-query basis. This even works for custom data sources.
Annotate graphs with rich events from different data sources. Hovering over an event shows the full event metadata and tags.
With ad hoc filters, you can create new key/value filters on the fly, and these filters are automatically applied to all queries that use that data source.
9. Glances - monitoring your system
According to its GitHub page (https://github.com/nicolargo/glances), Glances is a cross-platform monitoring tool that aims to present a large amount of monitoring information through a curses or web-based interface. The information adapts dynamically to the size of the user interface.
Glances is written in Python and runs on almost any platform: GNU/Linux, FreeBSD, OS X, and Windows.
Glances can export all system statistics to CSV, InfluxDB, Cassandra, OpenTSDB, StatsD, ElasticSearch, and even RabbitMQ. It also provides a dedicated Grafana dashboard.
It displays a maximum of information in a minimum of space through a curses or web-based interface.
It can dynamically adjust the displayed information according to the size of the terminal.
10. Sensu
According to its GitHub page, Sensu is an open source monitoring tool for ephemeral infrastructure and distributed applications. It is an agent-based monitoring system with built-in auto-discovery, which makes it well suited to cloud environments. It uses service checks to monitor service health and collect telemetry data.
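A service check is typically a small program whose exit status reports health. The sketch below assumes the conventional exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL) that Sensu checks share with Nagios-style plugins; the disk-usage thresholds and function names are illustrative assumptions.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_disk(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, message)."""
    if used_percent >= crit:
        return CRITICAL, f"CRITICAL: disk {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"WARNING: disk {used_percent:.1f}% used"
    return OK, f"OK: disk {used_percent:.1f}% used"


def check_disk(path="/"):
    """Measure real disk usage for path and evaluate it."""
    usage = shutil.disk_usage(path)
    return evaluate_disk(usage.used / usage.total * 100)
```

A wrapper script would print the message and pass the code to `sys.exit`, so the agent can turn the result into an event.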
- Server monitoring
- Container monitoring
- Real-time inventory
- Health checks and custom metrics
- Alert and incident management
- Automated remediation and custom workflows
- More than 200 community plugins
- Namespaces and RBAC
- Basic authentication
- Real-time event dashboard
- Real-time inventory dashboard
- Grafana data source
- Multi-tenant dashboard (single site)
- Discovery, inventory, and configuration management API
- Token-based API authentication (JWT)
Service and support
- Bonsai (hosted Sensu asset index and CDN)
- Community support (Discourse, Slack)
Note that there is also an enterprise version of Sensu with many additional features. You can learn more on the Sensu enterprise page.
The tools are now yours to use. Try them out and keep a close eye on your infrastructure all year round. Before you leave, check out the other guides below.