🐧 Best command line tools for monitoring and diagnosing Linux GPUs

A video card is an expansion board that generates the image shown on a computer monitor.

At its core is the graphics processing unit (GPU), which renders 3D images and graphics for Linux games and other workloads.

Let’s take a look at the 7 best command line tools for Linux GPU monitoring, diagnostics and troubleshooting.

The following tools can be used for GPU monitoring and diagnostics on Linux, and in some cases on other operating systems such as FreeBSD.

Most Linux and FreeBSD users these days use Nvidia, Intel and AMD GPUs.

Linux GPU Diagnostics

We can use the following tools to monitor, diagnose and test the GPU on Linux or *BSD based systems.

Finding GPU Information on Linux

To get GPU information, just run:

sudo lshw -C display -short
lspci -v | more

Output example:

H/W path               Device        Class          Description
===============================================================
/0/100/1/0                           display        TU117M [GeForce GTX 1650 Mobile / Max-Q]
/0/100/2               /dev/fb0      display        UHD Graphics 630 (Mobile)
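
If you want to use this listing in a script, the short table is easy to split into fields. The following is only a minimal sketch: the function name parse_lshw_short is made up for this example, and it assumes that columns are separated by two or more spaces and that a row with three fields simply has an empty Device column, as in the output above.

```python
import re

# Sample copied from the lshw output shown above.
SAMPLE = """\
H/W path               Device        Class          Description
===============================================================
/0/100/1/0                           display        TU117M [GeForce GTX 1650 Mobile / Max-Q]
/0/100/2               /dev/fb0      display        UHD Graphics 630 (Mobile)
"""


def parse_lshw_short(text):
    """Turn `lshw -C display -short` output into a list of dicts."""
    gpus = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the header row and the ===== separator.
        if not line or line.startswith(("H/W path", "=")):
            continue
        # Assumption: columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line, maxsplit=3)
        if len(fields) == 4:
            path, device, cls, description = fields
        elif len(fields) == 3:
            # Assumption: a three-field row has an empty Device column.
            path, cls, description = fields
            device = ""
        else:
            continue
        gpus.append(
            {"path": path, "device": device, "class": cls, "description": description}
        )
    return gpus


for gpu in parse_lshw_short(SAMPLE):
    print(gpu["path"], "|", gpu["device"] or "-", "|", gpu["description"])
```

On a real system you would pass the captured output of the lshw command to the function instead of the SAMPLE string.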

1. glmark2 – stress testing GPU performance in Linux

glmark2 is a command line utility for testing OpenGL 2.0 and ES 2.0 performance.

We can install it like this:

$ sudo apt install glmark2

Now run it like this:

$ glmark2

It will then stress test your GPU.

Here is my test result for an Nvidia GeForce GTX 1650 running on Ubuntu Linux 20.04 LTS:

=======================================================
    glmark2 2014.03+git20150611.fa71af2d
=======================================================
    OpenGL Information
    GL_VENDOR:     NVIDIA Corporation
    GL_RENDERER:   GeForce GTX 1650 with Max-Q Design/PCIe/SSE2
    GL_VERSION:    4.6.0 NVIDIA 450.80.02
=======================================================
[build] use-vbo=false: FPS: 4980 FrameTime: 0.201 ms
[build] use-vbo=true: FPS: 6927 FrameTime: 0.144 ms
[texture] texture-filter=nearest: FPS: 5144 FrameTime: 0.194 ms
[texture] texture-filter=linear: FPS: 4979 FrameTime: 0.201 ms
[texture] texture-filter=mipmap: FPS: 4030 FrameTime: 0.248 ms
[shading] shading=gouraud: FPS: 6358 FrameTime: 0.157 ms
[shading] shading=blinn-phong-inf: FPS: 5810 FrameTime: 0.172 ms
[shading] shading=phong: FPS: 6425 FrameTime: 0.156 ms
[shading] shading=cel: FPS: 5720 FrameTime: 0.175 ms
[bump] bump-render=high-poly: FPS: 4772 FrameTime: 0.210 ms
[bump] bump-render=normals: FPS: 7187 FrameTime: 0.139 ms
[bump] bump-render=height: FPS: 6724 FrameTime: 0.149 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 5278 FrameTime: 0.189 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 3649 FrameTime: 0.274 ms
[pulsar] light=false:quads=5:texture=false: FPS: 5793 FrameTime: 0.173 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 2776 FrameTime: 0.360 ms
[desktop] effect=shadow:windows=4: FPS: 3913 FrameTime: 0.256 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1555 FrameTime: 0.643 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1703 FrameTime: 0.587 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1800 FrameTime: 0.556 ms
[ideas] speed=duration: FPS: 5480 FrameTime: 0.182 ms
[jellyfish] : FPS: 4283 FrameTime: 0.233 ms
[terrain] : FPS: 746 FrameTime: 1.340 ms
[shadow] : FPS: 4878 FrameTime: 0.205 ms
[refract] : FPS: 1580 FrameTime: 0.633 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 5081 FrameTime: 0.197 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 4556 FrameTime: 0.219 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 5293 FrameTime: 0.189 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 5048 FrameTime: 0.198 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 4602 FrameTime: 0.217 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 4744 FrameTime: 0.211 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 4515 FrameTime: 0.221 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 4948 FrameTime: 0.202 ms
=======================================================
                                  glmark2 Score: 4584 
=======================================================
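
With more than thirty scenes in the report, it can help to pick out the slowest ones automatically. Below is a rough sketch in Python, assuming the result lines keep the "[scene] options: FPS: N FrameTime: X ms" layout shown above. The names parse_glmark2 and slowest are invented for this example and are not part of glmark2.

```python
import re

# A few result lines copied from the run above.
SAMPLE = """\
[build] use-vbo=false: FPS: 4980 FrameTime: 0.201 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1555 FrameTime: 0.643 ms
[terrain] : FPS: 746 FrameTime: 1.340 ms
[refract] : FPS: 1580 FrameTime: 0.633 ms
"""

# Assumption: every result line follows the layout seen in the output above.
SCENE_RE = re.compile(
    r"^\[(?P<scene>[\w-]+)\] (?P<options>.*?): FPS: (?P<fps>\d+) "
    r"FrameTime: (?P<ms>[\d.]+) ms$"
)


def parse_glmark2(text):
    """Return (scene, options, fps, frame_time_ms) tuples for each result line."""
    results = []
    for line in text.splitlines():
        match = SCENE_RE.match(line.strip())
        if match:
            results.append(
                (
                    match["scene"],
                    match["options"],
                    int(match["fps"]),
                    float(match["ms"]),
                )
            )
    return results


def slowest(results, n=3):
    """Return the n results with the lowest FPS, slowest first."""
    return sorted(results, key=lambda r: r[2])[:n]


for scene, options, fps, ms in slowest(parse_glmark2(SAMPLE), n=2):
    print(f"{scene:10s} {fps:5d} FPS  {ms:.3f} ms  {options}")
```

Lines that do not match the pattern, such as the banner and the final score, are simply ignored, so the whole report can be passed in unchanged.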

2. glxgears is a simple tool for testing Linux GPU performance.

It will display the frame rate at regular intervals.

It has become quite popular as a basic benchmarking tool for Linux and Unix-like systems such as FreeBSD.

Install and run it like this:

$ sudo apt install mesa-utils
$ glxgears

The GPU frame rate is measured and displayed on the screen every five seconds.

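The output will look like this. Note that this run is synchronized to the vertical refresh, so the reported rate tracks the monitor refresh rate (about 60 FPS here) rather than the maximum the GPU can deliver: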

Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
299 frames in 5.0 seconds = 59.416 FPS
299 frames in 5.0 seconds = 59.731 FPS
300 frames in 5.0 seconds = 59.940 FPS
300 frames in 5.0 seconds = 59.968 FPS
300 frames in 5.0 seconds = 59.943 FPS
300 frames in 5.0 seconds = 59.967 FPS
299 frames in 5.0 seconds = 59.742 FPS
300 frames in 5.0 seconds = 59.951 FPS
.....
...
...
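
To summarize a longer run, the reported values can be averaged with a few lines of Python. This is just an illustrative sketch that assumes the "N frames in T seconds = F FPS" line format shown above. The function name parse_glxgears is made up for this example.

```python
import re
from statistics import mean

# Three lines copied from the output above.
SAMPLE = """\
299 frames in 5.0 seconds = 59.416 FPS
299 frames in 5.0 seconds = 59.731 FPS
300 frames in 5.0 seconds = 59.940 FPS
"""

# Assumption: glxgears reports each interval in exactly this format.
LINE_RE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_glxgears(text):
    """Return the list of FPS values reported by glxgears."""
    return [float(match.group(3)) for match in LINE_RE.finditer(text)]


fps = parse_glxgears(SAMPLE)
print(f"samples={len(fps)} mean={mean(fps):.3f} min={min(fps):.3f} max={max(fps):.3f}")
```

Any other text in the captured output, such as the vertical refresh notice, does not match the pattern and is skipped.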

3. gpustat is a simple tool to get statistics for Nvidia GPUs on Linux and FreeBSD Unix.

It is written in Python and is an ideal tool for CLI users, especially ML / AI developers.

It can be installed with pip (use whichever of the two commands matches your Python setup):

$ pip install gpustat
$ pip3 install gpustat

Run it like this:

$ gpustat
$ gpustat -cp

Here we will see the names and PIDs of the processes running on the Nvidia GPU:

itsecforu-wks01                         Tue Nov 24 15:46:37 2020  450.80.02
[0] GeForce GTX 1650 with Max-Q Design | 39'C,  ?? %,   2 % |   962 /  3911 MB | Xorg/2454(100M) Xorg/3504(325M) gnome-shell/3689(181M) firefox/4614(1M) firefox/5036(1M) firefox/5143(1M)
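
Since several entries can belong to the same program, it may be handy to add up the memory per process name. The sketch below works on the text line only and assumes the "name/PID(memoryM)" layout that gpustat -cp printed above. The helper memory_by_process is a hypothetical name, not part of gpustat.

```python
import re
from collections import defaultdict

# The GPU line copied from the gpustat -cp output above.
SAMPLE = (
    "[0] GeForce GTX 1650 with Max-Q Design | 39'C,  ?? %,   2 % |   962 /  3911 MB | "
    "Xorg/2454(100M) Xorg/3504(325M) gnome-shell/3689(181M) "
    "firefox/4614(1M) firefox/5036(1M) firefox/5143(1M)"
)

# Assumption: each process is printed as name/PID(memoryM), as in the line above.
PROC_RE = re.compile(r"(?P<name>[^\s/]+)/(?P<pid>\d+)\((?P<mem>\d+)M\)")


def memory_by_process(line):
    """Sum the reported GPU memory (in MB) per process name."""
    totals = defaultdict(int)
    for match in PROC_RE.finditer(line):
        totals[match["name"]] += int(match["mem"])
    return dict(totals)


for name, mem in sorted(memory_by_process(SAMPLE).items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {mem:5d} MB")
```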

To see all available options:

$ gpustat -h

4. intel_gpu_top – Displays a quick overview of Intel GPU usage on Linux.

First install the tool:

$ sudo apt install intel-gpu-tools
## CentOS/RHEL/Fedora ##
$ sudo dnf install intel-gpu-tools

Fedora, RHEL and CentOS users can also run it from a container using the podman command:

$ podman run --rm --privileged registry.freedesktop.org/drm/igt-gpu-tools/igt:master

The tool collects data using performance counters (PMUs) provided by the i915 driver and by other platform drivers such as RAPL (power) and Uncore IMC (memory bandwidth).

Run it on a Linux system as follows:

$ sudo intel_gpu_top

5. nvidia-smi – NVIDIA System Management Interface program.

nvidia-smi (also known as NVSMI) provides monitoring and management capabilities for NVIDIA Tesla, Quadro, GRID and GeForce devices from the Fermi architecture onward. GeForce Titan series devices support most of the features, while only very limited information is provided for the rest of the GeForce brand.

NVSMI is a cross-platform tool that runs on all standard Linux distributions and FreeBSD releases supported by the NVIDIA driver.

After installing the Nvidia driver on Ubuntu Linux, install it as follows if it is not already present:

$ sudo apt install nvidia-smi

Open a terminal and run:

$ nvidia-smi -q -g 0 -d UTILIZATION -l 1
$ sudo nvidia-smi
$ nvidia-smi --help

Here’s what we’ll see:

Tue Nov 24 15:57:43 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 165...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   40C    P8     3W /  N/A |   1011MiB /  3911MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2454      G   /usr/lib/xorg/Xorg                100MiB |
|    0   N/A  N/A      3504      G   /usr/lib/xorg/Xorg                357MiB |
|    0   N/A  N/A      3689      G   /usr/bin/gnome-shell              179MiB |
|    0   N/A  N/A      4614      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5036      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5143      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      6406      G   ...AAAAAAAA== --shared-files      225MiB |
|    0   N/A  N/A     14462      G   ...AAAAAAAA== --shared-files      131MiB |
+-----------------------------------------------------------------------------+
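
The table is meant for people. For scripts, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options. The following is a hedged sketch of how that output could be parsed in Python. The field list, the helper names build_command and parse_query_output, and the sample line are assumptions made for this example: the sample is put together by hand from the values shown above and was not captured from a real run.

```python
import csv
import io

# Fields to request; the list itself is just an example selection.
FIELDS = ["name", "temperature.gpu", "utilization.gpu", "memory.used", "memory.total"]

# Illustrative line built by hand from the values shown above (not real output).
SAMPLE = "GeForce GTX 1650 with Max-Q Design, 40, 2, 1011, 3911\n"


def build_command(fields=FIELDS):
    """Build the nvidia-smi argument list for a CSV query without header or units."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]


def parse_query_output(text, fields=FIELDS):
    """Map each CSV row (one per GPU) to a dict keyed by field name."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row:
            rows.append(dict(zip(fields, (value.strip() for value in row))))
    return rows


print(" ".join(build_command()))
for gpu in parse_query_output(SAMPLE):
    used = int(gpu["memory.used"])
    total = int(gpu["memory.total"])
    print(f"{gpu['name']}: {gpu['temperature.gpu']} C, {used}/{total} MiB used")
```

On a machine with the driver installed, the list returned by build_command() could be passed to subprocess.run(..., capture_output=True, text=True) and its stdout handed to parse_query_output().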

6. nvtop – NVIDIA GPU statistics

Another interesting and very useful tool for NVIDIA GPUs.

This is an ncurses-based GPU status viewer for NVIDIA GPUs, similar to the htop command or the top command.

We can install it like this:

$ sudo apt install nvtop
## run ##
$ nvtop

7. radeontop is a tool showing AMD GPU usage on Linux.

Install it like this:

$ sudo apt install radeontop
$ sudo radeontop

It works with R600 and newer GPUs.

It works with both the open source AMD drivers and the closed source AMD Catalyst drivers.

Conclusion

You have learned about various command line tools for GPU monitoring and diagnostics on Linux and BSD based systems.

Let me know if I missed your favorite tool in the comment section below.
