Compress and deduplicate storage using Virtual Data Optimizer (VDO)
In this guide, you will learn how to use the Virtual Data Optimizer (VDO) to compress and deduplicate data on storage devices so that storage space is used efficiently. VDO is an installable Linux device mapper layer available for the RHEL, CentOS, and Fedora Linux distributions.
VDO reduces the data footprint on block devices by compressing data and eliminating duplicate blocks, which saves disk space and can even increase data throughput. VDO consists of two kernel modules:
- kvdo: transparently controls data compression,
- uds: handles deduplication.
The VDO layer is placed on top of existing block storage devices (such as local disks, RAID devices, or encrypted devices). Storage layers such as file systems and LVM logical volumes are then placed on top of VDO devices.
Step 1: Install the Virtual Data Optimizer (VDO)
On RHEL and CentOS, you can install VDO and its kernel module package by running the following command.
sudo yum -y install vdo kmod-kvdo
Wait for the installation to complete.
Loaded plugins: fastestmirror
Determining fastest mirrors
* base: centos.mirror.liquidtelecom.com
* extras: centos.mirror.liquidtelecom.com
* updates: centos.mirror.liquidtelecom.com
base | 3.6 kB 00:00
extras | 2.9 kB 00:00
updates | 2.9 kB 00:00
(1/2): extras/7/x86_64/primary_db | 159 kB 00:06
(2/2): updates/7/x86_64/primary_db | 6.7 MB 00:40
Resolving Dependencies
--> Running transaction check
---> Package kmod-kvdo.x86_64 0:6.1.2.41-5.el7 will be installed
---> Package vdo.x86_64 0:6.1.2.41-4.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===========================================================================
Package Arch Version Repository Size
===========================================================================
Installing:
kmod-kvdo x86_64 6.1.2.41-5.el7 base 317 k
vdo x86_64 6.1.2.41-4.el7 base 624 k
Transaction Summary
===========================================================================
Install 2 Packages
Total download size: 941 k
Installed size: 4.5 M
Downloading packages:
(1/2): kmod-kvdo-6.1.2.41-5.el7.x86_64.rpm | 317 kB 00:01
(2/2): vdo-6.1.2.41-4.el7.x86_64.rpm | 624 kB 00:24
---------------------------------------------------------------------------
Total 38 kB/s | 941 kB 00:24
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : kmod-kvdo-6.1.2.41-5.el7.x86_64 1/2
Installing : vdo-6.1.2.41-4.el7.x86_64 2/2
Verifying : kmod-kvdo-6.1.2.41-5.el7.x86_64 1/2
Verifying : vdo-6.1.2.41-4.el7.x86_64 2/2
Installed:
kmod-kvdo.x86_64 0:6.1.2.41-5.el7 vdo.x86_64 0:6.1.2.41-4.el7
Complete!
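Before moving on, it can help to confirm that the tools used in the rest of this guide are actually on the PATH. The following is a minimal sketch, not part of the original guide: missing_tools is a hypothetical helper name, and the check only looks for the vdo and vdostats commands, not for the kernel module.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# After Step 1 this should print nothing on a system where VDO is installed.
missing_tools vdo vdostats
```

Any name that is printed points to a package that did not install correctly.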
Step 2: Create a VDO volume
A VDO volume is a logical device created using VDO. VDO volumes are similar to disk partitions: you format one with a file system and can then mount it like a regular file system. If you prefer LVM, you can use a VDO volume as an LVM physical volume.
I have a 10 GB disk that will be used for this exercise.
$ lsblk /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 10G 0 disk
This is how to create a VDO volume:
$ sudo vdo create --name myvdo --device /dev/sdb --vdoLogicalSize 5G
Creating VDO myvdo
Starting VDO myvdo
Starting compression on VDO myvdo
VDO instance 0 volume is ready at /dev/mapper/myvdo
where:
- myvdo is the name of the logical device that VDO presents to the user.
- /dev/sdb is the block device that backs the VDO volume.
- 5G is the logical size of the VDO volume. This option is optional, and the logical size can be larger than the physical size of the underlying block device.
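Because the logical size may exceed the physical size, it is useful to know how far a volume is overprovisioned. The sketch below is an illustration only: overprovision_ratio is a hypothetical helper, both sizes are assumed to be given in the same unit (GB here), and the 30 GB figure is a made-up example rather than a value from this guide.

```shell
#!/bin/sh
# Hypothetical helper: ratio of logical size to physical size.
# Both arguments must use the same unit (GB in this example).
overprovision_ratio() {
    awk -v l="$1" -v p="$2" 'BEGIN { printf "%.1f\n", l / p }'
}

overprovision_ratio 5 10    # the volume created above: prints 0.5
overprovision_ratio 30 10   # made-up thin-provisioned volume: prints 3.0
```

A ratio above 1 means the volume only fits on the device if compression and deduplication save enough space.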
List all volumes, both started and unstarted:
$ sudo vdo list --all
myvdo
Run the vdo status command to inspect the volume.
$ sudo vdo status -n myvdo
VDO status:
Date: '2020-02-12 12:42:04+03:00'
Node: repo-server-01.safaricom.net
Kernel module:
Loaded: true
Name: kvdo
Version information:
kvdo version: 6.1.2.41
Configuration:
File: /etc/vdoconf.yml
Last modified: '2020-02-12 12:39:36'
VDOs:
myvdo:
Acknowledgement threads: 1
Activate: enabled
Bio rotation interval: 64
Bio submission threads: 4
Block map cache size: 128M
Block map period: 16380
.......
Both compression and deduplication should be reported as enabled.
$ sudo vdo status -n myvdo | egrep 'Compression|Deduplication'
Compression: enabled
Deduplication: enabled
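The same check can be scripted so that it fails loudly when either feature is off. This is a minimal sketch that assumes the status text uses the "Compression: enabled" and "Deduplication: enabled" lines shown above. features_enabled is a hypothetical helper name, and the sample input is typed in by hand rather than read from a real volume.

```shell
#!/bin/sh
# Hypothetical helper: read status text on stdin and succeed only if
# both Compression and Deduplication are reported as enabled.
features_enabled() {
    awk -F': *' '
        $1 ~ /^ *Compression$/   && $2 == "enabled" { c = 1 }
        $1 ~ /^ *Deduplication$/ && $2 == "enabled" { d = 1 }
        END { exit !(c && d) }
    '
}

# Hand-written sample input; on a real system you would pipe in
#   sudo vdo status -n myvdo
printf 'Compression: enabled\nDeduplication: enabled\n' | features_enabled \
    && echo "both enabled"
```

If the check fails, the vdo command provides the enableCompression and enableDeduplication subcommands to turn the features back on.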
You can use the vdo growLogical command to increase the logical size of an existing volume. I will increase it to 10 GB.
sudo vdo growLogical -n myvdo --vdoLogicalSize 10G
Confirm the new size:
$ sudo vdo status -n myvdo | grep size
Block map cache size: 128M
Block size: 4096
Logical size: 10G
Physical size: 10G
Read cache size: 0M
Slab size: 2G
block map cache size: 134217728
block size: 4096
Step 3: Format the VDO volume with a file system
You can format a VDO volume directly with the file system of your choice, or use it to create an LVM physical volume (PV), volume group (VG), and logical volume (LV). Pick one of the two approaches.
$ sudo mkfs.xfs /dev/mapper/myvdo
For LVM creation:
# Create PV
$ sudo pvcreate /dev/mapper/myvdo
Physical volume "/dev/mapper/myvdo" successfully created.
# Create VG
$ sudo vgcreate vg01 /dev/mapper/myvdo
Volume group "vg01" successfully created
# Create LV
$ sudo lvcreate -n lv01 -l+100%FREE vg01
Logical volume "lv01" created.
# Create a file system
$ sudo mkfs -t xfs /dev/mapper/vg01-lv01
meta-data=/dev/mapper/vg01-lv01 isize=512 agcount=4, agsize=655104 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=2620416, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
You can now wait for the new device nodes to be registered and then mount the device.
sudo udevadm settle
sudo mkdir /myvdo
--- For standard VDO volume ---
$ sudo mount /dev/mapper/myvdo /myvdo
--- For LVM ---
$ sudo mount /dev/mapper/vg01-lv01 /myvdo
For a persistent mount, add an entry to the /etc/fstab file.
UUID=XXXXX /myvdo xfs defaults,x-systemd.requires=vdo.service 0 0
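The UUID in that entry has to come from the file system you just created. The sketch below shows one way to assemble the line. vdo_fstab_line is a hypothetical helper, the UUID passed to it is a made-up placeholder, and the blkid call in the comment is left unexecuted because it needs root and a real device.

```shell
#!/bin/sh
# Hypothetical helper: build an fstab line for an XFS file system
# that must wait for vdo.service. Arguments: UUID, mount point.
vdo_fstab_line() {
    printf 'UUID=%s %s xfs defaults,x-systemd.requires=vdo.service 0 0\n' "$1" "$2"
}

# On a real system the UUID could be read with:
#   uuid=$(sudo blkid -s UUID -o value /dev/mapper/myvdo)
# The value below is a placeholder, not a real UUID from this guide.
vdo_fstab_line 0123abcd-0000-4000-8000-000000000000 /myvdo
```

Review the printed line before appending it to /etc/fstab, since a wrong entry can prevent the system from booting cleanly.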
You can also display statistics in an easy-to-understand format.
$ sudo vdostats --human-readable
Device Size Used Available Use% Space saving%
/dev/mapper/myvdo 10.0G 4.0G 6.0G 40% 98%
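If you want to track the savings figure in a script, the last column can be pulled out of this table. The following is a sketch under the assumption that the output keeps the column layout shown above. space_saving is a hypothetical helper, and the sample rows are copied from the output in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the last column (Space saving%) of the row
# whose first column matches the given device path. Reads stdin.
space_saving() {
    awk -v dev="$1" '$1 == dev { print $NF }'
}

# Sample rows copied from the output above; on a real system use
#   sudo vdostats --human-readable | space_saving /dev/mapper/myvdo
printf '%s\n' \
    'Device Size Used Available Use% Space saving%' \
    '/dev/mapper/myvdo 10.0G 4.0G 6.0G 40% 98%' \
    | space_saving /dev/mapper/myvdo
```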
Step 4: Test deduplication
I will download an ISO file to test deduplication.
wget http://mirror.centos.org/centos/7/os/x86_64/images/boot.iso
Copy the file to the /myvdo directory.
sudo cp boot.iso /myvdo/boot1.iso
Check storage statistics.
--- Before copy ---
$ sudo vdostats --human-readable
Device Size Used Available Use% Space saving%
/dev/mapper/myvdo 10.0G 4.0G 6.0G 40% 98%
--- After copy ---
$ sudo vdostats --human-readable
Device Size Used Available Use% Space saving%
/dev/mapper/myvdo 10.0G 4.2G 5.8G 42% 7%
You will notice that the Used field increased from 4.0G to 4.2G, because the file we copied to the volume takes up some space.
Let’s copy the same file.
sudo cp boot.iso /myvdo/boot2.iso
Check the volume statistics again.
$ sudo vdostats --human-readable
Device Size Used Available Use% Space saving%
/dev/mapper/myvdo 10.0G 4.2G 5.8G 42% 52%
You can see that the used space on the volume has not changed. However, the increase of Space saving% to 52% shows that deduplication has taken place, so the redundant copy of the same file consumes no additional space.
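A figure close to 50% is what you would expect from two identical copies that share one stored copy. The sketch below works through that arithmetic with a deliberately simplified model: it ignores compression and file system metadata, so it will not reproduce the exact 52% reported by vdostats. dedup_saving is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: expected space saving, in percent, when N identical
# copies of a file are stored once. Simplified model: no compression,
# no file system metadata.
dedup_saving() {
    awk -v n="$1" 'BEGIN { printf "%.0f%%\n", (n - 1) / n * 100 }'
}

dedup_saving 2   # two copies, as in this test: prints 50%
dedup_saving 4   # four copies: prints 75%
```

The more identical copies you store, the closer the saving gets to 100%, while the used space stays roughly that of a single copy.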