How do I install Apache Cassandra on a CentOS 8 Linux machine? Apache Cassandra is a free and open source NoSQL database management system designed to be distributed and highly available. Cassandra can process large amounts of data on many commodity servers without any single point of failure.
This guide walks you through installing Cassandra on CentOS 8. Once the installation is complete, we will configure and tune Cassandra so that it runs on a machine with minimal resources.
Features of Cassandra
Cassandra provides the Cassandra Query Language (CQL), an SQL-like language, to create and update database schemas and access data. CQL allows users to organize data within a cluster of Cassandra nodes using:
- Keyspace: Defines how a dataset is replicated, for example in which data centers and how many copies. Keyspaces contain tables.
- Table: Defines the typed schema for a collection of partitions. New columns can be added to Cassandra tables with zero downtime. Tables contain partitions, which contain rows, which contain columns.
- Partition: Defines the mandatory part of the primary key that all rows in Cassandra must have. All performant queries supply the partition key in the query.
- Row: Contains a collection of columns identified by a unique primary key made up of the partition key and, optionally, additional clustering keys.
- Column: A single datum with a type, which belongs to a row.
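To make these terms concrete, here is a minimal sketch that writes a small CQL script and runs it with cqlsh once Cassandra is installed. The keyspace name demo, the table readings, and all column names are hypothetical and chosen only for illustration. SimpleStrategy with a replication factor of 1 is an assumption that suits a single test node, not a recommendation for production.

```shell
#!/bin/sh
# Write a small CQL script. Every name in it (demo, readings, sensor_id, ...)
# is a hypothetical example, not something Cassandra creates by default.
cat > demo.cql <<'EOF'
-- Keyspace: controls replication (one copy, suitable for a single test node).
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table: sensor_id is the partition key, taken_at is a clustering key.
CREATE TABLE IF NOT EXISTS demo.readings (
  sensor_id text,
  taken_at  timestamp,
  value     double,
  PRIMARY KEY ((sensor_id), taken_at)
);

-- Row: one set of columns identified by the full primary key.
INSERT INTO demo.readings (sensor_id, taken_at, value)
  VALUES ('s1', '2020-03-04 22:24:31', 21.5);

-- A query that supplies the partition key.
SELECT * FROM demo.readings WHERE sensor_id = 's1';
EOF

# Run the script only when cqlsh is installed and a node is reachable.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f demo.cql || echo "cqlsh could not run demo.cql (is Cassandra up?)" >&2
else
  echo "cqlsh not found; demo.cql was written but not executed" >&2
fi
```

The SELECT filters on sensor_id because that column is the partition key, which is what lets Cassandra route the query to the right node.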
Cassandra supports the following client drivers:
- C# / .NET
- C++
- Elixir
Install Apache Cassandra on CentOS 8
Java is required to run Cassandra on CentOS 8. At the time of writing, the required Java version is 8. If you want to use cqlsh, you need the latest version of Python 2.7.
Step 1: Install Java 8 and Python:
sudo yum -y install epel-release python2 java-1.8.0-openjdk-devel
Confirm that Java and Python are installed.
$ java -version
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
$ python2.7 --version
Python 2.7.16
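If you prefer to check the Java version in a script rather than by eye, something like the following sketch may help. The helper name is_java8 is hypothetical, and the parsing assumes the first line of the java -version output has the quoted form shown above.

```shell
#!/bin/sh
# Return success when a Java version string denotes Java 8 (1.8.x).
is_java8() {
  case "$1" in
    1.8.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Extract the quoted version from the first line of `java -version`,
# which prints to stderr (for example: openjdk version "1.8.0_242").
if command -v java >/dev/null 2>&1; then
  java_ver=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
  if is_java8 "$java_ver"; then
    echo "Java 8 detected: $java_ver"
  else
    echo "Java found but not version 8: $java_ver" >&2
  fi
else
  echo "java not found in PATH" >&2
fi
```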
Step 2: Install Apache Cassandra on CentOS 8
With Java and Python installed, add the Cassandra repository to the CentOS system.
sudo tee /etc/yum.repos.d/cassandra.repo <
Install Apache Cassandra using the following command.
sudo yum -y install cassandra
Create a systemd service unit for Cassandra.
sudo tee /etc/systemd/system/cassandra.service<
Start the service and enable it to start at boot time.
sudo systemctl daemon-reload
sudo systemctl start cassandra.service
sudo systemctl enable cassandra
Check service status:
$ systemctl status cassandra.service
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-03-04 22:24:31 EAT; 2s ago
 Main PID: 8758 (java)
    Tasks: 10 (limit: 26213)
   Memory: 3.9G
   CGroup: /system.slice/cassandra.service
           └─8758 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-Us>

Mar 04 22:24:31 cent8.localdomain systemd: Started Apache Cassandra.
You can also verify that Cassandra is running with the following command.
$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load    Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  70 KiB  256     100.0%            0daf41fa-22e5-4471-bc00-9aed6f566235  rack1
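Cassandra can take a little while to finish starting after systemd reports the service as running. The sketch below polls nodetool status until a node line beginning with UN (Up, Normal) appears. The function names, the 30 retries, and the 2-second pause are arbitrary choices for illustration.

```shell
#!/bin/sh
# Succeed when the given `nodetool status` output contains at least one
# node line starting with UN (Up, Normal).
has_up_normal() {
  printf '%s\n' "$1" | grep -q '^UN[[:space:]]'
}

# Poll nodetool until the node is up; $1 is the number of attempts.
wait_for_cassandra() {
  tries=${1:-30}
  while [ "$tries" -gt 0 ]; do
    if has_up_normal "$(nodetool status 2>/dev/null)"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  return 1
}

# Only attempt the check when nodetool is actually installed.
if command -v nodetool >/dev/null 2>&1; then
  if wait_for_cassandra 30; then
    echo "Cassandra is up"
  else
    echo "Cassandra did not come up in time" >&2
  fi
fi
```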
To run a query on Cassandra, call the CQL shell with the following command.
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
- The default location of the configuration files is /etc/cassandra.
- The default locations of the log and data directories are /var/log/cassandra/ and /var/lib/cassandra.
To run Cassandra on a single node, the default configuration in /etc/cassandra/conf/cassandra.yaml is sufficient. For a multi-node cluster, you may need to modify this file to ensure that your cluster is properly tuned.
At a minimum, you should consider setting the following properties:
- cluster_name: The name of the cluster.
- seeds: A comma-separated list of the IP addresses of your cluster seeds.
- storage_port: You don't necessarily need to change this setting, but make sure there is no firewall blocking this port.
- listen_address: The IP address of the node. Other nodes use this address to communicate with this node, so it is important to change it.
- native_transport_port: As with storage_port, make sure this port is not blocked by the firewall, because clients communicate with Cassandra on this port.
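Before editing these properties it can be useful to see what is currently active. The sketch below prints the uncommented values from cassandra.yaml and defines a helper that opens the two ports in firewalld. The function names are hypothetical, firewalld is assumed to be the firewall in use, and 7000 and 9042 are the stock defaults for storage_port and native_transport_port, so adjust them if you have changed the file.

```shell
#!/bin/sh
# Print the active (uncommented) values of the properties listed above.
# The pattern allows leading spaces or a dash because seeds is nested
# inside the seed_provider section.
show_cluster_settings() {
  grep -E '^[[:space:]-]*(cluster_name|seeds|storage_port|listen_address|native_transport_port):' "$1"
}

# Open the inter-node and client ports in firewalld (assumed defaults).
# This function is defined but not called automatically.
open_cassandra_ports() {
  sudo firewall-cmd --permanent --add-port=7000/tcp
  sudo firewall-cmd --permanent --add-port=9042/tcp
  sudo firewall-cmd --reload
}

conf=/etc/cassandra/conf/cassandra.yaml
if [ -r "$conf" ]; then
  show_cluster_settings "$conf"
fi
```

Call open_cassandra_ports by hand once the printed values match what you expect.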
Change directory location
The following properties in the yaml file control the location of the directories:
- data_file_directories: One or more directories where the data files are located.
- commitlog_directory: The directory where the commit log file is located.
- saved_caches_directory: The directory where the saved caches are located.
- hints_directory: The directory where hints are located.
For performance reasons, if you have multiple disks, consider placing the commit log and data files on different disks.
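As a rough sketch of that layout, the script below creates the directories under two separate base paths and prints the matching yaml entries. The helper name is hypothetical, and the scratch directories created with mktemp stand in for real mount points on two different disks.

```shell
#!/bin/sh
# Create the Cassandra directories on two separate disks.
# $1: base path on the data disk, $2: base path on the commit log disk.
make_cassandra_dirs() {
  mkdir -p "$1/data" "$1/saved_caches" "$1/hints" "$2/commitlog"
}

# Demonstration on scratch directories; replace these with real mount points.
data_base=$(mktemp -d)
commitlog_base=$(mktemp -d)
make_cassandra_dirs "$data_base" "$commitlog_base"

# Print the cassandra.yaml entries that would point at these directories.
echo "data_file_directories:"
echo "    - $data_base/data"
echo "commitlog_directory: $commitlog_base/commitlog"
echo "saved_caches_directory: $data_base/saved_caches"
echo "hints_directory: $data_base/hints"
```

On a real installation the directories also need to be writable by the user the service runs as, and Cassandra must be restarted after the yaml file is changed.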
Setting environment variables
JVM-level settings (such as heap size) can be set in cassandra-env.sh. You can add any additional JVM command-line arguments to the JVM_OPTS environment variable. These arguments are passed to the Cassandra service at startup.
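For a small machine, a fragment along these lines could be placed in cassandra-env.sh. The sizes are illustrative values, not tested recommendations, and the extra flag is just one example of a standard HotSpot option.

```shell
# Fragment for cassandra-env.sh; the values below are illustrative only.
# MAX_HEAP_SIZE and HEAP_NEWSIZE should be set (or left unset) together.
MAX_HEAP_SIZE="1G"
HEAP_NEWSIZE="256M"

# Append an extra JVM flag; options already in JVM_OPTS are preserved.
JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
```

Restart the service after editing the file so that the new options take effect.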
The logger in use is logback. You can change the logging properties by editing logback.xml. By default, it logs at the INFO level to a file named system.log and at the DEBUG level to a file named debug.log. When running in the foreground, it also logs at the INFO level to the console.
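When troubleshooting, it is often enough to look at only the warnings and errors. The sketch below filters them out of the main log file. It assumes the default log location /var/log/cassandra/system.log and the default logback pattern, in which each line begins with the level. The function name is hypothetical.

```shell
#!/bin/sh
# Print only WARN and ERROR lines from a Cassandra log file.
# Assumes each log line starts with its level, as in the default pattern.
show_log_problems() {
  grep -E '^(WARN|ERROR)[[:space:]]' "$1"
}

log=/var/log/cassandra/system.log
if [ -r "$log" ]; then
  # Show the 20 most recent problems, if any.
  show_log_problems "$log" | tail -n 20
fi
```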
For client configuration, please refer to the official guide.