Install Apache Cassandra on CentOS 8 Linux

How do I install Apache Cassandra on a CentOS 8 Linux machine? Apache Cassandra is a free and open source NoSQL database management system designed to be distributed and highly available. Cassandra can process large amounts of data on many commodity servers without any single point of failure.

This guide walks you through installing Cassandra on CentOS 8. Once the installation is complete, we cover how to configure and tune Cassandra so that it can run on a machine with limited resources.

Features of Cassandra

Cassandra provides the Cassandra Query Language (CQL), an SQL-like language,
to create and update database schemas and to access data. CQL lets users
organize data within a cluster of Cassandra nodes using the following constructs:

  • Keyspace: Defines how a dataset is replicated, for example in which
    data centers and how many copies. Keyspaces contain tables.
  • Table: Defines the typed schema for a collection of partitions. New
    columns can be added to a Cassandra table with zero downtime.
    Tables contain partitions, partitions contain rows, and rows contain columns.
  • Partition: Defines the mandatory part of the primary key that every row
    in Cassandra must have. All performant queries supply the partition key
    in the query.
  • Row: A collection of columns identified by a unique primary key, which
    consists of the partition key and, optionally, additional clustering keys.
  • Column: A single typed datum that belongs to a row.
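
The hierarchy above might be sketched in CQL as follows, for use once a node is running. The keyspace `demo`, the table `sensor_readings`, and its columns are hypothetical names chosen for illustration. The script is written to a temporary file and passed to cqlsh only if cqlsh is installed.

```shell
# Write a small CQL script to a temporary file.
# The keyspace, table, and column names are hypothetical.
cql_file="$(mktemp)"
cat > "$cql_file" <<'EOF'
-- Keyspace: a single copy of the data, suitable only for a one-node test setup.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table: sensor_id is the partition key, reading_ts is a clustering key.
CREATE TABLE IF NOT EXISTS demo.sensor_readings (
  sensor_id   text,
  reading_ts  timestamp,
  temperature double,
  PRIMARY KEY ((sensor_id), reading_ts)
);

-- Row: identified by the full primary key (sensor_id, reading_ts).
INSERT INTO demo.sensor_readings (sensor_id, reading_ts, temperature)
  VALUES ('s1', '2020-03-04 22:30:00+0000', 21.5);

-- The query supplies the partition key, so it reads a single partition.
SELECT reading_ts, temperature FROM demo.sensor_readings WHERE sensor_id = 's1';
EOF

# Run the script only when cqlsh is installed; it needs a reachable node.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$cql_file" || echo "cqlsh could not run the script" >&2
fi
```

A replication factor of 1 is only reasonable for a single test node; a real cluster would normally keep more than one copy.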

Cassandra supports the following client drivers:

  • Java
  • Python
  • Ruby
  • C# / .NET
  • Node.js
  • PHP
  • C++
  • Scala
  • Clojure
  • Erlang
  • Go
  • Haskell
  • Rust
  • Perl
  • Elixir
  • Dart

Install Apache Cassandra on CentOS 8

Cassandra requires Java to run. At the time of writing, the required version is Java 8. If you want to use cqlsh, you also need a recent release of Python 2.7.

Step 1: Install Java 8 and Python:

sudo yum -y install epel-release python2 java-1.8.0-openjdk-devel

Confirm that Java and Python are installed.

$ java -version
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)

$ python2.7 --version
Python 2.7.16

Step 2: Install Apache Cassandra on CentOS 8

With Java and Python installed, add the Cassandra repository to the CentOS system.

sudo tee  /etc/yum.repos.d/cassandra.repo <
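
The repository definition is cut off in the command above. As a rough sketch of what such a file usually contains, the function below prints a yum repository entry for the Cassandra 3.11 series (the version shown later in the cqlsh banner). The function name is hypothetical, and the `baseurl` and `gpgkey` values are assumptions that should be checked against the Apache Cassandra download page before use.

```shell
# Print a yum repository definition for the Cassandra 3.11 series.
# The baseurl and gpgkey values are assumptions, not taken from this guide;
# verify them against the Apache Cassandra download page.
cassandra_repo_definition() {
  cat <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Preview the definition on standard output.
cassandra_repo_definition
```

To install the definition, the output could be piped to `sudo tee /etc/yum.repos.d/cassandra.repo`.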

Install Apache Cassandra using the following command.

sudo yum -y install cassandra

Create a systemd unit file for the Cassandra service.

sudo tee /etc/systemd/system/cassandra.service<
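
The unit file contents are cut off in the command above. The sketch below prints a minimal unit of the kind that would fit here. Only the description is taken from the status output shown in this guide; the function name is hypothetical, and the user, group, and `ExecStart` path are assumptions about how the RPM lays out the installation.

```shell
# Print a minimal systemd unit for Cassandra.
# Description matches the status output in this guide; User, Group, and the
# ExecStart path are assumptions about the RPM layout and should be verified.
cassandra_unit_definition() {
  cat <<'EOF'
[Unit]
Description=Apache Cassandra
After=network.target

[Service]
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# Preview the unit on standard output.
cassandra_unit_definition
```

The `-f` flag keeps Cassandra in the foreground, which lets systemd supervise the process directly. The output could be piped to `sudo tee /etc/systemd/system/cassandra.service`.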

Reload systemd, start the service, and enable it to start at boot time.

sudo systemctl daemon-reload
sudo systemctl start cassandra.service
sudo systemctl enable cassandra

Check service status:

$ systemctl status cassandra.service
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-03-04 22:24:31 EAT; 2s ago
 Main PID: 8758 (java)
    Tasks: 10 (limit: 26213)
   Memory: 3.9G
   CGroup: /system.slice/cassandra.service
           └─8758 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-Us>

Mar 04 22:24:31 cent8.localdomain systemd[1]: Started Apache Cassandra.

You can also verify that Cassandra is running with the following command.

$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  70 KiB     256          100.0%            0daf41fa-22e5-4471-bc00-9aed6f566235  rack1
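
For scripted health checks, the status listing can be reduced to a count of nodes in the Up/Normal state, which nodetool marks with `UN` in the first column. The helper below is a small sketch; the function name is hypothetical.

```shell
# Count the nodes reported as Up/Normal ("UN") in `nodetool status` output
# read from standard input.
count_up_normal_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Query the local node only when nodetool is installed.
if command -v nodetool >/dev/null 2>&1; then
  nodetool status | count_up_normal_nodes
fi
```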

To run queries against Cassandra, start the CQL shell with the following command.

$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>

  • The default location of the configuration files is /etc/cassandra.
  • The default locations of the log and data directories are /var/log/cassandra/ and /var/lib/cassandra, respectively.

Configure Cassandra

The main configuration file is /etc/cassandra/conf/cassandra.yaml. Its defaults are sufficient to run Cassandra on a single node. For a multi-node cluster, you will need to modify this file so that the nodes can reach each other and the cluster is properly tuned.

At a minimum, you should consider setting the following properties:

  • cluster_name: The name of the cluster.
  • seeds: A comma-separated list of the IP addresses of the cluster's seed nodes.
  • storage_port: You don't necessarily need to change this setting, but make sure no firewall is blocking this port.
  • listen_address: The IP address of the node. Other nodes use this address to communicate with this node, so it is important to change it.
  • native_transport_port: As with storage_port, make sure this port is not blocked by a firewall, because clients communicate with Cassandra on this port.
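
Before editing, it can help to see what these properties are currently set to and which ports would need to be opened. The two helpers below are a sketch with hypothetical function names. The first prints the active (uncommented) lines for the properties listed above, and the second prints, without running them, the firewall-cmd commands that would open the two ports on a host that uses firewalld.

```shell
# Print the active lines of cassandra.yaml that set the properties above.
# Commented-out lines are ignored. The default path is the one used in this guide.
show_cluster_settings() {
  conf="${1:-/etc/cassandra/conf/cassandra.yaml}"
  grep -E '^[[:space:]]*(-[[:space:]]*)?(cluster_name|seeds|storage_port|listen_address|native_transport_port):' "$conf"
}

# Print (but do not run) firewall-cmd commands that would open the
# inter-node and client ports configured in cassandra.yaml.
print_firewall_commands() {
  conf="${1:-/etc/cassandra/conf/cassandra.yaml}"
  for key in storage_port native_transport_port; do
    port="$(awk -v k="${key}:" '$1 == k { print $2; exit }' "$conf")"
    if [ -n "$port" ]; then
      echo "firewall-cmd --permanent --add-port=${port}/tcp"
    fi
  done
  echo "firewall-cmd --reload"
}
```

Calling `show_cluster_settings` with no argument reads the default file; the printed firewall commands would need to be run with sudo after review.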

Change directory location

The following settings in cassandra.yaml control where Cassandra stores its files.

  • data_file_directories: One or more directories where the data files are located.
  • commitlog_directory: The directory where the commit log file is located.
  • saved_caches_directory: The directory where the saved caches are located.
  • hints_directory: The directory where hints are located.

For performance reasons, if you have multiple disks, consider placing the commit log and data files on different disks.
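
As an illustration of the separate-disk layout, the sketch below prints a cassandra.yaml fragment for a given data directory and commit log directory. The function name and both mount points are hypothetical; the directories would also need to exist and be writable by the account that runs Cassandra.

```shell
# Print a cassandra.yaml fragment that keeps data files and the commit log
# in separate locations. Arguments: data directory, commit log directory.
directory_settings() {
  data_dir="$1"
  commitlog_dir="$2"
  printf 'data_file_directories:\n    - %s\n' "$data_dir"
  printf 'commitlog_directory: %s\n' "$commitlog_dir"
}

# Hypothetical mount points for two separate disks.
directory_settings /mnt/disk1/cassandra/data /mnt/disk2/cassandra/commitlog
```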

Setting environment variables

JVM-level settings, such as the heap size, are set in cassandra-env.sh. Any additional JVM command-line arguments can be added to the JVM_OPTS environment variable. These arguments are passed to Cassandra when it starts.
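
For a machine with limited memory, the relevant lines in cassandra-env.sh might look like the sketch below. The heap sizes are illustrative values, not recommendations, and the extra JVM flag is only an example of the append pattern. In the stock file, the two heap variables are present but commented out near the top, and they are meant to be set together.

```shell
# Heap settings: illustrative values for a small test machine.
# Set both variables together, in the place where the stock file
# has them commented out.
MAX_HEAP_SIZE="512M"
HEAP_NEWSIZE="128M"

# Additional JVM flags are appended to JVM_OPTS rather than replacing it.
JVM_OPTS="${JVM_OPTS:-} -Djava.net.preferIPv4Stack=true"
```

Restart the service after changing cassandra-env.sh so that the new settings take effect.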

Cassandra Logging

The logger in use is logback. You can change the logging properties by editing logback.xml. By default, Cassandra logs at the INFO level to a file named system.log and at the DEBUG level to a file named debug.log. When running in the foreground, it also logs at the INFO level to the console.
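
A quick way to see how noisy a node is would be to count log entries by level. The helper below is a sketch with a hypothetical function name. It assumes the default log pattern, in which each entry starts with the level name followed by a space, and the default log directory used in this guide.

```shell
# Count the entries at a given level in a Cassandra log file.
# Assumes each entry starts with the level name followed by a space.
# Arguments: level (for example INFO, WARN, ERROR), optional log file path.
count_log_level() {
  level="$1"
  log_file="${2:-/var/log/cassandra/system.log}"
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
  grep -c "^${level} " "$log_file" || true
}
```

For example, `count_log_level WARN` reports the number of warnings in system.log. Continuation lines such as stack traces do not start with a level name and are not counted.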

For client configuration, please refer to the official Cassandra guide.
