Install and configure Apache Kafka on Ubuntu 20.04

In the previous guide, we covered how to install Apache Kafka on CentOS 8. To stay consistent, this guide covers the installation on Ubuntu 20.04 (Focal Fossa). Many of the definitions were introduced there, but we repeat them here for convenience.

Apache Kafka is an open source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Source: Apache Kafka.

Let us unpack these terms step by step. An event records the fact that “something happened” in your business, captured in digital form.

Event streaming is the practice of capturing data in real time from event sources (producers) such as databases, sensors, mobile devices, cloud services, and software applications in the form of streams of events; storing these event streams durably for later retrieval; manipulating, processing, and reacting to the event streams in real time as well as retrospectively; and routing the event streams to different destination technologies (consumers) as needed. Source: Apache Kafka.

A producer is a program, application, or other entity that sends data to the Kafka cluster. A consumer sits on the other side and receives data from the Kafka cluster. A Kafka cluster consists of one or more Kafka brokers, which can be located on different servers.

Other terms you will encounter

  • Topic: A topic is the named category under which a specific stream of data is stored and published. For example, if you want to store all the data about the page that was clicked, you can give the topic a name, such as “Added Customers”.
  • Partitions: Each topic is divided into multiple partitions (“baskets”). When creating a topic, you need to specify the number of partitions, but you can increase the number of partitions later if needed. Each message is stored in its partition with an incremental ID called an offset.
  • Kafka broker: Each server with Kafka installed is called a broker. It hosts topics and their partitions.
  • Zookeeper: Zookeeper manages Kafka’s cluster state and configuration.
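
To make the idea of partitions and offsets concrete, here is a toy sketch in plain shell. It is not how Kafka stores data internally; it only models a partition as an append-only file in which the position of a message is its offset. The function names append_message and read_message and the file layout are made up for this illustration.

```shell
# Toy model only: one append-only file per partition, one message per line.
partition_dir=$(mktemp -d)

# append_message PARTITION MESSAGE
# Appends the message and prints the offset it was stored at.
append_message() {
  file="$partition_dir/partition-$1.log"
  touch "$file"
  # The number of lines already stored is the offset of the new message.
  offset=$(wc -l < "$file" | tr -d ' ')
  printf '%s\n' "$2" >> "$file"
  echo "$offset"
}

# read_message PARTITION OFFSET
# Prints the message stored at the given offset (offsets start at 0).
read_message() {
  sed -n "$(( $2 + 1 ))p" "$partition_dir/partition-$1.log"
}

append_message 0 "page clicked: /home"
append_message 0 "page clicked: /pricing"
read_message 0 1
```

The two append calls print the offsets 0 and 1, and the final call reads back the message stored at offset 1. Offsets are counted per partition, so a second partition would start again at 0.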

Apache Kafka use cases

Here are some applications that can take advantage of Apache Kafka:

  • Message broker: Compared with most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good fit for large-scale message processing applications.
  • Website activity tracking
  • Log aggregation: Kafka abstracts away the details of log files and presents log or event data as a stream of messages, which is a cleaner abstraction.
  • Stream processing: Capture data from event sources in real time; store these event streams durably for later retrieval; and route event streams to different destination technologies as needed.
  • Event sourcing: This is an application design style in which state changes are recorded as a time-ordered sequence of records.
  • Commit log: Kafka can be used as an external commit log for distributed systems. This log helps to replicate data between nodes and acts as a resynchronization mechanism for failed nodes to recover their data.
  • Metrics: This involves aggregating statistics from distributed applications to produce a centralized feed of operational data.

Install Apache Kafka on Ubuntu 20.04

Apache Kafka requires Java to run, so we will prepare the server and install all the necessary components.

Step 1: Prepare the Ubuntu server

Update the new Ubuntu 20.04 server and install Java as shown below.

sudo apt update && sudo apt upgrade
sudo apt install default-jre wget git unzip -y
sudo apt install default-jdk -y

Step 2: Get Kafka on Ubuntu 20.04

With Java installed, let us now download Kafka. Go to the Kafka Downloads page, find the latest release, and pick an archive under “Binary downloads” (the binary package, not the source). Following the link recommended by Kafka takes you to a page with the download URL that you can pass to wget.

cd ~
wget https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz
sudo mkdir /usr/local/kafka-server && cd /usr/local/kafka-server
sudo tar -xvzf ~/kafka_2.13-2.6.0.tgz --strip 1

Because the --strip 1 flag drops the archive’s top-level directory, the contents are extracted directly into /usr/local/kafka-server/.
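
If the effect of stripping one path component is unclear, the following self-contained sketch reproduces it on a throwaway archive in a temporary directory. The directory and file names (kafka_demo, hello.sh) are invented for the demonstration, and it uses the spelled-out option name --strip-components=1.

```shell
# Build a small archive whose contents sit under one top-level directory,
# the same shape as the Kafka tarball.
demo=$(mktemp -d)
mkdir -p "$demo/src/kafka_demo/bin" "$demo/out"
echo 'echo hello' > "$demo/src/kafka_demo/bin/hello.sh"
tar -czf "$demo/demo.tgz" -C "$demo/src" kafka_demo

# Extract while dropping the first path component (kafka_demo/).
tar -xzf "$demo/demo.tgz" -C "$demo/out" --strip-components=1

# The output directory now holds bin/ directly, not kafka_demo/bin/.
ls "$demo/out"
```

Without the option, the same extraction would create a kafka_demo directory inside the target, and the paths used in the unit files later in this guide would not match.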

Step 3: Create Kafka and Zookeeper systemd unit files

Systemd unit files for Kafka and Zookeeper make common operations such as starting, stopping, and restarting the services straightforward, and they let you manage Kafka the same way as every other service on the system.

Let’s start with the Zookeeper service:

$ sudo vim /etc/systemd/system/zookeeper.service

[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Then proceed to the Kafka service. Make sure the JAVA_HOME value matches the Java installation on your server; otherwise Kafka will not start.

$ sudo vim /etc/systemd/system/kafka.service

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
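
If you are unsure which path belongs in JAVA_HOME, one way to find it is to resolve the java binary through its symlinks and drop the trailing /bin/java. The helper name detect_java_home is made up for this sketch, and it assumes that java is on the PATH and that readlink supports -f, as it does on Ubuntu.

```shell
# detect_java_home: print the directory that JAVA_HOME should point to,
# derived from the real location of the java binary.
detect_java_home() {
  java_bin=$(command -v java) || return 1
  # Follow the symlink chain to the actual binary inside the JVM directory.
  java_real=$(readlink -f "$java_bin")
  # Strip the trailing /bin/java to get the installation directory.
  echo "${java_real%/bin/java}"
}

detect_java_home || echo "java is not on PATH"
```

Compare the printed path with the Environment= line in kafka.service and adjust the unit file if they differ.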

After adding the configuration, reload the systemd daemon to apply the changes, and then start the service. You can also check its status.

sudo systemctl daemon-reload
sudo systemctl enable --now zookeeper
sudo systemctl enable --now kafka
sudo systemctl status kafka zookeeper
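
Beyond systemctl status, you can check that both services accept TCP connections. The sketch below assumes Zookeeper on port 2181 (as configured in this guide) and Kafka on port 9092, which is the default listener port and an assumption here if you changed server.properties. The helper name port_open is made up, and it relies on bash's /dev/tcp feature, so run it with bash.

```shell
# port_open HOST PORT: succeeds if a TCP connection can be opened.
# Uses bash's /dev/tcp pseudo-device inside a subshell so the
# descriptor is closed again as soon as the check finishes.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

for service in zookeeper:2181 kafka:9092; do
  name=${service%%:*}
  port=${service##*:}
  if port_open localhost "$port"; then
    echo "$name is listening on port $port"
  else
    echo "$name is NOT listening on port $port"
  fi
done
```

If either service is reported as not listening, inspect its logs with journalctl -u zookeeper or journalctl -u kafka.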

Step 4: Install Cluster Manager for Apache Kafka (CMAK) | Kafka Manager

CMAK (formerly known as Kafka Manager) is an open source tool for managing Apache Kafka clusters developed by Yahoo.

cd ~
git clone https://github.com/yahoo/CMAK.git

Step 5: Configure CMAK on Ubuntu 20.04

The minimum configuration is the Zookeeper host(s) that CMAK (previously known as Kafka Manager) uses to store its state. The setting lives in the application.conf file in the conf directory. Set cmak.zkhosts="my.zookeeper.host.com:2181"; to specify multiple Zookeeper hosts, separate them with commas, for example cmak.zkhosts="my.zookeeper.host.com:2181,other.zookeeper.host.com:2181". The host names can also be IP addresses.

$ vim ~/CMAK/conf/application.conf
cmak.zkhosts="localhost:2181"
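
If you prefer to script this change instead of editing the file by hand, a sed substitution can do it. The sketch below runs against a temporary stand-in file, not the real application.conf, so it is safe to try anywhere; the helper name set_zkhosts is made up, and it assumes the setting appears as a quoted value on a line that starts with cmak.zkhosts=.

```shell
# set_zkhosts FILE HOSTS
# Replace the quoted value of cmak.zkhosts in FILE, keeping a .bak copy.
set_zkhosts() {
  sed -i.bak "s|^cmak\.zkhosts=\"[^\"]*\"|cmak.zkhosts=\"$2\"|" "$1"
}

# Demonstration on a stand-in file with the placeholder value from above.
conf=$(mktemp)
printf '%s\n' \
  '# stand-in for ~/CMAK/conf/application.conf' \
  'cmak.zkhosts="my.zookeeper.host.com:2181"' > "$conf"

set_zkhosts "$conf" "localhost:2181"
grep '^cmak.zkhosts' "$conf"
```

To apply it for real, call set_zkhosts with the path ~/CMAK/conf/application.conf, then review the file before building.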

After setting the Zookeeper host, run the following command to build a zip file that can be used to deploy the application. sbt downloads dependencies and compiles the project, so expect a lot of terminal output and give it some time to finish.

cd ~/CMAK/
./sbt clean dist

After all operations are complete, you should see a message similar to the following:

[info] Your package is ready in /home/tech/CMAK/target/universal/cmak-3.0.0.5.zip

Switch to the directory where the zip file is located and unzip:

$ cd ~/CMAK/target/universal
$ unzip cmak-3.0.0.5.zip
$ cd cmak-3.0.0.5

Step 5: Start the service and access it

After unzipping the resulting zip file and changing into the extracted directory as shown in the previous step, you can run the service like this:

$ cd ~/CMAK/target/universal/cmak-3.0.0.5
$ bin/cmak

By default, CMAK listens on port 9000, so open your favorite browser and point it to http://ip-or-domain-name-of-server:9000. If a firewall is running, allow external access to this port:

sudo ufw allow 9000

Once CMAK is up, its web interface loads in the browser.

On first visit there are no clusters available, so we will create one. Click the “Cluster” drop-down list and select “Add Cluster”.

A form is displayed. Fill in the required details (cluster name, Zookeeper hosts, and so on). If you have multiple Zookeeper hosts, separate them with commas. You can fill in other details as needed.

[Screenshot: sample field values for the new cluster]

When you are satisfied with the values, scroll down and click “Save”.

Step 6: Add a sample topic

Apache Kafka ships with several shell scripts for working with a cluster. Let’s first create an example topic called “ComputingForGeeksTopic” with a single partition and a single replica. Leave CMAK running, open a new terminal, and issue the following command:

cd /usr/local/kafka-server
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic ComputingForGeeksTopic

Created topic ComputingForGeeksTopic.
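
To confirm that the topic actually carries messages, you can describe it and then send one message through the console producer and read it back with the console consumer, all of which ship in Kafka's bin directory. The sketch below assumes the install path used in this guide and a broker listening on localhost:9092 (the default, not something configured above); the function name smoke_test is made up. It does nothing except print an error if the Kafka scripts are not found.

```shell
KAFKA_HOME="${KAFKA_HOME:-/usr/local/kafka-server}"
TOPIC="ComputingForGeeksTopic"
BROKER="localhost:9092"   # assumption: default broker listener

smoke_test() {
  if [ ! -x "$KAFKA_HOME/bin/kafka-topics.sh" ]; then
    echo "Kafka scripts not found in $KAFKA_HOME" >&2
    return 1
  fi

  # Show partition count, replication factor, and leader for the topic.
  "$KAFKA_HOME/bin/kafka-topics.sh" --describe \
    --zookeeper localhost:2181 --topic "$TOPIC" || return 1

  # Publish a single test message read from standard input.
  echo "hello from the console producer" |
    "$KAFKA_HOME/bin/kafka-console-producer.sh" \
      --bootstrap-server "$BROKER" --topic "$TOPIC" || return 1

  # Read one message from the start of the topic, giving up after 10 seconds.
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --bootstrap-server "$BROKER" --topic "$TOPIC" \
    --from-beginning --max-messages 1 --timeout-ms 10000
}
```

Run smoke_test on the Kafka server after pasting the block into a shell. If the consumer prints the test message, the broker, the topic, and Zookeeper are working together.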

Confirm that the topic shows up in the CMAK interface. To do this, in the cluster view, click Topic > List.

Step 7: Create a topic in the CMAK interface

Another, easier way to create topics is through the CMAK web interface. Click the “Topic” drop-down list and select “Create”. You will need to enter the details of the new topic (replication factor, partitions, and so on). Fill out the form and click “Create” below it.

You will see a message indicating that your topic has been created, along with a link to view it. Click the link to see your new topic.

Conclusion

Apache Kafka is now installed on your Ubuntu 20.04 server. Note that Kafka can be installed on multiple servers to form a cluster. Thank you for reading to the end, and for your continued support. To learn more, see the documentation for Apache Kafka and for CMAK, the Cluster Manager for Apache Kafka.

Find other amazing guides below:

Install and configure Apache Kafka with CMAK on CentOS 8
