Apache Kafka is an open-source distributed event streaming platform. This tutorial gives you an introduction to Kafka from a beginner’s perspective.
Recently we wrote an article about “Kafka vs Redis Pub-Sub” which explains about the differences between Kafka and Redis Pub-Sub. In that article we shortly explained about the Kafka itself. So, now this article gives you more insight into the Apache Kafka and gives you all the necessary introduct to kick start with Kafka. So, let’s get started.
- Introduction to Apache Kafka
- What is a Messaging System?
- Fundamental Terms
- Installation Steps
- Create a Topic
- Write Message into Topic
- Read Messages from Topic
Introduction to Apache Kafka
Apache Kafka was started at LinkedIn and later became an open-source Apache project. Kafka is developed in Scala and Java. Apache Kafka is a publish-subscribe messaging system that let exchanging of data between applications, servers, and processors as well. It is fast, scalable, and distributed by design. It is capable of handling millions of data or messages per second. It works as a mediator between the source (producer) system and the target (consumer) system. Organizations such as NETFLIX, UBER, and thousands of such firms make use of Apache Kafka. In comparison to other messaging systems, Kafka has better throughput, built-in partitioning, replication, and inherent fault tolerance. This makes it a good fit for large-scale message processing applications.
What is a Messaging System?
A messaging system is a simple exchange of messages between two or more persons, devices, etc. It is responsible for data transfer from one application to another, so the applications can focus on data, but not worry about how to share it.
One type of messaging system is “Point to Point Messaging System”. In this system messages are stored in a queue when sent by the producer. Only one consumer can consume a message at any given time. And, once the message is consumed it’ll be removed from the queue. One of the example of this system is Order Processing System. Where only one process can proces the order.
Another type of messaging system is the pub-sub messaging system that allows a sender to send the message and a receiver to read that message. In Apache Kafka, a sender is known as a producer who publishes messages, and a receiver is known as a consumer who consumes that message by subscribing to it. An example of this system is your cable tv which publishes many channels, and anyone can subscribe to their choice of channels and get them whenever their subscribed channels are available.
Before we talk more about Apache Kafka, you must understand some of the terms.
- Topics: A stream of messages belonging to a particular category is called a topic.
- Partition: Topics may have many partitions, so they can handle an arbitrary amount of data.
- Partition offset: Each partitioned message has a unique sequence id called “offset”.
- Replicas of partition: Replicas are backups of a partition. They are used to prevent data loss.
- Brokers: are responsible for maintaining published data.
- Kafka Cluster: Kafka’s having more than one broker is called Kafka cluster.
- Producers: are the publishers of the messages to one or more topics.
- Consumers: are the reader of the data from brokers. They subscribe to multiple topics and consume whenever there is a message in the topic.
- Leader: is the node responsible for all reads and writes for any given partition.
- Follower: Node which follows leader instructions are called followers.
Step 1 – Verifying Java Installation
Hopefully you have already installed java on your machine right now, so you just verify it using the following command.
$ java -version
If java is successfully installed on your machine, you could see the version of the installed Java. If not then you should first install Java.
Step 2 – ZooKeeper Framework Installation
Visit the following link and download the latest version of ZooKeeper.
Once you are done with installation and configuration you can start the ZooKeeper with following command
$ bin/zkServer.sh start
And then use following command to connect to ZooKeeper from the cli
$ bin/zkCli.sh -server 127.0.0.1:2181
once connected you should see somthing like this
Connecting to localhost:2181 log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper). log4j:WARN Please initialize the log4j system properly. Welcome to ZooKeeper! JLine support is enabled [zkshell: 0]
Step 3 – Apache Kafka Installation
Download the latest Kafka release and extract it from following url
$ tar -xzf kafka_2.13-2.7.0.tgz $ cd kafka_2.13-2.7.0
Run the following commands in order to start all services in the correct order:
# Start the ZooKeeper service # Note: Soon, ZooKeeper will no longer be required by Apache Kafka. $ bin/zookeeper-server-start.sh config/zookeeper.properties
Open another terminal session and run:
# Start the Kafka broker service $ bin/kafka-server-start.sh config/server.properties
Create a Topic
To write your messages or events into topics you must first create a topic. To create a topic open your terminal and run the following command:
$ bin/kafka-topics.sh --create --topic weather-updates --bootstrap-server localhost:9092
Now we have created a topic with the name “weather-updates” where we’ll publish information about weather and consumers will consume those messages.
$ bin/kafka-topics.sh --describe --topic weather-updates --bootstrap-server localhost:9092 Topic:weather-updates PartitionCount:1 ReplicationFactor:1 Configs: Topic: quickstart-events Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Write Message into Topic
Run the console producer client to write a few events into your topic. By default, each line you enter will result in a separate event being written to the topic.
$ bin/kafka-console-producer.sh --topic weather-updates --bootstrap-server localhost:9092 Weather for New York is Cloudy Weather for Houston is Rainy
Read Messages from Topic
Open another terminal session and run the console consumer client to read the events you just created:
$ bin/kafka-console-consumer.sh --topic weather-updates --from-beginning --bootstrap-server localhost:9092 Weather for New York is Cloudy Weather for Houston is Rainy
Feel free to experiment: for example, switch back to your producer terminal (previous step) to write additional events, and see how the events immediately show up in your consumer terminal.
In this tutorial we have learned about the basics of Apache Kafka. We installed the Kafka and then configured it. We also learned how to create topics in Kafka, send messages to topics and how to read messages from the topics. I hope this article was useful for you. Let’s meet in next article.
You may find following links relevant: