It functions as a pub/sub system in which producer applications publish messages and consumer systems subscribe to them. Apache Kafka lets you adopt a loosely coupled architecture between the parts of your system that produce data and the parts that consume it, which makes the system simpler to design and manage. In the setup covered here, Kafka relies on Zookeeper for metadata management and for synchronizing the different elements of the cluster.
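To make the loose coupling concrete, here is a minimal in-memory sketch of the pub/sub idea. It is not Kafka's API: the ToyBroker class and the topic name "clicks" are hypothetical, and a real Kafka consumer pulls messages from a persisted log rather than being called directly as in this toy.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (not Kafka's API)."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        """Register a consumer callback for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver a message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = ToyBroker()
analytics_inbox: List[str] = []
audit_inbox: List[str] = []

# Two independent consumers subscribe to the same topic.
broker.subscribe("clicks", analytics_inbox.append)
broker.subscribe("clicks", audit_inbox.append)

# The producer only knows the topic name, not who is listening.
delivered = broker.publish("clicks", "button:signup")
```

The producer never references either consumer, so consumers can be added or removed without touching the producing code.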
Features of Apache Kafka
Apache Kafka has grown popular for, among other reasons, being:
- Scalable, through clusters and partitions
- Fast, capable of performing 2 million writes per second
- Ordered, maintaining the order in which messages are sent (within a partition)
- Reliable, through its system of replicas
- Upgradable with zero downtime
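The link between partitions and ordering can be illustrated with a small sketch. It assumes records are routed to a partition by hashing their key; zlib.crc32 is only a stand-in hash chosen for this example, not Kafka's own partitioner, and the record keys and values are made up.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (key, value)


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition. crc32 is a stand-in hash for illustration."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_all(records: List[Record], num_partitions: int) -> Dict[int, List[Record]]:
    """Append each record to the partition chosen for its key, in arrival order."""
    partitions: Dict[int, List[Record]] = defaultdict(list)
    for key, value in records:
        partitions[pick_partition(key, num_partitions)].append((key, value))
    return dict(partitions)


records = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]
log = append_all(records, num_partitions=3)

# All records with the same key land in one partition, so their order survives.
user1_events = [value for part in log.values() for key, value in part if key == "user-1"]
```

Because every "user-1" record goes to the same partition, the events for that key keep the order in which they were sent, even though the work is spread across several partitions.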
Now, let’s explore some of the common use cases of Kafka.
Common Use Cases of Apache Kafka
Kafka is often used for processing big data, for recording and aggregating events such as button clicks for analytics, and for combining logs from different parts of a system into one central location. It also enables communication between different applications in a system and real-time processing of data from IoT devices. Now, let’s check out the detailed steps to install Kafka on Windows and Linux.
Installing Kafka on Windows
To install Apache Kafka on Windows, first check whether Java is installed on your machine. Open the command prompt in Administrator mode and enter the command: java -version. If Java is installed, you should see the version number of the currently installed JDK. If you get an error message saying the command was not recognized, Java is not installed and you need to install it.
To install Java, head to Adoptium.net and click the download button. This should download the Java installer file. When the download is complete, run the installer to open the installation prompt. Press Next repeatedly to accept the default options, and installation should begin. Verify the installation by closing the command prompt, opening a new one in Administrator mode, and entering the same command again. This time, you should see the JDK version you just installed.
With Java installed, we can begin installing Kafka. First, go to the Kafka website and follow the download link, which should take you to the Downloads page. Download the latest binaries available. This will download the Kafka scripts and binaries packaged in a .tgz file. After downloading, you must extract the files from the .tgz archive. To extract them, I will use WinZip, which can be downloaded from the WinZip website. After extracting the archive, move the extracted folder to C:\ so that its path becomes C:\kafka.
Then open the command prompt in Administrator mode, navigate to the Kafka directory, and start Zookeeper by running the zookeeper-server-start.bat file with zookeeper.properties as the configuration file. With Zookeeper running, we need to add the location of the wmic executable, which Kafka uses, to the system PATH. After this, open another command prompt session in Administrator mode, navigate to the C:\kafka folder, and start the Apache Kafka server by running the kafka-server-start.bat file with server.properties as the configuration file. With this, Kafka should be running. You can customize server properties, such as where the logs are written, in the server.properties file.
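The Java check and the two start commands can also be sketched in Python. This is an illustration under assumptions: Kafka is assumed to live at C:\kafka as in the steps above, the .bat scripts are assumed to sit in the bin\windows folder of the standard distribution, and the helper names java_version and start_command are hypothetical. The sketch only builds the command lines and does not launch anything.

```python
import shutil
import subprocess
from pathlib import PureWindowsPath
from typing import List, Optional

# Install location used in this guide (adjust if you extracted elsewhere).
KAFKA_HOME = PureWindowsPath(r"C:\kafka")


def java_version() -> Optional[str]:
    """Return the first line printed by `java -version`, or None if Java is unavailable."""
    if shutil.which("java") is None:
        return None  # "java" is not on PATH
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    # `java -version` writes its report to stderr rather than stdout.
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else None


def start_command(script: str, config: str) -> List[str]:
    """Build [script path, config path]; bin\\windows is an assumed layout."""
    return [
        str(KAFKA_HOME / "bin" / "windows" / script),
        str(KAFKA_HOME / "config" / config),
    ]


# Zookeeper must be started first, then the Kafka server.
zookeeper_cmd = start_command("zookeeper-server-start.bat", "zookeeper.properties")
kafka_cmd = start_command("kafka-server-start.bat", "server.properties")
```

Each resulting list could be passed to subprocess.Popen from its own process, mirroring the two separate command prompt sessions used above.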
Installing Kafka on Linux
First, ensure that your system is up to date by updating all packages. Next, check whether Java is installed on your machine by running java -version. If Java is installed, you will see the version number. If it is not, you can install it using apt.
After this, we can install Apache Kafka by downloading the binaries from the website. Open your terminal and navigate to the folder where the download was saved. In my case, that is the Downloads folder. Once there, extract the downloaded archive using tar, navigate to the extracted folder, and list its directories and files.
Once in the folder, start a Zookeeper server by running the zookeeper-server-start.sh script located in the bin directory of the extracted folder. The script requires a Zookeeper configuration file. The default file is called zookeeper.properties and is located in the config subdirectory. So, to start the server, run bin/zookeeper-server-start.sh with config/zookeeper.properties as its argument.
With Zookeeper running, we can start the Apache Kafka server. The kafka-server-start.sh script is also located in the bin directory, and it also expects a configuration file. The default one is server.properties, stored in the config directory, so run bin/kafka-server-start.sh with config/server.properties as its argument. This should get Apache Kafka running. Inside the bin directory, you will find many scripts for tasks such as creating topics and managing producers and consumers. You can also customize server properties in the server.properties file.
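As a rough illustration of the extract-and-start steps, the sketch below uses Python's tarfile module in place of the tar command and then builds the two start commands. The function names extract_archive and start_commands are hypothetical, the archive is assumed to contain a single top-level folder, and nothing is actually launched.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_archive(archive: Path, destination: Path) -> Path:
    """Extract a .tgz archive and return the path of its single top-level folder."""
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        top_level = set()
        for name in tar.getnames():
            parts = Path(name).parts
            if parts:
                top_level.add(parts[0])
        # Use the safer "data" extraction filter where this Python version supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(destination, **kwargs)
    if len(top_level) != 1:
        raise ValueError("expected exactly one top-level folder in the archive")
    return destination / top_level.pop()


def start_commands(kafka_home: Path) -> List[List[str]]:
    """Return the Zookeeper command followed by the Kafka command, in start order."""
    bin_dir = kafka_home / "bin"
    config_dir = kafka_home / "config"
    return [
        [str(bin_dir / "zookeeper-server-start.sh"), str(config_dir / "zookeeper.properties")],
        [str(bin_dir / "kafka-server-start.sh"), str(config_dir / "server.properties")],
    ]
```

For example, calling extract_archive on the downloaded .tgz in your Downloads folder returns the extracted Kafka folder, and passing that folder to start_commands gives the two command lines to run, each in its own terminal.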
Final Words
Next, you can learn data processing with Kafka and Spark.