Why Kafka? A Developer-Friendly Guide to Event-Driven Architecture

What is Kafka?
Kafka is an open-source distributed event streaming platform designed for handling real-time data feeds.
Originally developed at LinkedIn and later open-sourced under the Apache Software Foundation, Kafka is now widely used for building high-throughput, fault-tolerant, and scalable data pipelines, real-time analytics, and event-driven architectures.
What Problem Does Kafka Solve?
Before Kafka, traditional message queues like RabbitMQ and ActiveMQ were widely used, but they had limitations in handling massive, high-throughput real-time data streams.
Kafka was designed to address these issues by providing:
- Large-scale data handling – Kafka is optimized for ingesting, storing, and distributing high-volume data streams across distributed systems.
- Fault tolerance – Kafka replicates data across multiple nodes, ensuring that even if a broker fails, data remains available.
- Durability – Messages persist on disk, allowing consumers to replay events when needed.
- Support for event-driven architecture – It enables asynchronous communication between microservices, making it ideal for modern cloud applications.
When to Use Kafka
Kafka is the right choice when you need:
- High-throughput, real-time data processing – Ideal for log processing, financial transactions, and IoT data streams.
- Microservices decoupling – Kafka acts as an intermediary, allowing microservices to communicate asynchronously without direct dependencies.
- Event-driven systems – If your architecture revolves around reacting to changes (e.g., a user event triggering multiple downstream actions), Kafka is a solid choice.
- Reliable message delivery with persistence – Unlike traditional message queues, which typically delete messages once they are consumed, Kafka retains messages for a configurable retention period, ensuring durability and replayability.
- Scalability and fault tolerance – Kafka’s distributed nature allows it to scale horizontally while maintaining fault tolerance through replication.
How Kafka Works
Kafka consists of several key components:
1. Message
A message is the smallest unit of data in Kafka.
It can be a JSON object, a string, or any binary data.
Messages may have an associated key; Kafka hashes the key to choose a partition, so messages with the same key always land in the same partition.
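To make that concrete, here is a rough sketch of a single message as the kafkajs client (used later in this guide) represents it; the key, payload, and header values are invented for illustration:

// A message is just a key/value pair plus optional headers.
// Messages that share a key are routed to the same partition.
const message = {
  key: "user-42",                              // optional; drives partition routing
  value: JSON.stringify({ action: "signup" }), // payload: a string or Buffer
  headers: { source: "web" },                  // optional metadata
};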
2. Topic
A topic is a logical channel where messages are sent by producers and read by consumers. Topics help categorize messages (e.g., logs, transactions, orders).
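As an illustrative sketch (the topic name and settings are made up), a topic can also be created programmatically with the kafkajs admin client rather than relying on auto-creation:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "topic-admin", brokers: ["localhost:9092"] });
const admin = kafka.admin();

async function createOrdersTopic() {
  await admin.connect();
  // Create a hypothetical "orders" topic with 3 partitions on a single broker.
  await admin.createTopics({
    topics: [{ topic: "orders", numPartitions: 3, replicationFactor: 1 }],
  });
  await admin.disconnect();
}

createOrdersTopic().catch(console.error);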
3. Producer
A producer is a Kafka client that publishes messages to a topic. Messages can be sent in three ways:
- Fire and forget – The producer sends the message without waiting for confirmation, ensuring maximum speed but risking data loss.
- Synchronous send – The producer waits for an acknowledgment from Kafka before proceeding, ensuring reliability but adding latency.
- Asynchronous send – The producer sends the message and registers a callback that runs when the broker responds, offering a balance between speed and reliability.
Kafka also allows configuring acknowledgments (acks) to balance consistency and performance; both the send styles above and the acks setting are sketched in code after this list:
- ACK 0 – No acknowledgment required (fastest but riskier).
- ACK 1 – The message is acknowledged when the leader broker receives it (faster but less safe).
- ACK All – The message is acknowledged only when all in-sync replicas confirm receipt (slowest but safest).
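The following kafkajs sketch puts the send styles and the acks setting together; the topic name and payloads are invented, and acks is expressed as 0, 1, or -1 ("all") on each send call:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "demo-producer", brokers: ["localhost:9092"] });
const producer = kafka.producer();

async function demo() {
  await producer.connect();

  // 1. Fire and forget: don't await the result; fastest, but failures go unnoticed.
  const fireAndForget = producer.send({
    topic: "orders",
    acks: 0, // no acknowledgment requested
    messages: [{ key: "user-42", value: "order-created" }],
  });

  // 2. Synchronous-style send: await the acknowledgment before moving on.
  await producer.send({
    topic: "orders",
    acks: -1, // wait for all in-sync replicas (slowest, safest)
    messages: [{ key: "user-42", value: "order-paid" }],
  });

  // 3. Asynchronous send: continue immediately and react in a callback.
  const asyncSend = producer
    .send({
      topic: "orders",
      acks: 1, // leader-only acknowledgment (the middle ground)
      messages: [{ key: "user-42", value: "order-shipped" }],
    })
    .then((meta) => console.log("delivered:", meta))
    .catch(console.error);

  // Let the non-awaited sends finish before disconnecting.
  await Promise.all([fireAndForget, asyncSend]);
  await producer.disconnect();
}

demo().catch(console.error);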
Producer Optimizations
- Message Compression & Batching – Kafka producers can batch and compress messages before sending them to brokers. This improves throughput and reduces disk usage but increases CPU overhead.
- Avro Serializer/Deserializer – Using Avro instead of JSON requires defining schemas upfront, but it improves performance and reduces storage consumption.
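For example, with kafkajs (topic names and payloads invented for illustration), compression is a per-request option and several topics can be batched into a single request; finer-grained batching knobs such as batch.size and linger.ms live in the Java producer's configuration rather than in kafkajs:

const { Kafka, CompressionTypes } = require("kafkajs");

const kafka = new Kafka({ clientId: "batch-producer", brokers: ["localhost:9092"] });
const producer = kafka.producer();

async function sendCompressedBatch() {
  await producer.connect();
  // Group messages for several topics into one GZIP-compressed request.
  await producer.sendBatch({
    compression: CompressionTypes.GZIP,
    topicMessages: [
      { topic: "logs", messages: [{ value: "app started" }, { value: "user login" }] },
      { topic: "metrics", messages: [{ value: JSON.stringify({ cpu: 0.42 }) }] },
    ],
  });
  await producer.disconnect();
}

sendCompressedBatch().catch(console.error);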
4. Partition
Kafka topics are divided into partitions, which allow for parallel processing and scalability.
Messages in a partition are ordered and immutable.
5. Consumer
A consumer reads messages from partitions and keeps track of its position using an offset.
Consumers can reset offsets to reprocess older messages.
Kafka consumers work on a polling model, meaning they continuously request data from the broker rather than the broker pushing data to them.
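A minimal kafkajs consumer sketch (topic and group names are invented) that subscribes, polls under the hood, and logs each message together with its offset:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "email-service", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "email-notifications" });

async function run() {
  await consumer.connect();
  // fromBeginning replays the topic from the start when the group has no committed offset.
  await consumer.subscribe({ topic: "orders", fromBeginning: true });

  // kafkajs polls the broker in a loop and invokes this handler for every message.
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log(`${topic}[${partition}] @ ${message.offset}: ${message.value.toString()}`);
    },
  });
}

run().catch(console.error);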
Consumer Optimizations
Partition Assignment Strategies – how partitions are distributed across the consumers in a group:
- Range – each consumer is assigned a consecutive block of partitions.
- Round Robin – partitions are distributed evenly across consumers.
- Sticky – tries to minimize assignment changes during rebalancing.
- Cooperative Sticky – like Sticky, but rebalances incrementally so unaffected consumers keep processing.
Batch Size Configuration – Consumers can define how many records or how much data should be retrieved per poll cycle (see the configuration sketch below).
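Here is a rough kafkajs sketch of the fetch-size knobs (the group name and sizes are invented). Note that kafkajs uses a round-robin style assigner by default; the Sticky and Cooperative Sticky strategies are selected in the Java client via partition.assignment.strategy rather than through these options:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "order-processor", brokers: ["localhost:9092"] });

// Fetch sizing: how much data the consumer asks for per poll, and how long the
// broker may hold the request open when not enough data has accumulated yet.
const consumer = kafka.consumer({
  groupId: "order-processing",
  minBytes: 1024,                    // wait until at least 1 KB is available
  maxBytes: 5 * 1024 * 1024,         // cap a whole fetch response at 5 MB
  maxBytesPerPartition: 1024 * 1024, // cap any single partition at 1 MB per fetch
  maxWaitTimeInMs: 500,              // but answer within 500 ms regardless
});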
6. Consumer Group
A consumer group is a set of consumers that work together to process messages from a topic.
Kafka ensures that a single partition is consumed by only one consumer within a group, preserving ordering within each partition.
7. Offset Management
When a consumer reads a message, it updates its offset – the position of the last processed message. Offsets can be committed in two ways (a manual-commit sketch follows this list):
- Auto-commit – Kafka automatically commits the offset at regular intervals.
- Manual commit – The application explicitly commits the offset, either synchronously or asynchronously.
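A manual-commit sketch with kafkajs (topic and group names are invented); autoCommit is disabled so the application stays in control, and committing offset + 1 marks the current message as processed:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "db-writer", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "db-writers" });

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: "orders", fromBeginning: true });

  await consumer.run({
    autoCommit: false, // take over offset management from the library
    eachMessage: async ({ topic, partition, message }) => {
      // ... persist the message to the database here ...
      // Commit the next offset to read, i.e. the current offset + 1.
      await consumer.commitOffsets([
        { topic, partition, offset: (Number(message.offset) + 1).toString() },
      ]);
    },
  });
}

run().catch(console.error);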
8. Broker
A broker is a Kafka server that stores messages, assigns offsets, and handles client requests.
Multiple brokers form a Kafka cluster for scalability and fault tolerance.
9. Zookeeper
Zookeeper manages metadata, tracks brokers, and handles leader elections.
However, newer Kafka versions replace ZooKeeper with KRaft, Kafka's built-in Raft-based metadata quorum, and the ZooKeeper dependency is being phased out.
Example: Kafka in Action
To understand Kafka better, let's look at a simple example where a producer sends messages to a topic, and two different consumers process those messages separately: one simulating an email notification service and the other storing messages in a database.
Setup Kafka (docker-compose.yml)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: always
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: always
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_INTERNAL://kafka:29092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,PLAINTEXT_INTERNAL://0.0.0.0:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
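With this file saved as docker-compose.yml, the local stack can be started with docker compose up -d (or docker-compose up -d on older installs); the broker is then reachable from the host at localhost:9092.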
Producer Code (producer.js)
const { Kafka } = require("kafkajs");

const kafka = new Kafka({
  clientId: "family-producer",
  brokers: ["localhost:9092"],
});

const producer = kafka.producer();

async function sendMessage() {
  await producer.connect();
  console.log("