Combining Kafka and ZooKeeper in a Microservice Architecture

Apache Kafka and ZooKeeper are two powerful tools used extensively in distributed systems. Kafka, a high-throughput distributed messaging system, relies on ZooKeeper for coordination tasks (pre-KRaft). Together, they form a robust foundation for building fault-tolerant and scalable microservice architectures.

This guide explores how Kafka and ZooKeeper work together, how ZooKeeper facilitates service state coordination, and how you can manage offsets and metadata effectively. Lastly, it discusses best practices for designing microservices that consume Kafka while coordinating via ZooKeeper.

Table of Contents

  1. Kafka’s Dependence on ZooKeeper (Pre-KRaft)
  2. Service State Coordination via ZooKeeper
  3. Managing Offsets and Service Metadata
  4. Designing Microservices That Consume Kafka and Coordinate via ZooKeeper
  5. Official Documentation Links
  6. Summary

Kafka’s Dependence on ZooKeeper (Pre-KRaft)

What Is ZooKeeper’s Role in Kafka?

Before the introduction of Kafka’s KRaft (Kafka Raft) mode, ZooKeeper served as the coordination layer for Kafka. It handled metadata management and cluster coordination, allowing Kafka brokers and clients to function smoothly in distributed environments.

Key Responsibilities of ZooKeeper in Kafka:

  1. Broker Registration
    ZooKeeper keeps track of all active brokers in the Kafka cluster. It maintains broker metadata, including their IDs and addresses.
  2. Leader Election
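    Topics in Kafka are divided into partitions, and each partition has a leader broker. In ZooKeeper mode, the brokers use ZooKeeper to elect a single controller broker; the controller then chooses partition leaders and records the result in ZooKeeper, which keeps partitions available when a broker fails.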
  3. Configuration Management
    Metadata for Kafka topics, partitions, and replica assignments is stored in ZooKeeper, along with topic-level configuration overrides. Brokers pick up configuration changes by watching the relevant znodes.
  4. Consumer Group Offsets
    Older versions of Kafka stored consumer offsets (last-processed messages for each partition) in ZooKeeper. While modern Kafka stores these in internal topics, ZooKeeper-based offset storage is still relevant in legacy systems.

In KRaft mode, Kafka removes the ZooKeeper dependency by running a built-in, Raft-based metadata quorum. However, many production setups still rely on ZooKeeper for coordination.

Simple Example of ZooKeeper Handling Broker Data

Using ZooKeeper CLI, you can see active brokers in the Kafka cluster:

zkCli.sh
ls /brokers/ids

This command lists the IDs of all brokers currently registered. Each broker registers itself as an ephemeral znode under /brokers/ids, so its entry disappears when its ZooKeeper session ends.


Service State Coordination via ZooKeeper

ZooKeeper’s strength lies in reliably coordinating distributed services. It can store metadata, maintain small pieces of application state, and help services discover each other dynamically. This is essential for microservice architectures where multiple distributed components interact in real time.

Coordinating Services with ZooKeeper

ZooKeeper allows services to share their state using znodes (data nodes in ZooKeeper’s hierarchical tree). When a service registers with an ephemeral znode, ZooKeeper deletes that znode automatically once the service’s session ends, so a failed instance does not leave stale state behind.

Example Structure for Service Coordination

Consider a system where multiple microservice instances need to register their state:

/services/order-processing  
    /instance-1 (Ephemeral, Data="Healthy")  
    /instance-2 (Ephemeral, Data="Degraded")  
    /instance-3 (Ephemeral, Data="Healthy")

Each znode represents a service instance. Other services can watch these nodes for updates or failures.
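
As a rough illustration of how a dependent service might act on this structure, the sketch below filters a snapshot of child znodes down to the instances that report "Healthy". It assumes the snapshot has already been read from ZooKeeper into a map of child name to status string; the class name InstanceSelector and the method healthyInstances are hypothetical and not part of ZooKeeper or Curator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: choose routable instances from a snapshot of child znodes and their data. */
final class InstanceSelector {

    private InstanceSelector() {}

    /**
     * @param snapshot child znode name mapped to the status string read from that znode
     *                 (assumed values such as "Healthy" or "Degraded")
     * @return names of instances whose status is "Healthy", sorted by name
     */
    static List<String> healthyInstances(Map<String, String> snapshot) {
        List<String> healthy = new ArrayList<>();
        // Copying into a TreeMap gives a deterministic, name-sorted iteration order.
        for (Map.Entry<String, String> entry : new TreeMap<>(snapshot).entrySet()) {
            if ("Healthy".equals(entry.getValue())) {
                healthy.add(entry.getKey());
            }
        }
        return healthy;
    }
}
```

With the three instances shown above, the method would return instance-1 and instance-3 and leave out the degraded instance-2.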

Watching for Node Changes

ZooKeeper watchers help monitor znode changes, enabling real-time updates.

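// Note: Curator 5.x deprecates PathChildrenCache in favor of CuratorCache.
PathChildrenCache cache = new PathChildrenCache(client, "/services/order-processing", true);
cache.getListenable().addListener((curator, event) -> {
    // Connection-state events carry no child data, so guard against null.
    String path = event.getData() != null ? event.getData().getPath() : "(none)";
    System.out.println("Event Type: " + event.getType() + ". Changed Path: " + path);
});
cache.start(); // throws Exception; call from a method that declares or handles it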

Use Case

For example, if a payment service depends on the state of the “order-processing” service, it can dynamically reroute requests based on the znode status.


Managing Offsets and Service Metadata

Kafka consumer offsets record which messages a consumer has already processed. Modern Kafka stores them in the internal __consumer_offsets topic, but ZooKeeper-based offset tracking still appears in legacy systems and in some custom setups.

Tracking Offsets in ZooKeeper

Offsets are typically tracked in paths like:

/consumers/{group-id}/offsets/{topic}/{partition}

For instance:

/consumers/order-group/offsets/orders-topic/0 = "12345"
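
To make the path layout concrete, here is a minimal sketch that builds and parses paths of this shape. The class name OffsetPath and its methods are hypothetical helpers, not part of Kafka or ZooKeeper, and the sketch assumes group and topic names contain no "/" characters.

```java
/** Sketch: build and parse legacy offset paths of the form /consumers/{group}/offsets/{topic}/{partition}. */
final class OffsetPath {
    final String group;
    final String topic;
    final int partition;

    OffsetPath(String group, String topic, int partition) {
        this.group = group;
        this.topic = topic;
        this.partition = partition;
    }

    /** Renders the znode path for this group, topic, and partition. */
    String toPath() {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    /** Parses a path back into its components, rejecting anything with a different shape. */
    static OffsetPath parse(String path) {
        String[] parts = path.split("/");
        // Expected segments: ["", "consumers", group, "offsets", topic, partition]
        if (parts.length != 6
                || !parts[0].isEmpty()
                || !"consumers".equals(parts[1])
                || !"offsets".equals(parts[3])) {
            throw new IllegalArgumentException("Not an offset path: " + path);
        }
        return new OffsetPath(parts[2], parts[4], Integer.parseInt(parts[5]));
    }
}
```

Parsing the example path above yields group order-group, topic orders-topic, and partition 0, and toPath() renders the same string again.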

Managing Service Metadata

ZooKeeper can also store metadata about Kafka clients, such as additional identifiers, versioning, or feature toggles. This metadata allows for advanced routing or state management based on each service’s context.

Example Code for Reading Offset Data

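// Requires: import java.nio.charset.StandardCharsets;
// Reads the stored offset for one partition; assumes `client` is a started CuratorFramework.
public long getOffset(String consumerGroup, String topic, int partition) throws Exception {
    String path = String.format("/consumers/%s/offsets/%s/%d", consumerGroup, topic, partition);
    byte[] data = client.getData().forPath(path); // throws NoNodeException if no offset was stored
    return Long.parseLong(new String(data, StandardCharsets.UTF_8).trim());
}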

Automating Offset Updates

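Writing a new offset is a single setData call on the partition’s znode. A watcher on the same znode can then trigger follow-up actions if the stored value jumps unexpectedly or stops advancing:

client.setData().forPath("/consumers/order-group/offsets/orders-topic/0", "12346".getBytes(StandardCharsets.UTF_8));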

Offset and metadata management is critical in building scalable Kafka consumers that avoid duplicating or losing data.
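
One way to decide whether an offset has skipped is to compare each incoming offset with the last one stored. The sketch below is a hypothetical helper (OffsetGapDetector is not a Kafka or ZooKeeper class) and assumes the stored value is the offset of the last processed record.

```java
/** Sketch: classify an incoming offset relative to the last offset stored for a partition. */
final class OffsetGapDetector {

    enum Status { IN_ORDER, DUPLICATE, GAP }

    private OffsetGapDetector() {}

    /**
     * @param lastStored offset of the last processed record (assumption of this sketch)
     * @param incoming   offset of the record about to be processed
     */
    static Status classify(long lastStored, long incoming) {
        if (incoming <= lastStored) {
            return Status.DUPLICATE; // already processed; likely a redelivery
        }
        if (incoming == lastStored + 1) {
            return Status.IN_ORDER;  // exactly the next record
        }
        return Status.GAP;           // one or more offsets were skipped
    }
}
```

A GAP is not always an error: compacted topics and transactional control records both leave legitimate holes in the offset sequence, so a real consumer would probably log or alert on gaps rather than fail outright.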


Designing Microservices That Consume Kafka and Coordinate via ZooKeeper

Key Design Principles for Combining Kafka and ZooKeeper:

  1. Efficient Topic Consumption: Design microservices to consume Kafka topics dynamically. Ensure they register and deregister their consumption state in ZooKeeper.
  2. Client State Management: Use ZooKeeper for storing custom states or service health information. This ensures other services can make informed routing decisions.
  3. Load Balancing: Combine Kafka’s partitioning logic with ZooKeeper’s service coordination to distribute partition consumption evenly across instances of a consumer.
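
The load-balancing principle can be sketched as a small planning function. Note that when consumers call subscribe(), Kafka’s own group coordinator already assigns partitions; the sketch below only applies if you assign partitions manually (for example with KafkaConsumer.assign) based on the instance list registered in ZooKeeper. PartitionPlanner and its assign method are hypothetical names, and the round-robin policy is one possible choice.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: spread partitions round-robin across the registered consumer instances. */
final class PartitionPlanner {

    private PartitionPlanner() {}

    /**
     * @param instances      instance names, e.g. the children of a service znode
     * @param partitionCount number of partitions in the topic
     * @return instance name mapped to the partitions it should consume
     */
    static Map<String, List<Integer>> assign(Collection<String> instances, int partitionCount) {
        // Sort so every instance computes the same plan from the same membership list.
        List<String> sorted = new ArrayList<>(instances);
        Collections.sort(sorted);

        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        for (String instance : sorted) {
            plan.put(instance, new ArrayList<>());
        }
        if (sorted.isEmpty()) {
            return plan; // nobody to assign to
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = sorted.get(partition % sorted.size());
            plan.get(owner).add(partition);
        }
        return plan;
    }
}
```

Because the plan depends only on the sorted membership list and the partition count, each instance can compute it locally after a membership change and arrive at the same result without extra coordination.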

Example Architecture

Here’s how you can design a microservice that consumes messages from Kafka while using ZooKeeper for coordination:

  1. The service is registered with ZooKeeper under /services/kafka-consumers/order-processing.
  2. It consumes messages from the order-events topic based on the current partition assignment.
  3. It uses ZooKeeper to store progress, such as offsets or custom metadata, under its znode.

Code Example

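import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Collections;
import org.apache.curator.framework.CuratorFramework;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.springframework.stereotype.Service;

@Service
public class KafkaConsumerService {

    private final CuratorFramework client;

    public KafkaConsumerService(CuratorFramework client) {
        this.client = client;
    }

    // getKafkaProps() and processMessage() are application-specific and not shown here.
    public void consume(String topic) throws Exception {
        // try-with-resources closes the consumer if the loop exits with an exception.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(getKafkaProps())) {
            consumer.subscribe(Collections.singletonList(topic));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    processMessage(record);
                    updateOffset(record.topic(), record.partition(), record.offset());
                }
            }
        }
    }

    private void updateOffset(String topic, int partition, long offset) throws Exception {
        String path = String.format("/consumers/my-consumer-group/offsets/%s/%d", topic, partition);
        byte[] data = Long.toString(offset).getBytes(StandardCharsets.UTF_8);
        // setData fails with NoNodeException on a missing znode, so create it on first use.
        if (client.checkExists().forPath(path) == null) {
            client.create().creatingParentsIfNeeded().forPath(path, data);
        } else {
            client.setData().forPath(path, data);
        }
    }
}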

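By combining the capabilities of Kafka and ZooKeeper, your microservices can consume at scale while sharing state with the rest of the system. Keep in mind that writing to ZooKeeper on every record adds a network round trip per message; updating the stored offset once per polled batch is usually a better fit for high-throughput topics.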


Official Documentation Links

  1. Apache Kafka Documentation: Kafka Docs
  2. Apache ZooKeeper Documentation: ZooKeeper Docs

Explore these resources for detailed technical insights and advanced use cases.


Summary

Combining Kafka and ZooKeeper in a microservice architecture brings robust coordination, dynamic scalability, and operational resilience:

  1. Kafka’s Dependence on ZooKeeper: In pre-KRaft clusters, ZooKeeper handles controller election, broker registration, and metadata management.
  2. Service State Coordination: Leverage ZooKeeper to monitor and coordinate service states dynamically.
  3. Offset Tracking: Store and manage consumer offsets in ZooKeeper where legacy or custom setups require it.
  4. Microservice Design: Register services, distribute load, and leverage both Kafka and ZooKeeper for advanced state management.

By implementing these strategies, you can design scalable and adaptive microservice architectures, ready for the challenges of modern distributed systems. Start combining Kafka and ZooKeeper to build resilient and efficient workflows today!
