
Distributed Tracing with Spring Sleuth, Zipkin, and ELK Stack

Tracking how a single user request flows through a distributed microservices architecture is no easy feat. Oftentimes, identifying bottlenecks or debugging errors becomes a challenge without proper tools for distributed tracing. Enter Spring Cloud Sleuth, Zipkin, and the ELK Stack (Elasticsearch, Logstash, Kibana), three powerful tools that, when combined, provide clarity into the lifecycle of requests across microservices.

This guide will walk you through setting up distributed tracing with Spring Sleuth, Zipkin, and ELK Stack, focusing on how to propagate traceId and spanId, log them efficiently, search trace details in Kibana, and visualize request flows in Zipkin.

Table of Contents

  1. What is Distributed Tracing and Why Does It Matter?
  2. Using Spring Cloud Sleuth for TraceId Propagation
  3. Logging TraceId and SpanId in Spring Boot Logs
  4. Searching Logs in Kibana by TraceId
  5. Visualizing the Trace Flow of a Request in Zipkin
  6. Summary

What is Distributed Tracing and Why Does It Matter?

Distributed tracing enables tracking the path of a request as it travels through different microservices. It provides critical insights into how each service contributes to the overall response time, where failures occur, and which services may be bottlenecks.

Key Components in Distributed Tracing:

  1. TraceId: A unique identifier shared across all services in the trace. Used to group all related logs for a single request.
  2. SpanId: A unique identifier for a single unit of work within a trace, such as one service handling the request or one outbound call. Every span carries the traceId of the trace it belongs to.
  3. Zipkin: A distributed tracing system used to visualize trace flows.
  4. Spring Cloud Sleuth: Automatically propagates tracing information (traceId and spanId) across microservices.
  5. ELK Stack: Centralizes logs in Elasticsearch for querying and analysis, with Kibana providing visualization.

Together, these tools turn an opaque system into something you can debug, monitor, and improve with clarity.
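
To make the relationship between traceId and spanId concrete, here is a minimal sketch in plain Java. The class name TraceContextSketch and its methods are hypothetical and exist only for illustration; they are not Sleuth's own types. The sketch assumes 64-bit IDs rendered as 16 hex characters.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative model of a trace context; not part of Sleuth or Zipkin. */
final class TraceContextSketch {
    final String traceId;      // shared by every span in the same request
    final String spanId;       // unique to this unit of work
    final String parentSpanId; // null for the root span

    private TraceContextSketch(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    /** Starts a new trace: fresh traceId, fresh spanId, no parent. */
    static TraceContextSketch newRoot() {
        return new TraceContextSketch(randomHex64(), randomHex64(), null);
    }

    /** Starts a child span: same traceId, new spanId, parent set to this span. */
    TraceContextSketch newChild() {
        return new TraceContextSketch(traceId, randomHex64(), spanId);
    }

    /** Random 64-bit value rendered as 16 lowercase hex characters. */
    private static String randomHex64() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }
}
```

Calling TraceContextSketch.newRoot() and then newChild() on the result yields two contexts with the same traceId, different spanIds, and the child's parentSpanId pointing at the root span.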


Using Spring Cloud Sleuth for TraceId Propagation

Spring Cloud Sleuth automatically generates traceId and spanId, ensuring consistent propagation of tracing information across microservices.

Step 1. Add Spring Sleuth to Your Project

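To enable Sleuth in a Spring Boot application, add the following dependency (the version is managed by the Spring Cloud BOM). Note that Sleuth targets Spring Boot 2.x; on Spring Boot 3 its functionality has moved to Micrometer Tracing.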

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

Key Features of Sleuth:

  • Assigns a traceId for an entire request flow.
  • Adds a spanId for individual service operations within the trace.
  • Propagates tracing data through HTTP headers: B3 headers such as X-B3-TraceId by default, or the W3C traceparent header when W3C propagation is configured.

Example Configuration in application.properties:

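spring.sleuth.enabled=true
spring.sleuth.sampler.probability=1.0

  • spring.sleuth.enabled=true is the default and is shown here only for clarity.
  • sampler.probability=1.0 samples every request, so every trace is exported to Zipkin. That is convenient during development but usually too costly in production. Trace IDs still appear in logs regardless of the sampling decision.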
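
To show what a sampling probability means in practice, here is a rough sketch of a probability-based sampler in plain Java. It is not Sleuth's actual sampler; the class ProbabilitySamplerSketch and the choice of 10,000 buckets are assumptions made for illustration.

```java
/** Sketch of probability-based sampling; not Sleuth's own implementation. */
final class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000L;
    private final long boundary; // how many of the 10,000 buckets are sampled

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.boundary = Math.round(probability * BUCKETS);
    }

    /** Decides from the trace ID alone, so the same trace always gets the same answer. */
    boolean isSampled(long traceId) {
        long bucket = Math.floorMod(traceId, BUCKETS); // 0..9999, also for negative IDs
        return bucket < boundary;
    }
}
```

With a probability of 1.0 every bucket falls below the boundary, so every trace is sampled; with 0.1 roughly one trace in ten is kept.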

Step 2. Trace Propagation in Multi-Service Requests

Spring Sleuth adds trace information automatically to outgoing HTTP requests. For example:

  • Service A makes a REST call to Service B:
    • Service A generates a traceId and spanId.
    • Headers like X-B3-TraceId and X-B3-SpanId are added to the HTTP request.
    • Service B logs and propagates the same traceId.

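No additional code is required in most cases, provided the RestTemplate is a Spring-managed bean. Sleuth instruments RestTemplate beans; an instance created with new RestTemplate() inside a method is not instrumented and will not carry the trace headers.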
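
To make the header flow between Service A and Service B concrete, the following plain-Java sketch mimics what happens on each side of the call. Sleuth does the equivalent for you; the class B3PropagationSketch and its method names are hypothetical, and only the two header names come from the steps above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

/** Illustrative B3 header handling; Sleuth performs the equivalent automatically. */
final class B3PropagationSketch {
    static final String TRACE_ID_HEADER = "X-B3-TraceId";
    static final String SPAN_ID_HEADER = "X-B3-SpanId";

    /** Caller side (Service A): write the current IDs into the outgoing request headers. */
    static Map<String, String> inject(String traceId, String spanId) {
        // HTTP header names are case-insensitive, so use a case-insensitive map.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.put(TRACE_ID_HEADER, traceId);
        headers.put(SPAN_ID_HEADER, spanId);
        return headers;
    }

    /** Callee side (Service B): reuse the incoming traceId, or start a new trace if none was sent. */
    static String extractTraceId(Map<String, String> headers, Supplier<String> newTraceId) {
        String incoming = headers.get(TRACE_ID_HEADER);
        return (incoming != null && !incoming.isEmpty()) ? incoming : newTraceId.get();
    }
}
```

The supplier is only invoked when no trace header arrives, which corresponds to Service B being the first service to see the request.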

Using WebClient for Tracing:

If you’re using WebClient:

@Bean
public WebClient.Builder webClientBuilder() {
    return WebClient.builder();
}

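Spring Sleuth instruments WebClient.Builder and WebClient beans, so clients built from this injected builder propagate trace and span IDs. Clients created directly with WebClient.create() bypass that instrumentation.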

By configuring Sleuth, we ensure consistent tracing data flows across services.


Logging TraceId and SpanId in Spring Boot Logs

To make tracing data actionable, include traceId and spanId in your logs.

Step 1. Use Spring Sleuth’s MDC Support

Spring Sleuth populates Mapped Diagnostic Context (MDC) with tracing properties.

Customize Logback Configuration:

Update logback-spring.xml to include traceId and spanId:

<configuration>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - [%X{traceId}] [%X{spanId}] %msg%n</pattern>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>

Example Log Output:

2025-06-13 10:30:15 [http-nio-8080-exec-1] INFO  com.example.PaymentService - [123abc456def789] [456def] Payment processed successfully
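
If MDC is unfamiliar, the sketch below shows the idea in plain Java: a per-thread map that the log formatter reads when it renders a line. MdcSketch is a hypothetical stand-in written for this article, not the SLF4J or Logback MDC class, and it reproduces only the trace-related part of the pattern above.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal stand-in for a logging MDC: a per-thread map of diagnostic values. */
final class MdcSketch {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    /** Stores a value for the current thread only. */
    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    /** Removes all values for the current thread, e.g. when a request finishes. */
    static void clear() {
        CONTEXT.get().clear();
    }

    /** Mirrors the "[%X{traceId}] [%X{spanId}] %msg" part of the Logback pattern. */
    static String format(String message) {
        Map<String, String> ctx = CONTEXT.get();
        // As with %X{key} in Logback, a missing key renders as an empty string.
        return "[" + ctx.getOrDefault("traceId", "") + "] ["
                + ctx.getOrDefault("spanId", "") + "] " + message;
    }
}
```

Because the map is thread-local, two requests handled on different threads never see each other's IDs, which is why the values must be set at the start of a request and cleared at the end.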

Step 2. Log Correlation Across Services

By logging the same traceId across services, you can correlate logs from Service A, Service B, and Service C to reconstruct the entire flow of a request.
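
As a rough illustration of that correlation step, the plain-Java sketch below groups raw log lines by traceId. It assumes lines follow the Logback pattern shown earlier ("... - [traceId] [spanId] message"); the class LogCorrelationSketch is hypothetical. In practice Elasticsearch and Kibana do this work for you.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Groups log lines from several services by their traceId. */
final class LogCorrelationSketch {
    // Matches "<prefix> - [traceId] [spanId] <message>".
    private static final Pattern LINE =
            Pattern.compile("^(.*?) - \\[([^\\]]*)\\] \\[([^\\]]*)\\] (.*)$");

    /** Returns traceId -> lines, in input order; lines without a traceId are skipped. */
    static Map<String, List<String>> groupByTraceId(List<String> lines) {
        Map<String, List<String>> byTrace = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches() && !m.group(2).isEmpty()) {
                byTrace.computeIfAbsent(m.group(2), k -> new ArrayList<>()).add(line);
            }
        }
        return byTrace;
    }
}
```

Feeding in lines collected from Service A, Service B, and Service C returns one entry per request, each holding that request's lines in the order they were read.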

Additional Features:

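  • Use Sleuth’s Tracer or SpanCustomizer to add custom key-value tags to the current span. Tags are attached to the span and show up in Zipkin rather than in the log line itself.
  • Example: @Autowired private Tracer tracer; Span span = tracer.currentSpan(); if (span != null) { span.tag("orderId", "12345"); } (currentSpan() returns null when no trace is active, so check before tagging.)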

This level of detail proves invaluable during debugging.


Searching Logs in Kibana by TraceId

Kibana, the visualization component of the ELK stack, makes searching and analyzing trace-specific logs seamless.

Step 1. Ensure Logs Flow to Elasticsearch

Using Logstash or Filebeat, ensure your Spring Boot logs stream into Elasticsearch with traceId and spanId included.

Example Logstash Filter:

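Parse JSON-formatted log lines so that traceId and spanId become searchable fields. This assumes the application writes its logs as JSON; plain-text lines such as those produced by the Logback pattern above need a grok filter instead:

filter {
  # Expand the JSON document in "message" into top-level event fields
  json {
    source => "message"
  }
}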

Step 2. Search Logs by TraceId in Kibana

  1. Open Kibana Discover.
  2. Use a query to filter logs by traceId: traceId:"123abc456def789"
  3. View logs from all microservices associated with the request.
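
The same lookup can be sketched outside Kibana as an Elasticsearch search body. The helper below is a hypothetical illustration in plain Java: it only builds the JSON string and assumes the log documents have a traceId field and the @timestamp field that Logstash normally adds.

```java
/** Builds an Elasticsearch search body for one trace; field names are assumptions. */
final class TraceQuerySketch {

    /** Matches documents with the given traceId, oldest entries first. */
    static String searchBody(String traceId) {
        return "{\"query\":{\"match\":{\"traceId\":\"" + escape(traceId) + "\"}},"
                + "\"sort\":[{\"@timestamp\":\"asc\"}]}";
    }

    /** Escapes backslashes and double quotes so the value stays a valid JSON string. */
    private static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Sorting by timestamp returns the entries from all services in the order they were written, which is usually the order you want when reconstructing a request.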

Step 3. Save Search Queries

Save frequently used queries (e.g., service-specific error logs) for faster debugging.

With Kibana’s search capabilities, you can drill down into logs for specific traces, even in complex architectures.


Visualizing the Trace Flow of a Request in Zipkin

While Kibana helps analyze logs, Zipkin provides an intuitive visual representation of a request’s flow.

Step 1. Add Zipkin to the Setup

Include the dependency in your Spring Boot application:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>

Step 2. Configure Zipkin in application.properties:

Point your application to the Zipkin server:

spring.zipkin.base-url=http://localhost:9411

Step 3. Run the Zipkin Server

Use Docker to start Zipkin:

docker run -d -p 9411:9411 openzipkin/zipkin

Step 4. View Trace Flow

  1. Access the Zipkin UI at http://localhost:9411.
  2. Search for traces by traceId.
  3. Visualize request flows, including service dependencies, spans, and latencies.

Example Visualization:

  • Service A → Service B → Service C
  • See where latencies originate or which service failed within the trace.
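
To illustrate how a trace view helps locate latency, here is a small plain-Java sketch that renders spans as an indented tree and computes each span's own time. The Span fields, the class TraceTreeSketch, and the sample durations used in the note below are made up for illustration and are not Zipkin's data model. The self-time calculation assumes child spans run one after another rather than in parallel.

```java
import java.util.List;

/** Renders spans as a tree and estimates where time is spent. */
final class TraceTreeSketch {

    /** One span in a trace; a simplified, illustrative shape. */
    static final class Span {
        final String id;
        final String parentId; // null for the root span
        final String service;
        final long durationMs;

        Span(String id, String parentId, String service, long durationMs) {
            this.id = id;
            this.parentId = parentId;
            this.service = service;
            this.durationMs = durationMs;
        }
    }

    /** Prints each root span followed by its descendants, indented by depth. */
    static String render(List<Span> spans) {
        StringBuilder out = new StringBuilder();
        for (Span span : spans) {
            if (span.parentId == null) {
                append(out, span, spans, 0);
            }
        }
        return out.toString();
    }

    private static void append(StringBuilder out, Span span, List<Span> all, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(span.service).append(" (").append(span.durationMs).append(" ms)\n");
        for (Span child : all) {
            if (span.id.equals(child.parentId)) {
                append(out, child, all, depth + 1);
            }
        }
    }

    /** Span duration minus the durations of its direct children (assumed sequential). */
    static long selfTimeMs(Span span, List<Span> all) {
        long childTotal = 0;
        for (Span child : all) {
            if (span.id.equals(child.parentId)) {
                childTotal += child.durationMs;
            }
        }
        return span.durationMs - childTotal;
    }
}
```

For a chain where Service A takes 120 ms in total, Service B 90 ms, and Service C 60 ms, the self times are 30 ms, 30 ms, and 60 ms, which points at Service C as the main contributor.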

By integrating Zipkin, you gain a deeper understanding of system bottlenecks and performance issues.


Summary

Distributed tracing with Spring Sleuth, Zipkin, and ELK gives you end-to-end visibility into how requests move through a microservices architecture. Here’s what we covered:

  1. Spring Cloud Sleuth: Provides seamless propagation of traceId and spanId.
  2. Logback Customization: Logs trace-related data for easier correlation in Kibana.
  3. Kibana Search: Enables querying logs by traceId to reconstruct request flows.
  4. Zipkin Visualization: Offers a graphical representation of trace flows for better debugging.

Implement these tools in your Spring Boot microservices today to take observability to the next level. With these insights, debugging becomes faster, bottlenecks clearer, and your systems more resilient.
