Observability Benchmarking

A comprehensive framework for benchmarking containerized REST services under the Grafana LGTM observability stack

Project Overview

This project provides a production-ready Docker Compose environment for benchmarking REST service implementations while collecting comprehensive telemetry data including logs, metrics, traces, and CPU profiles.

High Performance

Benchmark results of up to 104,000 RPS on CPU-limited containers (4 vCPUs)

Full Observability

Complete LGTM stack (Loki, Grafana, Tempo, Mimir) plus Pyroscope for continuous profiling

Multiple Frameworks

Spring Boot, Quarkus, and Go implementations, with the Java services built for both the JVM and native images

Thread Models

Compare platform threads, virtual threads, and reactive programming
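
As a rough sketch of the first two models (the reactive model appears in the Quarkus example under Key Insights below), assuming Java 21 for virtual threads; the pool size and handler are illustrative and not taken from this project:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadModels {

    public static void main(String[] args) {
        // Platform threads: a bounded pool of OS threads (pool size is illustrative).
        try (ExecutorService platform = Executors.newFixedThreadPool(200)) {
            platform.submit(ThreadModels::handleRequest);
        }

        // Virtual threads (Java 21+): one cheap virtual thread per task, so a
        // blocking handler no longer occupies a scarce OS thread while it waits.
        try (ExecutorService virtualPerTask = Executors.newVirtualThreadPerTaskExecutor()) {
            virtualPerTask.submit(ThreadModels::handleRequest);
        }
    }

    // Stand-in for a blocking request handler (e.g. a JDBC call or downstream HTTP request).
    static void handleRequest() {
        try {
            Thread.sleep(5); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```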

Reproducible

Deterministic, fixed-rate load generation with wrk2 in a fully containerized environment

Containerized

Complete Docker Compose orchestration for all services and tools

System Architecture

A modern, cloud-native architecture demonstrating industry best practices in observability and performance engineering.

Load Generation Layer

wrk2 Load Generator

Service Layer

Spring Boot Services
Quarkus Services
Go Services

Collection Layer

OpenTelemetry
Grafana Alloy
Pyroscope Agent

Storage & Analysis Layer

Loki (Logs)
Tempo (Traces)
Mimir (Metrics)
Pyroscope (Profiles)

Visualization Layer

Grafana Dashboards

Key Design Decisions

  • OpenTelemetry Integration: Standardized instrumentation across all services using OTLP over gRPC (see the sketch after this list)
  • Batched Telemetry: Optimized batch processing to minimize overhead on services under test
  • Resource Isolation: CPU-limited containers ensure fair comparison across implementations
  • Multi-dimensional Profiling: Combined eBPF and agent-based profiling for comprehensive insights
  • Deterministic Testing: wrk2 provides fixed-rate load generation for reproducible results
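
As a rough illustration of the first two decisions, a hand-wired sketch using the OpenTelemetry Java SDK; in practice the services may rely on auto-instrumentation, and the endpoint, service name, and batch settings here are assumptions rather than the project's actual configuration:

```java
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;

import java.time.Duration;

public final class Telemetry {

    public static OpenTelemetrySdk init() {
        // Export spans over OTLP/gRPC to the local collector (endpoint is an assumption).
        OtlpGrpcSpanExporter exporter = OtlpGrpcSpanExporter.builder()
                .setEndpoint("http://alloy:4317")
                .build();

        // Batch spans in-process before export so the service under test pays
        // one network call per batch instead of per request (sizes are illustrative).
        BatchSpanProcessor batching = BatchSpanProcessor.builder(exporter)
                .setMaxQueueSize(2048)
                .setMaxExportBatchSize(512)
                .setScheduleDelay(Duration.ofMillis(500))
                .build();

        Resource resource = Resource.getDefault().merge(Resource.create(
                Attributes.of(AttributeKey.stringKey("service.name"), "rest-benchmark")));

        return OpenTelemetrySdk.builder()
                .setTracerProvider(SdkTracerProvider.builder()
                        .setResource(resource)
                        .addSpanProcessor(batching)
                        .build())
                .build();
    }

    private Telemetry() {
    }
}
```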

Benchmark Results

Performance comparison of different frameworks and concurrency models on CPU-limited containers (4 vCPUs)

Implementation    Mode        RPS (22/01/2026)
Quarkus JVM       Reactive    104,000
Quarkus JVM       Virtual      90,000
Quarkus JVM       Platform     70,000
Quarkus Native    Virtual      54,000
Go                N/A          52,000
Quarkus Native    Reactive     51,000
Quarkus Native    Platform     45,000
Spring JVM        Platform     32,000
Spring JVM        Virtual      29,000
Spring JVM        Reactive     22,000
Spring Native     Virtual      20,000
Spring Native     Platform     20,000
Spring Native     Reactive     16,000

See Benchmarking Methodology for reproducibility details and interpretation.

Key Insights

Reactive Advantage

The Quarkus reactive implementation delivers the highest measured throughput under fixed-rate load

Virtual Threads

Java virtual threads provide strong throughput with a simpler concurrency model
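
A minimal Quarkus sketch of how the reactive and virtual-thread variants above typically differ at the endpoint level, assuming Quarkus REST (RESTEasy Reactive); the resource, paths, and handler are hypothetical and not taken from this project:

```java
import io.smallrye.common.annotation.RunOnVirtualThread;
import io.smallrye.mutiny.Uni;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;

@Path("/hello")
public class HelloResource {

    // Virtual threads: blocking-style code, but each request runs on its own
    // cheap virtual thread instead of occupying a worker platform thread.
    @GET
    @Path("/virtual")
    @RunOnVirtualThread
    public String virtualHello() {
        return blockingLookup();
    }

    // Reactive: the method returns a Uni and completes without blocking the event loop.
    @GET
    @Path("/reactive")
    public Uni<String> reactiveHello() {
        return Uni.createFrom().item("hello");
    }

    // Stand-in for blocking I/O (database lookup, downstream HTTP call, ...).
    private String blockingLookup() {
        return "hello";
    }
}
```

The platform-thread variant is usually the same blocking method without @RunOnVirtualThread, which Quarkus dispatches to its worker thread pool.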

Native vs JVM

Native images offer faster startup; the JVM can deliver higher peak throughput

Instrumentation Matters

All results here assume a comparable observability pipeline (OTel + LGTM + profiling)

Resources