DISTRIBUTED SYSTEMS

Metrics Logging & Aggregation System

A production-grade distributed pipeline built on Apache Kafka, Apache Flink, and Spring Boot. Ingests structured (JSON/Protobuf/Avro) and unstructured (raw logs) data at high volume, processes streams in real time, and lands analytical output as Parquet files in object storage.

6
Maven Modules
4
Kafka Topics
4
Flink Pipelines
5
Storage Systems
Java 17 - Language
Spring Boot 3.2.5 - Framework
Apache Kafka (Confluent Platform 7.5.0) - Message Broker
Apache Flink 1.18.1 - Stream Processor
Apache Spark 3.5.1 - Batch Processor
TimescaleDB pg16 - Time-Series DB
PostgreSQL 16 - Operational DB
MinIO (S3) - Object Storage
Hadoop HDFS 3.2.1 - Distributed FS
Debezium 2.5 - CDC Connector
Protobuf 3.25.1 - Binary Encoding
Apache Avro 1.11.3 - Schema Format
Apache Parquet 1.13.1 - Columnar Storage
Confluent SR 7.5.0 - Schema Registry
Caffeine 3.1.8 - Caching
Flyway 10 - DB Migrations
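
To make the "process streams in real time" step of the overview concrete, here is a minimal, dependency-free Java 17 sketch of a tumbling-window aggregation, the kind of per-metric rollup a stream job might compute before landing rows as Parquet. It does not use the Flink or Kafka APIs, and it is not the project's actual pipeline code: the `MetricEvent` and `WindowAggregate` shapes, the field names, and the 60-second window are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WindowedMetricSketch {

    /** Hypothetical event shape; the real schema is not given in this overview. */
    record MetricEvent(String name, long timestampMillis, double value) {}

    /** One aggregated row per metric and window (hypothetical output shape). */
    record WindowAggregate(String name, long windowStartMillis, long count, double sum) {
        double mean() {
            return sum / count;
        }
    }

    /**
     * Groups events by metric name and by fixed-size (tumbling) window,
     * then emits one count/sum aggregate per group.
     */
    static List<WindowAggregate> aggregate(List<MetricEvent> events, long windowMillis) {
        // TreeMaps keep the output ordered by metric name, then by window start.
        Map<String, Map<Long, double[]>> acc = new TreeMap<>();
        for (MetricEvent e : events) {
            // Align the event timestamp down to the start of its window.
            long windowStart = Math.floorDiv(e.timestampMillis(), windowMillis) * windowMillis;
            double[] cell = acc
                    .computeIfAbsent(e.name(), k -> new TreeMap<>())
                    .computeIfAbsent(windowStart, k -> new double[2]);
            cell[0] += 1;          // running count
            cell[1] += e.value();  // running sum
        }
        List<WindowAggregate> out = new ArrayList<>();
        acc.forEach((name, windows) -> windows.forEach((start, cell) ->
                out.add(new WindowAggregate(name, start, (long) cell[0], cell[1]))));
        return out;
    }

    public static void main(String[] args) {
        List<MetricEvent> events = List.of(
                new MetricEvent("cpu", 1_000, 0.5),
                new MetricEvent("cpu", 59_000, 1.5),
                new MetricEvent("cpu", 61_000, 2.0),
                new MetricEvent("mem", 2_000, 4.0));
        for (WindowAggregate a : aggregate(events, 60_000)) {
            System.out.println(a.name() + " window=" + a.windowStartMillis()
                    + " count=" + a.count() + " mean=" + a.mean());
        }
    }
}
```

In a real deployment the grouping and window bookkeeping would presumably be handled by the stream processor's keyed windowing rather than an in-memory map; the sketch only shows the arithmetic: events are bucketed by aligning their timestamp down to a window boundary, and each bucket carries a count and a sum from which a mean can be derived.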