01 | The Problem
Static cloud infrastructure forces a choice between high cost (over-provisioning) and high latency (under-provisioning). This project implements Infrastructure Elasticity, allowing a cluster to grow and shrink its compute resources based on live data pressure.
02 | System Architecture
Executors W2-W4 are scaled up when a 1s backlog is detected on the 'cs_student_logs' Kafka topic.
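One way the scale-up rule above could be expressed is through Spark's dynamic allocation settings. The sketch below is an assumption, not the project's confirmed configuration: it maps the 1s backlog to `spark.dynamicAllocation.schedulerBacklogTimeout` (which fires when tasks have been pending for that long), and the executor bounds and idle timeout are hypothetical values chosen to match one always-on executor plus W2-W4. The helper `to_spark_submit_args` is an illustrative name.

```python
# Hypothetical dynamic allocation settings; the bounds are assumptions
# (one always-on executor, up to four in total to cover W2-W4).
DYNAMIC_ALLOCATION_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    # Lets executors be released without an external shuffle service.
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "4",
    # Request more executors once tasks have been pending for 1 second.
    "spark.dynamicAllocation.schedulerBacklogTimeout": "1s",
    # Release executors that have been idle this long (assumed value).
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_spark_submit_args(conf):
    """Render a config dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    print(" ".join(to_spark_submit_args(DYNAMIC_ALLOCATION_CONF)))
```

The printed arguments can be appended to a `spark-submit` command; the same keys could instead be set on a `SparkSession` builder.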
03 | Technology Stack
| Category | Technology | Role |
|---|---|---|
| Ingestion | Apache Kafka | High-throughput message buffer |
| Processing | Spark 3.5.0 | 10s Tumbling Window Aggregation |
| Elasticity | Dynamic Allocation | Real-time Executor scaling |
| Environment | Docker, Linux | Containerized cluster management |
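
To illustrate what the 10s tumbling window in the table computes, here is a minimal plain-Python sketch rather than the project's Spark job. It assumes events carry integer epoch-second timestamps and that the aggregation is a simple count; both are assumptions, as are the function names. In Spark this bucketing is typically expressed with the `window` function from `pyspark.sql.functions`.

```python
from collections import Counter

WINDOW_SECONDS = 10  # matches the 10s tumbling window in the table


def window_start(event_ts, window_seconds=WINDOW_SECONDS):
    """Return the start of the non-overlapping window containing event_ts."""
    return event_ts - (event_ts % window_seconds)


def tumbling_counts(event_timestamps, window_seconds=WINDOW_SECONDS):
    """Count events per tumbling window, keyed by window start time."""
    counts = Counter(window_start(ts, window_seconds) for ts in event_timestamps)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [0, 3, 9, 10, 14, 27]  # hypothetical event times in seconds
    print(tumbling_counts(sample))  # prints {0: 3, 10: 2, 20: 1}
```

Each event falls into exactly one window, which is what distinguishes a tumbling window from a sliding one.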
04 | Performance Impact
Testing confirmed that compute usage fell from 96 instance-hours/day under a static cluster strategy to 36 instance-hours/day with elastic scaling (a 62.5% reduction), while maintaining sub-second latency.
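The saving implied by the figures above can be checked with a short calculation. This sketch uses only the two reported numbers (96 and 36 instance-hours/day); the function name is illustrative.

```python
def instance_hour_savings(static_hours, elastic_hours):
    """Return (hours saved per day, percentage reduction) versus the static baseline."""
    saved = static_hours - elastic_hours
    return saved, 100.0 * saved / static_hours


if __name__ == "__main__":
    saved, pct = instance_hour_savings(96, 36)
    # prints: 60 instance-hours/day saved (62.5% reduction)
    print(f"{saved} instance-hours/day saved ({pct:.1f}% reduction)")
```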