07 / Data Engineering / Analytics Engineering / May 2026
StreamPulse
Azure lakehouse analytics platform for a subscription video streaming business with batch ingestion, medallion layers, data quality, Azure SQL serving, and Metabase dashboards.
01
Overview
StreamPulse is a batch-first Azure data engineering platform for a fictional subscription streaming business.
02
Challenge
The challenge was building a credible cloud data platform without depending on expensive always-on services.
03
Outcome
The result is a showcase-ready data engineering platform: local end-to-end runs, data quality summaries, Azure Data Factory validation, Azure SQL serving tables, compact senior analytics marts, Metabase dashboard evidence, and a static showcase dashboard generated from the same serving outputs.
Project background
Why this project exists
StreamPulse was built as a portfolio-grade Azure lakehouse for a fictional subscription video streaming business. The project is intentionally realistic: leadership needs trusted answers about growth, paid users, churn, revenue, engagement, content performance, payment quality, and trial-to-paid conversion.
The important design choice is that the platform is batch-first, not stream-first. It demonstrates source generation, landing files, Bronze/Silver/Gold layering, dimensional modeling, data quality, Azure SQL serving, and Metabase dashboard provisioning while staying cost-aware for Azure Student credits.
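To make the Bronze/Silver/Gold layering concrete, here is a minimal plain-Python sketch of one batch moving through the three layers. It is not the project's actual code: the payment fields, the batch id, the function names, and the rules (deduplicate on payment_id, reject non-numeric amounts, sum succeeded payments per user) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical landing records; field names and values are illustrative only.
LANDING_PAYMENTS = [
    {"payment_id": "p1", "user_id": "u1", "amount": "9.99", "status": "succeeded"},
    {"payment_id": "p1", "user_id": "u1", "amount": "9.99", "status": "succeeded"},  # duplicate
    {"payment_id": "p2", "user_id": "u2", "amount": "bad", "status": "succeeded"},   # unparseable
    {"payment_id": "p3", "user_id": "u2", "amount": "14.99", "status": "failed"},
    {"payment_id": "p4", "user_id": "u3", "amount": "14.99", "status": "succeeded"},
]


def to_bronze(rows, batch_id):
    """Bronze: keep raw values untouched and attach ingestion metadata."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [{**r, "_batch_id": batch_id, "_loaded_at": loaded_at} for r in rows]


def to_silver(bronze_rows):
    """Silver: deduplicate on payment_id, cast amount, set rejects aside."""
    seen, clean, rejected = set(), [], []
    for r in bronze_rows:
        if r["payment_id"] in seen:
            continue  # keep only the first occurrence of each payment
        seen.add(r["payment_id"])
        try:
            amount = float(r["amount"])
        except ValueError:
            rejected.append(r)  # kept for a data quality summary
            continue
        clean.append({**r, "amount": amount})
    return clean, rejected


def to_gold(silver_rows):
    """Gold: revenue per user, counting succeeded payments only."""
    revenue = defaultdict(float)
    for r in silver_rows:
        if r["status"] == "succeeded":
            revenue[r["user_id"]] += r["amount"]
    return dict(revenue)


bronze = to_bronze(LANDING_PAYMENTS, batch_id="batch-001")  # hypothetical batch id
silver, rejected = to_silver(bronze)
gold = to_gold(silver)
```

Because each layer is a function of the previous one, a batch can be rerun from Bronze without touching the source again, which is one reason a batch-first design stays cheap to operate.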
Build notes
How it was shaped
Generated realistic subscription, content, session, watch, payment, funnel, device, region, and support-event data for a streaming business.
Implemented a local lakehouse path from landing to Bronze, Silver, Gold, serving CSVs, SQLite proof, compact analytics marts, and Metabase extracts.
Designed the Azure path around ADF ingestion, ADLS Gen2 storage, Databricks/PySpark transformation, Azure SQL serving views, and Metabase dashboard provisioning.
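The local serving step can be sketched in a few lines with Python's built-in sqlite3 module: load Gold rows into a table, then expose a dashboard-facing view on top of it. The table name, view name, columns, sample rows, and the churn formula (churned users divided by paid users) are assumptions for illustration, not the project's real schema or metric definitions.

```python
import sqlite3

# Hypothetical Gold rows: (month, plan, paid_users, churned_users).
GOLD_SUBSCRIPTIONS = [
    ("2026-01", "basic", 100, 5),
    ("2026-01", "premium", 50, 1),
    ("2026-02", "basic", 110, 11),
]


def build_serving_db(rows):
    """Load Gold rows into an in-memory SQLite database with a serving view."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_subscription_monthly ("
        "month TEXT, plan TEXT, paid_users INTEGER, churned_users INTEGER)"
    )
    conn.executemany(
        "INSERT INTO fact_subscription_monthly VALUES (?, ?, ?, ?)", rows
    )
    # Assumed metric: churn_pct = churned users / paid users, per month.
    conn.execute(
        """
        CREATE VIEW vw_monthly_churn AS
        SELECT month,
               SUM(paid_users) AS paid_users,
               SUM(churned_users) AS churned_users,
               ROUND(100.0 * SUM(churned_users) / SUM(paid_users), 2) AS churn_pct
        FROM fact_subscription_monthly
        GROUP BY month
        """
    )
    conn.commit()
    return conn


conn = build_serving_db(GOLD_SUBSCRIPTIONS)
churn = conn.execute(
    "SELECT month, paid_users, churned_users, churn_pct "
    "FROM vw_monthly_churn ORDER BY month"
).fetchall()
for row in churn:
    print(row)
```

Keeping the metric in a view means a dashboard tool queries one stable name while the underlying table can be reloaded on every batch run.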