Project Details
Client:
Personal / Academic Project
Tools:
Azure Data Factory, Azure Blob Storage, Azure Databricks (PySpark), Azure Cosmos DB, Power BI
Smart City IoT Data Engineering Pipeline – Azure
End-to-End Azure Pipeline for Real-Time Smart City Insights
This project implements a complete Azure-based data pipeline that ingests, processes, and visualizes IoT sensor data to support real-time urban decision-making for Melbourne city planners. The pipeline handles roughly 50 MB of sensor data per hour, covering air quality, traffic density, and energy consumption.
The solution architecture includes Azure Data Factory for orchestrated ingestion, Blob Storage for raw and processed zones, and Azure Databricks (PySpark) for data cleaning, schema casting, and aggregations. Curated data is stored in Azure Cosmos DB as structured JSON documents, allowing for low-latency querying and live API access. Final insights are visualized in Power BI, enabling city teams to monitor pollution spikes, infrastructure load, and after-dark mobility trends.
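The cleaning, schema-casting, and aggregation stage can be illustrated with a minimal sketch. It uses plain Python as a stand-in for the PySpark job, so it runs without a cluster. The field names (sensor_id, metric, value, ts), the sample readings, and the hourly-average rollup are all assumptions made for illustration; they are not taken from the project repository. The only Cosmos DB detail relied on is that each stored item carries a string "id" field.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical raw readings, shaped as they might arrive in the raw storage zone.
RAW_READINGS = [
    {"sensor_id": "aq-001", "metric": "pm25", "value": "12.5", "ts": "2024-01-01T10:05:00"},
    {"sensor_id": "aq-001", "metric": "pm25", "value": "17.5", "ts": "2024-01-01T10:40:00"},
    {"sensor_id": "aq-001", "metric": "pm25", "value": "n/a", "ts": "2024-01-01T10:50:00"},
    {"sensor_id": "tr-007", "metric": "vehicle_count", "value": "40", "ts": "2024-01-01T10:15:00"},
    {"sensor_id": "tr-007", "metric": "vehicle_count", "value": None, "ts": "2024-01-01T10:20:00"},
]


def clean_and_cast(record):
    """Cast a raw record to typed fields; return None if it cannot be parsed."""
    try:
        return {
            "sensor_id": str(record["sensor_id"]),
            "metric": str(record["metric"]),
            "value": float(record["value"]),
            "ts": datetime.fromisoformat(record["ts"]),
        }
    except (KeyError, TypeError, ValueError):
        # Missing field, null value, or unparseable number/timestamp: drop the row.
        return None


def aggregate_hourly(records):
    """Average readings per (sensor, metric, hour) and shape them as JSON-ready documents."""
    buckets = defaultdict(list)
    for rec in records:
        hour = rec["ts"].replace(minute=0, second=0, microsecond=0)
        buckets[(rec["sensor_id"], rec["metric"], hour)].append(rec["value"])

    documents = []
    for (sensor_id, metric, hour), values in sorted(buckets.items()):
        documents.append({
            # Cosmos DB items need a string "id"; this composite key is an assumption.
            "id": f"{sensor_id}-{metric}-{hour.isoformat()}",
            "sensor_id": sensor_id,
            "metric": metric,
            "hour": hour.isoformat(),
            "avg_value": sum(values) / len(values),
            "reading_count": len(values),
        })
    return documents


cleaned = [r for r in map(clean_and_cast, RAW_READINGS) if r is not None]
documents = aggregate_hourly(cleaned)
print(json.dumps(documents, indent=2))
```

In a Databricks notebook the same three steps would typically map to DataFrame operations (filtering invalid rows, casting columns, then grouping by sensor and time window), with the resulting rows written out as documents.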
In one example, the system detected a 15% increase in night-time foot traffic near Federation Square, which prompted immediate public lighting upgrades. The pipeline reduced reporting latency from 30 days to less than 24 hours, and it demonstrates hands-on competency across Azure services, real-time data flow, and scalable analytics engineering.
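A percentage increase like the one described above comes down to comparing a current count against a baseline. The sketch below shows that arithmetic only; the pedestrian counts and the 10% alert threshold are hypothetical values chosen for illustration, not figures from the project.

```python
def percent_change(baseline, current):
    """Return the change from baseline to current as a percentage."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) * 100.0 / baseline


# Hypothetical night-time pedestrian counts for one location.
baseline_count = 2000  # assumed average for the comparison period
current_count = 2300   # assumed count for the period under review

ALERT_THRESHOLD_PCT = 10.0  # assumed threshold, not taken from the project

change = percent_change(baseline_count, current_count)
flagged = change >= ALERT_THRESHOLD_PCT
print(f"change: {change:.1f}% flagged: {flagged}")
```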
Explore the project in full: https://github.com/dangquii/smart-city-data-engineering