Projects Showcase Timeline


Anomaly Detection Pipeline

Built an end-to-end anomaly detection system for mental health behavior analysis using DVC for data and experiment versioning. Trained Isolation Forest, LOF, and One-Class SVM models for early depression detection and deployed real-time scoring through a FastAPI service. Automated CI/CD with GitHub Actions and deployed containerized components on Kubernetes for scalable and reliable operations.

GitHub

Football Analysis using YOLO & OpenCV

Built an end-to-end football analytics system using YOLO object detection and OpenCV to track players, detect ball position, and estimate player speed and distance covered. Applied perspective correction and player clustering for team identification and performance metrics.
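
As a rough illustration of the speed and distance step, the sketch below assumes each tracked player already has one pitch position per frame, expressed in metres after perspective correction. The function name `speed_and_distance` and its parameters are hypothetical and are not taken from the project code.

```python
import math


def speed_and_distance(positions_m, fps):
    """Return (distance in metres, average speed in km/h) for one tracked player.

    positions_m: list of (x, y) pitch coordinates in metres, one per frame,
                 assumed to be perspective-corrected already.
    fps: frame rate of the source video, in frames per second.
    """
    if len(positions_m) < 2:
        return 0.0, 0.0

    # Sum the straight-line distance between consecutive frame positions.
    distance = sum(math.dist(a, b) for a, b in zip(positions_m, positions_m[1:]))

    # N positions span N - 1 frame intervals.
    elapsed_s = (len(positions_m) - 1) / fps

    # Convert metres per second to kilometres per hour.
    speed_kmh = distance / elapsed_s * 3.6
    return distance, speed_kmh


# Hypothetical track: three positions sampled at 2 frames per second.
distance, speed = speed_and_distance([(0, 0), (3, 4), (6, 8)], fps=2)
```

In practice, per-frame detections are noisy, so a real system would likely measure over a window of several frames rather than every consecutive pair.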

GitHub

Heart Attack Prediction Pipeline

Created a complete MLOps pipeline for heart attack prediction with robust preprocessing and feature engineering. Trained and benchmarked multiple ML and ensemble models using Scikit-learn, orchestrated workflows with ZenML, and logged experiments via MLflow. Deployed real-time inference with FastAPI and scaled the Dockerized pipeline on Kubernetes with automated CI/CD using GitHub Actions.

GitHub

ETL Pipeline on Databricks

Built a modular ETL pipeline in PySpark that reads from JSON, Parquet, and Avro sources through a factory pattern. Business logic is applied before the transformed data is written to sinks such as Parquet, DBFS, or other storage systems.
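
One way a reader factory like this might look is sketched below. The registry, the `register_reader` decorator, and `get_reader` are hypothetical names chosen for illustration; the only PySpark call assumed is `spark.read.format(...).load(path)`, and the Avro reader assumes the spark-avro package is available on the cluster.

```python
from typing import Any, Callable, Dict

# Registry mapping a source-format name to a reader function (hypothetical design).
_READERS: Dict[str, Callable[[Any, str], Any]] = {}


def register_reader(fmt: str):
    """Decorator that registers a reader function under a format name."""
    def decorator(fn):
        _READERS[fmt] = fn
        return fn
    return decorator


@register_reader("json")
def read_json(spark, path):
    return spark.read.format("json").load(path)


@register_reader("parquet")
def read_parquet(spark, path):
    return spark.read.format("parquet").load(path)


@register_reader("avro")
def read_avro(spark, path):
    # Assumes the spark-avro package is installed on the cluster.
    return spark.read.format("avro").load(path)


def get_reader(fmt: str) -> Callable[[Any, str], Any]:
    """Factory entry point: return the reader registered for the given format."""
    try:
        return _READERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"Unsupported source format: {fmt!r}") from None
```

With an active `SparkSession` named `spark`, a call such as `get_reader("parquet")(spark, path)` would return a DataFrame. Supporting a new source then means registering one more function, without touching the calling code.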

GitHub

Text Sentiment Analysis Pipeline

Developed a sentiment analysis pipeline for Yelp reviews with DVC-based data tracking and NLP preprocessing. Trained LightGBM, XGBoost, and BERT models for accurate sentiment scoring and exposed real-time prediction endpoints via FastAPI. Implemented CI/CD with GitHub Actions and deployed the Dockerized system on Kubernetes for scalable inference.

GitHub

Real-Time Data Streaming

Built a Dockerized real-time data pipeline using Kafka, Spark, and Cassandra to ingest and process user logs. Implemented schema validation with Confluent Schema Registry and enabled Kafka monitoring via Confluent Control Center.

GitHub