Projects
Predictive Maintenance Data Pipeline
UC Berkeley, 2026
Built a production-style data pipeline to process turbofan engine sensor data and surface fleet health insights for predictive maintenance.
- Built an end-to-end data engineering pipeline on Databricks using Medallion Architecture (Bronze, Silver, Gold) to process industrial sensor time-series data
- Ingested and transformed 160K+ rows of raw sensor data using PySpark, engineering 50+ features including rolling statistics, lag features, and degradation indicators
- Delivered a Lakeview BI dashboard showing fleet health trends, degradation patterns, and fault mode analysis
- Used Unity Catalog for data governance and schema enforcement across all pipeline layers
- Dataset: NASA CMAPSS Turbofan Engine Degradation (4 sub-datasets, 21 sensors per engine)
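The rolling-statistic and lag features mentioned above can be illustrated with a minimal sketch. It uses plain Python instead of PySpark so it runs without a cluster, and the column names (`unit`, `cycle`, `s1`), window size, and function name are assumptions made for illustration, not the project's actual schema.

```python
from collections import defaultdict, deque
from statistics import mean


def add_rolling_and_lag(rows, sensor="s1", window=3, lag=1):
    """Add a trailing rolling mean and a lag feature, computed per engine.

    rows: list of dicts with hypothetical keys 'unit', 'cycle', and `sensor`.
    """
    by_unit = defaultdict(list)
    for row in rows:
        by_unit[row["unit"]].append(row)

    out = []
    for unit_rows in by_unit.values():
        # Features must follow time order within each engine.
        unit_rows.sort(key=lambda r: r["cycle"])
        recent = deque(maxlen=window)  # last `window` readings, current one included
        history = []                   # all earlier readings for this engine
        for row in unit_rows:
            value = row[sensor]
            recent.append(value)
            enriched = dict(row)
            enriched[f"{sensor}_roll_mean_{window}"] = mean(recent)
            # No lag value exists for the first `lag` cycles of an engine.
            enriched[f"{sensor}_lag_{lag}"] = history[-lag] if len(history) >= lag else None
            history.append(value)
            out.append(enriched)
    return out


rows = [
    {"unit": 1, "cycle": 1, "s1": 10.0},
    {"unit": 1, "cycle": 2, "s1": 12.0},
    {"unit": 1, "cycle": 3, "s1": 14.0},
    {"unit": 2, "cycle": 1, "s1": 100.0},
]
features = add_rolling_and_lag(rows)
```

Grouping by engine before windowing keeps one engine's readings from leaking into another engine's rolling statistics.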
Databricks
PySpark
Delta Lake
Unity Catalog
Lakeview BI
Medallion Architecture
View on GitHub
Turbofan Engine RUL Prediction
UC Berkeley, 2026
R²=0.82 | RMSE=17.90 | Critical-zone F1=0.89
ML pipeline built on top of the data engineering project, predicting when turbofan engines will fail using sensor readings and engineered features.
- Built an ML pipeline on top of the data engineering Silver layer to predict engine Remaining Useful Life (RUL)
- Engineered 70+ features from 21 sensors: rolling statistics, lag features, rate-of-change, and operational indicators
- Trained XGBoost regression and classification models with hyperparameter tuning and cross-validation
- Tracked all experiments in MLflow, logging parameters, metrics, and artifacts; registered best models in Model Registry
- Capped RUL at 125 cycles (standard in CMAPSS literature) to focus the model on the critical degradation window
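The capped RUL target can be sketched as follows. This is a minimal illustration that assumes run-to-failure data where the last recorded cycle of an engine is its failure point; the function name is hypothetical.

```python
def capped_rul(cycles, cap=125):
    """Return capped Remaining Useful Life labels for one run-to-failure engine.

    cycles: cycle indices for a single engine; the largest one is assumed
    to be the failure cycle.
    """
    last_cycle = max(cycles)
    # Early in life the label stays flat at `cap`; it then falls linearly to 0.
    return [min(last_cycle - c, cap) for c in cycles]


# Toy engine that fails at cycle 200.
labels = capped_rul(list(range(1, 201)))
```

With the cap, every cycle more than 125 cycles from failure gets the same label, so the model is not asked to distinguish among healthy states that look alike in the sensor data.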
XGBoost
MLflow
Databricks
scikit-learn
Feature Engineering
Pandas
View on GitHub
Neural Networks for Energy-Optimized Drone Fleet Coordination
HiPeRLab, UC Berkeley, 2026 (Capstone)
Best test R²=0.676 | MAE=32.8 W (GRU on 209 outdoor flights)
Capstone research project at UC Berkeley's HiPeR Lab investigating which neural network architectures best predict quadrotor energy consumption from real flight telemetry.
- Built an end-to-end deep learning pipeline in PyTorch benchmarking 7 architectures (linear regression, MLP, RNN, GRU, LSTM, Bi-LSTM, Transformer) for real-time quadrotor power prediction
- Processed 240,000+ real-world flight sequences from two datasets: HiPeRLab rosbag data and the CMU Package Delivery Drone dataset (209 flights, 5 Hz, varying wind and payload conditions)
- Identified and fixed a critical data leakage bug in the validation split that was causing all models to collapse to predicting the training mean; implemented per-source stratified splitting
- Analyzed transformer self-attention weights, revealing that motor power is determined by the last 150 to 400 ms of flight state, a physically interpretable finding
- Used this insight to engineer rolling turbulence features, improving non-sequential model performance by up to 1.7% R²
- The GRU model was adopted as the energy prediction component of the team's fleet coordination system
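One plausible reading of per-source stratified splitting is sketched below: whole flights are assigned to train or validation within each data source, so both sources appear in both sets and no flight contributes windows to both. The function name, source labels, and split fraction are assumptions, not the project's code.

```python
import random
from collections import defaultdict


def split_flights_per_source(flights, val_fraction=0.2, seed=0):
    """Split flight IDs into train and validation sets, source by source.

    flights: list of (flight_id, source) pairs.
    """
    by_source = defaultdict(list)
    for flight_id, source in flights:
        by_source[source].append(flight_id)

    rng = random.Random(seed)
    train_ids, val_ids = set(), set()
    for _, ids in sorted(by_source.items()):
        ids = sorted(ids)  # fixed starting order keeps the shuffle reproducible
        rng.shuffle(ids)
        n_val = max(1, round(len(ids) * val_fraction))
        val_ids.update(ids[:n_val])
        train_ids.update(ids[n_val:])
    return train_ids, val_ids


# Hypothetical flight IDs for the two sources.
flights = [(f"hiperlab_{i}", "hiperlab") for i in range(10)]
flights += [(f"cmu_{i}", "cmu") for i in range(20)]
train_ids, val_ids = split_flights_per_source(flights)
```

Sequence windows would then be generated only after this split, from the flights in each set.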
PyTorch
RNN
GRU
LSTM
Transformer
NumPy
Pandas
SafetyIQ: Industrial Safety Documentation Assistant
UC Berkeley, 2026
A RAG application that lets engineers query industrial safety and equipment documentation using plain English, returning precise answers with source document and page number citations. Built for a safety-critical domain where hallucinated answers are unacceptable.
- Built a Retrieval-Augmented Generation pipeline using LangChain that ingests, chunks, embeds, and retrieves from a corpus of OSHA standards, API recommended practices, and equipment manuals
- Implemented MMR-based retrieval to balance relevance and diversity across a multi-document vector store, preventing redundant retrieval from single sections
- Constrained generation to the retrieved context with strict prompting and explicit refusal behavior, so the system declines to answer rather than fabricating safety or compliance information
- Supported single-document queries, cross-document retrieval (pulling from multiple sources in one answer), and correct refusal when information is not in the corpus
- Built a Streamlit web interface with source attribution cards showing document name and page number for every retrieved chunk
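The relevance and diversity trade-off behind MMR retrieval can be shown with a small standalone sketch. It is a plain-Python version of maximal marginal relevance over toy 2-D embeddings, not the LangChain retriever used in the project; the function names, vectors, and `lambda_mult` values are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Return indices of k candidates chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values penalize
    candidates that resemble chunks already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
# Chunks 0 and 1 are near-duplicates; chunk 2 covers different content.
chunks = [[1.0, 0.0], [0.99, 0.14], [0.6, 0.8]]
diverse = mmr_select(query, chunks, k=2, lambda_mult=0.3)
relevance_only = mmr_select(query, chunks, k=2, lambda_mult=1.0)
```

In this toy case the relevance-only ranking returns the two near-duplicate chunks, while the lower `lambda_mult` swaps the second one for the chunk that adds new content.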
LangChain
OpenAI
ChromaDB
RAG
Streamlit
PyMuPDF
Python
View on GitHub
Multi-Agent Cooperative Defense using Deep Reinforcement Learning
UC Berkeley, 2026
2.4x survival improvement | 15x threat neutralization over baselines
Built a full simulation and RL training pipeline to study how 20 autonomous agents can coordinate to defend a target, comparing learned strategies against classical approaches.
- Designed a custom multi-agent simulation environment from scratch in Python with 20 autonomous agents, collision avoidance, and configurable threat dynamics
- Implemented a Deep Q-Network (DQN) agent in PyTorch with CNN architecture, experience replay buffer (200K capacity), target network stabilization, and epsilon-greedy exploration with tuned decay schedules
- Trained using centralized training with decentralized execution (CTDE): 20 agents sharing a single CNN but acting independently on local 21x21 observations
- Built two rule-based baselines: Voronoi tessellation (SciPy) for geometric territory partitioning and Hungarian algorithm for globally optimal agent-threat matching
- Discovered emergent defender-interceptor role specialization arising from reward shaping and physical constraints, without any explicit role assignment
- Built a unified evaluation framework with fixed-seed reproducibility and real-time Pygame visualization
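Two of the DQN components listed above, the experience replay buffer and epsilon-greedy exploration, can be sketched without a deep learning framework. The linear decay schedule, its constants, and all names below are illustrative assumptions; only the 200K capacity comes from the project description.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity transition store; the oldest transitions are evicted first."""

    def __init__(self, capacity=200_000, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement. Copying to a list keeps the
        # sketch simple but costs O(n) per call.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_at(step, start=1.0, end=0.05, decay_steps=10_000):
    """Linearly decay epsilon from `start` to `end` over `decay_steps` steps."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


def select_action(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(2)
greedy = select_action([0.1, 0.9, 0.3], epsilon=0.0, rng=random.Random(0))
```

Under the shared-network setup described above, each agent would call `select_action` on Q-values computed from its own local observation, while all agents push transitions into one buffer.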
PyTorch
DQN
SciPy
Pygame
NumPy
Multi-Agent RL
Autonomous Surface Vessel: Waypoint Tracking and Control
IIT Madras, 2024-2025
Developed a complete autonomy stack for a surface vessel covering dynamics, estimation, control, and guidance, and deployed it on physical hardware.
- Deployed and validated a full ROS-based autonomy stack on a physical autonomous surface vessel
- Implemented 6-DoF vessel maneuvering dynamics and an Extended Kalman Filter (EKF) for state estimation
- Developed PID heading control and waypoint tracking using an Integral Line-of-Sight guidance algorithm
- Tested and debugged the full system on a physical ASV testbed, dealing with real sensor noise and environmental disturbances
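The guidance and control loop can be sketched in a few lines. This is a textbook-style Integral Line-of-Sight law feeding a PID heading controller, written under assumed gains, lookahead distance, and a flat 2-D frame; it is not the code deployed on the vessel.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


class IntegralLOS:
    """Integral line-of-sight guidance for a straight path segment."""

    def __init__(self, lookahead=10.0, sigma=0.1):
        self.lookahead = lookahead  # lookahead distance in metres (assumed)
        self.sigma = sigma          # integral gain (assumed)
        self.y_int = 0.0            # integral state for steady drift

    def desired_heading(self, pos, wp_prev, wp_next, dt):
        path_angle = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
        dx, dy = pos[0] - wp_prev[0], pos[1] - wp_prev[1]
        # Signed distance from the path, positive to the path's left.
        cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
        corrected = cross_track + self.sigma * self.y_int
        heading = path_angle - math.atan(corrected / self.lookahead)
        # The integral update slows down when the vessel is far from the path.
        self.y_int += dt * self.lookahead * cross_track / (
            self.lookahead ** 2 + corrected ** 2
        )
        return wrap_angle(heading), cross_track


class HeadingPID:
    """PID on heading error, with the derivative term taken from yaw rate."""

    def __init__(self, kp=2.0, ki=0.0, kd=0.5):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0

    def command(self, desired, measured, yaw_rate, dt):
        error = wrap_angle(desired - measured)
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral - self.kd * yaw_rate


guidance = IntegralLOS()
pid = HeadingPID()
# Vessel 5 m to the left of a path running along the x-axis, heading along it.
psi_d, e = guidance.desired_heading((10.0, 5.0), (0.0, 0.0), (100.0, 0.0), dt=0.1)
rudder_cmd = pid.command(psi_d, measured=0.0, yaw_rate=0.0, dt=0.1)
```

The integral state lets the guidance law hold a small crab angle against steady current or wind, which a purely proportional line-of-sight law cannot do.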
ROS
Python
MATLAB
EKF
PID
Controls
Hydrofoil Vessel: Design, Control, and Manufacturing
IIT Madras, 2024-2025 | Published at ASME OMAE 2025, Vancouver
89% drag reduction (7.47 kN to 812 N) | Stable at 19.4 knots
Full-lifecycle hydrofoil vessel project, from foil design and active ride control through manufacturing and physical testing in a towing tank.
- Led a team through the full lifecycle: foil geometry optimization, 3-DoF dynamics modeling, LQR controller design, physical manufacturing, and towing tank testing
- Optimized a NACA 2415 foil profile for target speed and displacement using MATLAB and low-order flow simulation
- Designed an LQR controller that modulates foil angle of attack in real time to maintain stable ride height with near-zero pitch
- Took the project from equations on paper to a physical vessel tested in water
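The idea behind the LQR design can be illustrated on a deliberately tiny problem. The sketch below solves a scalar discrete-time LQR by iterating the Riccati equation; the one-state model and all numbers are assumptions chosen for illustration and bear no relation to the vessel's 3-DoF dynamics or the MATLAB design.

```python
def scalar_dlqr(a, b, q, r, iterations=500):
    """Solve a scalar discrete-time LQR problem by Riccati iteration.

    Model: x[k+1] = a * x[k] + b * u[k], cost: sum of q * x^2 + r * u^2.
    Returns (gain, p) for the feedback law u = -gain * x.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    gain = a * b * p / (r + b * b * p)
    return gain, p


# Toy open-loop-unstable system (a > 1), e.g. a ride-height error that grows.
a, b = 1.1, 0.5
gain, p = scalar_dlqr(a, b, q=1.0, r=0.1)
closed_loop_pole = a - b * gain
```

A closed-loop pole inside the unit circle means the feedback law stabilizes this toy system; raising `r` relative to `q` trades tracking accuracy for less actuator effort.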
MATLAB
LQR
Dynamics Modeling
Hydrodynamics
Manufacturing
AURA-SYNTH: Real-Time Gestural Audio Engine
UC Berkeley, 2026
Audio processing software for a self-contained gestural music instrument that translates hand movements into real-time sound synthesis and sample playback.
- Designed and built the complete audio processing software for a gesture-controlled musical instrument running entirely on an ESP32 microcontroller with no external computer
- Architected a dual-core real-time audio engine processing at 44.1 kHz stereo using FreeRTOS, with zero-allocation memory management via a pre-allocated block pool
- Implemented a multi-stage DSP pipeline: wavetable oscillators (sine + detuned sawtooth), biquad IIR low-pass filters, LFO tremolo modulation, pitch bend, and PSRAM-backed delay effects
- Built a polyphonic sample playback engine reading flash-embedded WAV files, supporting 8 simultaneous voices with oldest-voice stealing and pitch shifting across 8 time-of-flight sensor zones
- Designed a multi-track gesture loop recorder storing compact control frames at 86 Hz in PSRAM, with a record/overdub/playback state machine for live layering
- Defined the cross-core communication protocol (mutex-protected shared state struct) enabling independent development across a 4-person team: Core 0 for sensing, Core 1 for audio
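The biquad IIR low-pass stage of the DSP pipeline can be sketched as below. It is written in Python for readability, whereas the project itself is in C, and it uses the widely published Audio EQ Cookbook low-pass formulas as an assumption; the cutoff, Q, and names are illustrative.

```python
import math


def lowpass_coefficients(cutoff_hz, sample_rate=44100.0, q=0.7071):
    """Return normalized biquad low-pass coefficients (b0, b1, b2), (a1, a2)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = ((1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


class Biquad:
    """Second-order IIR section in transposed direct form II."""

    def __init__(self, b, a):
        self.b0, self.b1, self.b2 = b
        self.a1, self.a2 = a
        self.z1 = 0.0
        self.z2 = 0.0

    def process(self, x):
        # Two state variables carry the filter memory between samples.
        y = self.b0 * x + self.z1
        self.z1 = self.b1 * x - self.a1 * y + self.z2
        self.z2 = self.b2 * x - self.a2 * y
        return y


b, a = lowpass_coefficients(cutoff_hz=1000.0)

# A constant (DC) input should pass through with unity gain.
dc_filter = Biquad(b, a)
for _ in range(2000):
    dc_out = dc_filter.process(1.0)

# A signal alternating at the Nyquist frequency should be removed.
nyquist_filter = Biquad(b, a)
for n in range(2000):
    nyquist_out = nyquist_filter.process(1.0 if n % 2 == 0 else -1.0)
```

On a microcontroller the same per-sample update would typically run over fixed-size sample blocks, with coefficients recomputed only when the cutoff changes.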
C
ESP-IDF
FreeRTOS
I2S/DMA
PSRAM
DSP
ESP32