I am currently a Graduate Research Assistant (GRA) at Kennesaw State University, working under the supervision of Dr. Honghui Xu. My research centers on advancing Large Language Models (LLMs), Multimodal AI, and privacy-preserving machine learning. I develop enhanced multimodal LLMs through LoRA fine-tuning and Dual-Differential Privacy, improving their robustness and reducing harmful or inaccurate generations. I also design lightweight YOLO-based UAV vision models optimized for efficient inference on edge hardware.
Prior to KSU, I worked as a Junior Research Engineer at the DIU NLP & ML Research Lab in Bangladesh, where I built deep learning models for MRI brain tumor classification and segmentation, contributing to improved diagnostic support systems. I also applied transfer learning to rare-bird species detection, supporting biodiversity research through scalable and reliable computer vision solutions.
Earlier, I served as a Software Developer at Masleap Plc., where I developed full-stack applications using ReactJS, Django, and MongoDB. I integrated Intel OpenVINO for real-time video and image analytics, enhancing system responsiveness and supporting data-driven decisions in business operations.
My journey into AI began as a Machine Learning Engineer Intern at Nazpev Inc. in Japan, where I worked on time-series forecasting for retail sales data to support more accurate and reliable demand planning workflows.
Across all my roles, I have enjoyed working at the intersection of AI research, engineering, and real-world impact. I am passionate about building intelligent systems that are technically robust, privacy-aware, and aligned with human needs. I continually strive to bring meaningful, ethical, and scalable AI solutions to life.
Developed a YOLO-based detection framework for UAV-based post-disaster building damage assessment, integrating DP-SGD and structured pruning to jointly optimize accuracy, privacy, and deployment efficiency on edge devices.
Trained DeepLabV3+ (ResNet-101) for road-scene segmentation across fog, night, rain, and snow conditions, using weather-aware augmentations and batch-normalization (BN) adaptation to improve robustness and stability for autonomous driving perception.
Built a multimodal Retrieval-Augmented Generation pipeline that combines chest X-rays and clinical reports using Qwen2-VL, CLIP, BM25, FAISS, and a cross-encoder reranker, with a FastAPI backend and Gradio UI for interactive medical question answering.
Implemented Dual-Differential Privacy with two-stage noise injection (embedding-level and LoRA parameter-level) for multimodal LLMs, evaluating accuracy, hallucination rate, and privacy budget to study privacy–utility trade-offs in image–text alignment.
Developed a lightweight, privacy-preserving multimodal framework that fuses UAV imagery and social-media text for disaster event classification. The framework combines ResNet50 visual features with BiLSTM text features built on GloVe embeddings in a late-fusion architecture, and uses DP-SGD for ε-differential privacy and structured neuron pruning for efficient edge deployment on UAVs and field devices.
Benchmarked ARIMA, Prophet, LSTM, and XGBoost on the UCI Online Retail II dataset to build a robust revenue forecasting pipeline, and designed a pricing optimization engine with demand elasticity modeling to support data-driven pricing and retention strategies.
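Several of the projects above rely on DP-SGD. As a rough illustration of the idea, here is a minimal sketch of one DP-SGD gradient aggregation step in plain Python: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping norm is added, and the result is averaged. This is not the code from those projects. The function names (`clip_gradient`, `dp_sgd_gradient`) and parameters (`max_norm`, `noise_multiplier`) are hypothetical names chosen for illustration, and the sketch does not perform the privacy accounting needed to report an ε value.

```python
import math
import random
from typing import List, Sequence


def clip_gradient(grad: Sequence[float], max_norm: float) -> List[float]:
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(
    per_example_grads: Sequence[Sequence[float]],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Return a noisy average gradient for one batch (illustrative only).

    Steps: clip each example's gradient, sum the clipped gradients,
    add Gaussian noise with std = noise_multiplier * max_norm,
    then divide by the batch size.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


if __name__ == "__main__":
    # Toy batch of two 2-dimensional gradients (made-up numbers).
    grads = [[3.0, 4.0], [0.3, 0.4]]
    rng = random.Random(0)
    print(dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

In practice, a library such as Opacus handles per-example gradients and privacy accounting for PyTorch models; the sketch only shows the clip-then-noise step that limits how much any single training example can influence an update.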