cv
Research Objective
A highly motivated researcher with extensive experience in Natural Language Processing, Generative AI, and Deep Learning, evidenced by multiple peer-reviewed publications. Seeking to pursue a PhD to develop novel multi-modal models and explore their applications in complex reasoning and misinformation detection.
Education
-
2018 - 2023 B.Sc. in Computer Science and Engineering
Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh - CGPA: 3.27 / 4.00
- Relevant Coursework: Linear Algebra, Data Structures and Algorithms, Object Oriented Programming, Discrete Mathematics, Database Management, Applied Statistics & Queuing Theory, Digital Image Processing, Neural Network and Fuzzy System, Artificial Intelligence, Data Mining
Research Experience
-
2024 - Present Research Member
Young Learners' Research Lab (YLRL), Rajshahi, Bangladesh - Conducted interdisciplinary research in NLP, Generative AI, and misinformation detection, resulting in five peer-reviewed publications.
- Collaborated on projects focusing on low-resource NLP for the Bengali language, analysis of political discourse, and classification of AI-generated text.
- Explored cutting-edge ML/DL applications with lab collaborators, in line with the lab's mission to foster impactful research.
Publications
-
Journal Articles
- K. Debanath, S. Aich and A. Y. Srizon, "Bayesian Physics-Informed Neural Networks for Parameter Inference and Uncertainty Quantification in Reaction-Diffusion Models of Wound Healing," under review, Mathematical Biosciences (July 2025). Preprint available at SSRN or DOI.
-
Conference Papers
- K. Debanath, A. F. M. M. Rahman and M. A. Hossain, "An Attention-Based Deep Learning Approach to Knee Injury Classification from MRI Images," 2023 26th International Conference on Computer and Information Technology (ICCIT), Cox's Bazar, Bangladesh, 2023, pp. 1-6, doi: 10.1109/ICCIT60459.2023.10441340.
- K. Debanath, S. Aich and A. Y. Srizon, "Advancing Low-Resource NLP: Contextual Question Answering for Bengali Language Using Llama," 2025 International Conference on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 2025, pp. 1-6, doi: 10.1109/ECCE64574.2025.11013841.
- S. Aich, K. Debanath and A. Y. Srizon, "Distinguishing Between Formal and Colloquial: A Multilingual BERT Approach to Bengali Language Classification," 2025 International Conference on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 2025, pp. 1-6, doi: 10.1109/ECCE64574.2025.11013999.
- K. Debanath, S. Aich and A. Y. Srizon, "Analyzing Bot Activity and Political Discourse in the 2024 U.S. Presidential Election: A Machine Learning Approach to Misinformation and Manipulation," 2nd International Conference on Next-Generation Computing, IoT and Machine Learning (NCIM-2025), Gazipur, Bangladesh, 2025, pp. 1-6, doi: 10.1109/NCIM65934.2025.11160229.
- S. Aich, K. Debanath, and A. Y. Srizon, "Distinguishing Human-Written and AI-Generated Text: A Comprehensive Study Using Explainable Artificial Intelligence in Text Classification," 2nd International Conference on Next-Generation Computing, IoT and Machine Learning (NCIM-2025), Gazipur, Bangladesh, 2025, pp. 1-6, doi: 10.1109/NCIM65934.2025.11160309.
- K. Debanath, "Physics-Informed Neural Networks for Real-Time Anomaly Detection in Power System Dynamics," accepted, to appear in 3rd International Conference on Big Data, IoT and Machine Learning (BIM 2025).
Experience
-
Apr 2025 - Present Software Engineer I
Universal Machine Inc., Sunnyvale, CA, USA (Remote) - YouTube Live Stream Bot:
- Developed a Chrome extension that automates YouTube Live chat using JavaScript, Chrome APIs, and asynchronous requests.
- Integrated YouTube & OpenAI APIs for real-time chat fetching/posting and AI response generation.
- Engineered AI features that manage conversation history (chrome.storage) and apply prompt engineering for context and recall.
- Implemented secure Google OAuth (chrome.identity) and robust error handling for external APIs.
- cBORG DAO Governance Platform:
- Built a full-stack decentralized governance platform using React/Next.js, FastAPI, PostgreSQL, and Ethereum smart contracts for community proposal voting and treasury management.
- Integrated OpenAI GPT-4o to automatically parse natural language chat messages into structured trading proposals (buy/sell/hold) with confidence scoring and real-time voting.
- Implemented SIWE (Sign-In With Ethereum) wallet linking with nonce-based authentication, JWT tokens, and privacy-preserving user identity management.
- Developed live chat with proposal detection, voting dashboards, and mobile-responsive UI using Socket.io, Tailwind CSS, and modern React patterns.
- Created Solidity smart contracts for automated proposal execution and member verification, deployed on Ethereum testnet with Hardhat development framework.
- Implemented rate limiting, CORS protection, encrypted sessions with Redis, and comprehensive authentication flows for secure Web3 application deployment.
-
Mar 2023 - Apr 2025 Data Scientist
Manaknightdigital Inc., Toronto, ON, Canada (Remote) - Chatbot Development:
- Collected and processed product information using Excel, pandas, and openpyxl.
- Integrated GPT-4 to answer user queries while managing token-size limits.
- Used nltk, sklearn, and Flask to build and deploy the chatbot.
- Fraud Detection System:
- Performed EDA and feature extraction on transaction datasets.
- Developed and optimized ML models including XGBoost, SVC, and Logistic Regression.
- Achieved 90% accuracy in detecting fraudulent transactions and deployed the system using Flask.
- Data-driven ChatBot for Financial Queries:
- Implemented RAG with Pinecone, improving data retrieval speed by 40% and enabling faster decision-making for lenders.
- Improved data retrieval accuracy by 25% using Cohere reranking, resulting in more precise financial advice.
- Applied Beautiful Soup and PyPDF2 for data scraping and processing.
- Sports Data Analysis ChatBot:
- Scraped and analyzed football data to predict match outcomes.
- Integrated RAG and Pinecone for efficient data querying and vector database management.
- Used Beautiful Soup and PyPDF2 for data collection and analyzed 2 million football data points, achieving 90% prediction accuracy in support of strategic betting decisions.
- Custom Image Generation System:
- Developed an image generation platform using Stable Diffusion.
- Fine-tuned custom models to generate images based on user-defined presets.
- Used PyTorch and transformers for model training and deployment, with Docker for containerization.
- AI-driven Data Matching System:
- Segmented organizational data using models such as Llama-2-7B, which were then fine-tuned to extract sections and subsections.
- Applied cosine similarity for matching data to specific tenders.
- Integrated GPT-4 for generating rationale from corresponding data.
- Matched organizational data against specific tenders, increasing successful tender submissions by 70%.
- AI-Powered Collectible Authentication & Appraisal Platform:
- Trained deep learning models (PyTorch/TensorFlow, e.g., InceptionV3, ResNet50, CLIP) for image classification (authenticity) and similarity search.
- Engineered an efficient CLIP+FAISS image similarity system for large-scale appraisal lookups.
- Developed Flask/FastAPI APIs to serve model predictions (classification, similarity, appraisal).
- Designed a multi-modal tag identification system using Serverless (RunPod API), TF-IDF, and CLIP/FAISS similarity.
- Implemented asynchronous data pipelines (aiohttp, asyncio, pandas) for large-scale image and metadata ingestion from APIs.
- Developed a Streamlit web application for user image uploads and displaying similarity/appraisal results via API calls.
Projects
-
2024 AI Investment Committee for Binance
- Designed a multi-agent AI system with specialized research, trading, and risk agents that synthesize market data into actionable cryptocurrency recommendations.
- Tech Stack: Python, OpenAI/Gemini API, Binance API, Streamlit, Pydantic.
-
2023 Stock Price Forecasting Dashboards
- Engineered LSTM pipelines to forecast Bangladeshi and global equities, packaging the workflows into interactive Streamlit dashboards for retail investors.
- Tech Stack: Python, TensorFlow, Keras, LSTM, Pandas, Plotly, Streamlit, bdshare.
-
2024 AI vs Human Generated Text Detector
- Built a web application that classifies whether text is human-written or AI-generated using Machine Hack’s LLM Hackathon dataset.
- Trained and tuned an SVC model after extensive preprocessing, feature engineering, and explainability analysis.
- Tech Stack: Flask, Scikit-learn, Python, NumPy, Pandas, Matplotlib.
-
2024 DataSciencePilot (RAG System)
- Built a chat-based interface that lets users query custom PDFs via Pinecone vector search paired with LLaMA-2 generation.
- Tech Stack: LangChain, Transformers, LLaMA-2, Pinecone, Python.
-
2024 CVAnalyzerPro
- Developed an AI assistant that benchmarks candidate CVs against job descriptions, surfaces alignment scores, and highlights gaps automatically.
- Tech Stack: OpenAI API, Gemini API, Streamlit.
-
2023 UberRidePrediction
- Packaged an XGBoost fare estimation pipeline as a reusable Python library and deployed inference APIs with FastAPI for real-time fare predictions.
- Tech Stack: Scikit-learn, XGBoost, CI/CD, FastAPI, Render.
-
2024 Pinecone Integration Suite
- Authored PineconeUtils and PineconePDFExtractor libraries that simplify ingestion, chunking, and indexing workflows for Retrieval-Augmented Generation pipelines.
- Tech Stack: Pinecone, Cohere, OpenAI, PyPDF2.
-
2024 CaptionCraft
- Created a Streamlit experience for generating rich image captions using Google Gemini Pro Vision, enabling content teams to draft posts rapidly.
- Tech Stack: Gemini, Streamlit, Python.
-
2023 Market Price Prediction Suite
- Implemented and compared multiple time-series models (ARIMA, SARIMAX, LSTM, GRU, XGBoost, Prophet) to forecast commodity price trends.
- Tech Stack: Python, TensorFlow, Keras, XGBoost, Prophet.
-
2023 Movie Recommendation Engine
- Built a cosine-similarity KNN recommender that suggests films based on user-selected favorites with a lightweight Flask front end.
- Tech Stack: Scikit-learn, Pandas, Flask, SciPy.
-
2022 Potato Disease Classification
- Developed a convolutional neural network that achieved near-100% accuracy diagnosing potato leaf diseases from images.
- Tech Stack: TensorFlow, Keras, CNN.
-
2022 Diabetes Prediction (PyTorch ANN)
- Built and deployed a PyTorch-based neural network that predicts diabetes risk using clinical features and serves results via Flask + Gunicorn.
- Tech Stack: PyTorch, Flask, Gunicorn, Pandas.
Skills
-
Languages
- Python (Expert), C/C++, Java, JavaScript, SQL, MATLAB
-
AI/ML Frameworks
- PyTorch, TensorFlow, Keras, Scikit-learn, LangChain, Transformers, OpenCV
-
AI/ML Expertise
- Generative AI (LLMs, RAG, Fine-tuning), NLP, Computer Vision, Deep Learning, Time Series Analysis, Prompt Engineering, Explainable AI (XAI), Data Mining
-
Tools & Platforms
- Git, Docker, FastAPI, Flask, Django, CI/CD, MLOps, Pinecone, MongoDB, MySQL, SQLite
Competitions & Achievements
- Hackathon Champion at Machine Hack: Global Ranking 539 out of 8,861.
- Data Science Student Championship: Secured 7th position among 1,029 participants.
- LLM Hackathon (Decoding Discourse - AI vs Human): Ranked 5th out of 227 participants.
- Rental Bikes Volume Prediction Hackathon: Ranked 3rd.
- News Category Prediction Hackathon: Ranked 7th.
- Predicting House Prices in Bengaluru: Ranked 24th out of 2,885 participants with 87% accuracy.
- Subscriber Prediction Talent Search Hackathon: Ranked 26th out of 5,045 participants.
- Analytics Olympiad 2022: Ranked 82nd out of 1,029 participants.
- Data Science Student Championship - South Zone: Ranked 73rd out of 554 participants.
- Decoding Discourse - AI vs Human: Ranked 5th out of 293 participants.
Open Source Contributions
-
2024 OpenLLMetry PR
- Resolved a bug where Python data classes passed as parameters were not being serialized and logged in workflows and tasks.
- Implemented proper serialization support for dataclasses, ensuring they are correctly captured as inputs and outputs in observability logs.
- Added automated tests to verify serialization behavior and prevent regressions.
-
2024 OpenLLMetry PR
- Contributed to OpenLLMetry by fixing a TypeError in the OpenAI embeddings metrics handler caused by comparisons between NoneType and integers; implemented proper handling of None values with error logging.
- Added automated tests to validate the fix and ensure the robustness of embeddings metrics processing.
- Improved overall stability by preventing this error from impacting workflow execution.
-
2024 Pinecone Canopy Contribution
- Contributed to Pinecone Canopy, a Retrieval-Augmented Generation (RAG) framework.
Certifications & Professional Development
-
2024 Understanding and Applying Text Embeddings
DeepLearning.AI - A comprehensive short course on the end-to-end development of applications using text embeddings. Key topics included:
- Fundamentals of creating, understanding, and visualizing embedding spaces.
- Leveraging embeddings for practical applications like semantic search and retrieval.
- Building a complete Q&A system (Retrieval-Augmented Generation) using Google's Vertex AI.