cv

Research Objective

A highly motivated researcher with extensive experience in Natural Language Processing, Generative AI, and Deep Learning, evidenced by multiple peer-reviewed publications. Seeking to pursue a PhD to develop novel multi-modal models and explore their applications in complex reasoning and misinformation detection.

Education

  • 2018 - 2023
    B.Sc. in Computer Science and Engineering
    Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
    • CGPA: 3.27 / 4.00
    • Relevant Coursework: Linear Algebra, Data Structures and Algorithms, Object Oriented Programming, Discrete Mathematics, Database Management, Applied Statistics & Queuing Theory, Digital Image Processing, Neural Network and Fuzzy System, Artificial Intelligence, Data Mining

Research Experience

  • 2024 - Present
    Research Member
    Young Learners' Research Lab (YLRL), Rajshahi, Bangladesh
    • Conducted interdisciplinary research in NLP, Generative AI, and misinformation detection, resulting in five peer-reviewed publications.
    • Collaborated on projects focusing on low-resource NLP for the Bengali language, analysis of political discourse, and classification of AI-generated text.
    • Engaged in a collaborative environment to explore cutting-edge ML/DL applications, aligning with the lab's mission to foster impactful research.

Publications

Experience

  • Apr 2025 - Present
    Software Engineer I
    Universal Machine Inc., Sunnyvale, CA, USA (Remote)
    • YouTube Live Stream Bot:
      • Developed a Chrome extension that automates YouTube Live chat using JavaScript, Chrome APIs, and async requests.
      • Integrated YouTube & OpenAI APIs for real-time chat fetching/posting and AI response generation.
      • Engineered AI features that manage conversational history (chrome.storage) and use prompt engineering to preserve context and recall.
      • Implemented secure Google OAuth (chrome.identity) and robust error handling for external APIs.
    • cBORG DAO Governance Platform:
      • Built a full-stack decentralized governance platform using React/Next.js, FastAPI, PostgreSQL, and Ethereum smart contracts for community proposal voting and treasury management.
      • Integrated OpenAI GPT-4o to automatically parse natural language chat messages into structured trading proposals (buy/sell/hold) with confidence scoring and real-time voting.
      • Implemented SIWE (Sign-In With Ethereum) wallet linking with nonce-based authentication, JWT tokens, and privacy-preserving user identity management.
      • Developed live chat with proposal detection, voting dashboards, and mobile-responsive UI using Socket.io, Tailwind CSS, and modern React patterns.
      • Created Solidity smart contracts for automated proposal execution and member verification, deployed on Ethereum testnet with Hardhat development framework.
      • Implemented rate limiting, CORS protection, encrypted sessions with Redis, and comprehensive authentication flows for secure Web3 application deployment.
  • Mar 2023 - Apr 2025
    Data Scientist
    Manaknightdigital Inc., Toronto, ON, Canada (Remote)
    • Chatbot Development:
      • Collected and processed product information using Excel, pandas, and openpyxl.
      • Integrated GPT-4 to respond to user queries while managing token-size limitations.
      • Used libraries such as nltk, sklearn, and Flask to deploy the chatbot.
    • Fraud Detection System:
      • Performed EDA and feature extraction on transaction datasets.
      • Developed and optimized ML models including XGBoost, SVC, and Logistic Regression.
      • Achieved 90% accuracy in detecting fraudulent transactions and deployed the system using Flask.
    • Data-driven ChatBot for Financial Queries:
      • Implemented RAG with Pinecone, improving data retrieval speed by 40% and enabling faster decision-making for lenders.
      • Improved data retrieval accuracy by 25% using Cohere reranking, resulting in more precise financial advice.
      • Applied Beautiful Soup and PyPDF2 for data scraping and processing.
    • Sports Data Analysis ChatBot:
      • Scraped and analyzed football data to predict match outcomes.
      • Integrated RAG and Pinecone for efficient data querying and vector database management.
      • Employed Beautiful Soup and PyPDF2 for data collection and analyzed 2 million football data points, achieving 90% prediction accuracy in support of strategic betting decisions.
    • Custom Image Generation System:
      • Developed an image generation platform using Stable Diffusion.
      • Fine-tuned custom models to generate images based on user-defined presets.
      • Used PyTorch and transformers for model training and deployment, with Docker for containerization.
    • AI-driven Data Matching System:
      • Segmented organizational data using models such as Llama-2-7B, which were then fine-tuned to extract sections and subsections.
      • Applied cosine similarity for matching data to specific tenders.
      • Integrated GPT-4 for generating rationale from corresponding data.
      • Matched organizational data against specific tenders, increasing successful tender submissions by 70%.
    • AI-Powered Collectible Authentication & Appraisal Platform:
      • Trained deep learning models (PyTorch/TensorFlow, e.g., InceptionV3, ResNet50, CLIP) for image classification (authenticity) and similarity search.
      • Engineered an efficient CLIP+FAISS image similarity system for large-scale appraisal lookups.
      • Developed Flask/FastAPI APIs to serve model predictions (classification, similarity, appraisal).
      • Designed a multi-modal tag identification system using Serverless (RunPod API), TF-IDF, and CLIP/FAISS similarity.
      • Implemented asynchronous data pipelines (aiohttp, asyncio, pandas) for large-scale image and metadata ingestion from APIs.
      • Developed a Streamlit web application for user image uploads and displaying similarity/appraisal results via API calls.

Projects

  • 2024
    AI Investment Committee for Binance
    • Designed a multi-agent AI system with specialized research, trading, and risk agents that synthesize market data into actionable cryptocurrency recommendations.
    • Tech Stack: Python, OpenAI/Gemini API, Binance API, Streamlit, Pydantic.
  • 2023
    Stock Price Forecasting Dashboards
    • Engineered LSTM pipelines to forecast Bangladeshi and global equities, packaging the workflows into interactive Streamlit dashboards for retail investors.
    • Tech Stack: Python, TensorFlow, Keras, LSTM, Pandas, Plotly, Streamlit, bdshare.
  • 2024
    AI vs Human Generated Text Detector
    • Built a web application that classifies whether text is human-written or AI-generated using Machine Hack’s LLM Hackathon dataset.
    • Trained and tuned an SVC model after extensive preprocessing, feature engineering, and explainability analysis.
    • Tech Stack: Flask, Scikit-learn, Python, NumPy, Pandas, Matplotlib.
  • 2024
    DataSciencePilot (RAG System)
    • Built a chat-based interface that lets users query custom PDFs via Pinecone vector search paired with LLaMA-2 generation.
    • Tech Stack: LangChain, Transformers, LLaMA-2, Pinecone, Python.
  • 2024
    CVAnalyzerPro
    • Developed an AI assistant that benchmarks candidate CVs against job descriptions, surfaces alignment scores, and highlights gaps automatically.
    • Tech Stack: OpenAI API, Gemini API, Streamlit.
  • 2023
    UberRidePrediction
    • Packaged an XGBoost fare estimation pipeline as a reusable Python library and deployed inference APIs with FastAPI for real-time fare predictions.
    • Tech Stack: Scikit-learn, XGBoost, CI/CD, FastAPI, Render.
  • 2024
    Pinecone Integration Suite
    • Authored PineconeUtils and PineconePDFExtractor libraries that simplify ingestion, chunking, and indexing workflows for Retrieval-Augmented Generation pipelines.
    • Tech Stack: Pinecone, Cohere, OpenAI, PyPDF2.
  • 2024
    CaptionCraft
    • Created a Streamlit experience for generating rich image captions using Google Gemini Pro Vision, enabling content teams to draft posts rapidly.
    • Tech Stack: Gemini, Streamlit, Python.
  • 2023
    Market Price Prediction Suite
    • Implemented and compared multiple time-series models (ARIMA, SARIMAX, LSTM, GRU, XGBoost, Prophet) to forecast commodity price trends.
    • Tech Stack: Python, TensorFlow, Keras, XGBoost, Prophet.
  • 2023
    Movie Recommendation Engine
    • Built a cosine-similarity KNN recommender that suggests films based on user-selected favorites with a lightweight Flask front end.
    • Tech Stack: Scikit-learn, Pandas, Flask, SciPy.
  • 2022
    Potato Disease Classification
    • Developed a convolutional neural network that achieved near-100% accuracy diagnosing potato leaf diseases from images.
    • Tech Stack: TensorFlow, Keras, CNN.
  • 2022
    Diabetes Prediction (PyTorch ANN)
    • Built and deployed a PyTorch-based neural network that predicts diabetes risk using clinical features and serves results via Flask + Gunicorn.
    • Tech Stack: PyTorch, Flask, Gunicorn, Pandas.

Skills

  • Languages
    • Python (Expert), C/C++, Java, JavaScript, SQL, MATLAB
  • AI/ML Frameworks
    • PyTorch, TensorFlow, Keras, Scikit-learn, LangChain, Transformers, OpenCV
  • AI/ML Expertise
    • Generative AI (LLMs, RAG, Fine-tuning), NLP, Computer Vision, Deep Learning, Time Series Analysis, Prompt Engineering, Explainable AI (XAI), Data Mining
  • Tools & Platforms
    • Git, Docker, FastAPI, Flask, Django, CI/CD, MLOps, Pinecone, MongoDB, MySQL, SQLite

Competitions & Achievements

  • Hackathon Champion at Machine Hack: Global Ranking 539 out of 8,861.
  • Data Science Student Championship: Secured 7th position among 1,029 participants.
  • LLM Hackathon (Decoding Discourse - AI vs Human): Ranked 5th out of 227 participants.
  • Rental Bikes Volume Prediction Hackathon: Ranked 3rd.
  • News Category Prediction Hackathon: Ranked 7th.
  • Predicting House Prices in Bengaluru: Ranked 24th out of 2,885 participants with 87% accuracy.
  • Subscriber Prediction Talent Search Hackathon: Ranked 26th out of 5,045 participants.
  • Analytics Olympiad 2022: Ranked 82nd out of 1,029 participants.
  • Data Science Student Championship - South Zone: Ranked 73rd out of 554 participants.
  • Decoding Discourse - AI vs Human: Ranked 5th out of 293 participants.

Open Source Contributions

  • 2024
    OpenLLMetry PR
    • Resolved a bug where Python data classes passed as parameters were not being serialized and logged in workflows and tasks.
    • Implemented proper serialization support for dataclasses, ensuring they are correctly captured as inputs and outputs in observability logs.
    • Added automated tests to verify serialization behavior and prevent regressions.
  • 2024
    OpenLLMetry PR
    • Contributed to OpenLLMetry by fixing a TypeError in the OpenAI embeddings metrics handler caused by comparisons between NoneType and integers; implemented proper handling of None values with error logging.
    • Added automated tests to validate the fix and ensure the robustness of embeddings metrics processing.
    • Improved overall stability by preventing this error from impacting workflow execution.
  • 2024
    Pinecone Canopy Contribution
    • Contributed to Pinecone Canopy, a Retrieval-Augmented Generation (RAG) framework.

Certifications & Professional Development

  • 2024
    Understanding and Applying Text Embeddings
    DeepLearning.AI
    • A comprehensive short course on the end-to-end development of applications using text embeddings. Key topics included:
      • Fundamentals of creating, understanding, and visualizing embedding spaces.
      • Leveraging embeddings for practical applications like semantic search and retrieval.
      • Building a complete Q&A system (Retrieval-Augmented Generation) using Google's Vertex AI.