Prasad Thete | AI & Data Science Portfolio

01. About Me

AI/ML Engineer specializing in Generative AI and Large Language Model (LLM) systems, with a B.E. in Artificial Intelligence & Data Science (SPPU, CGPA: 8.14 — First Class with Distinction).

I design and deploy production-grade Retrieval-Augmented Generation (RAG) architectures, fine-tuned LLM pipelines, and scalable AI systems built for real-world performance and reliability. My experience spans LLM fine-tuning (LoRA, QLoRA, 4-bit quantization), vector databases (Qdrant/FAISS), large-scale document processing, and end-to-end AI deployment.

I have hands-on expertise in:

• Architecting multi-stage RAG systems with intelligent retrieval, confidence handling, and fallback logic
• Optimizing LLM inference and GPU utilization for efficient large-model training and serving
• Designing scalable data ingestion and embedding pipelines for high-volume document datasets
• Building robust, containerized backend APIs for AI-powered production applications

02. Education & Experience

Education

B.E. in Artificial Intelligence & Data Science

MET Institute of Engineering, Nashik

2021 – 2025

CGPA: 8.14 / 10.00 — First Class with Distinction

Experience

AI/ML Engineer

HonchoMinds — Pune, Maharashtra

Full-time | On-site

Jan 2025 – Present

BharatNyay.ai: AI-Powered Legal Intelligence Platform (bharatnyay.ai)

• Engineered a production-grade RAG system for large-scale legal document search and contextual Q&A.
• Built automated data crawling, OCR extraction, and end-to-end data pipelines (cleaning, chunking, embeddings) for legal datasets (1950–2020).
• Implemented multi-stage retrieval strategies (semantic search, metadata filtering, re-ranking) using vector databases.
• Performed instruction tuning and advanced prompt engineering on OpenAI models for domain-aligned responses.
• Designed hallucination mitigation and confidence-based fallback mechanisms for reliable outputs.
• Developed backend services using FastAPI, integrating LLM inference and retrieval pipelines.
• Deployed and optimized GPU-based AI workloads on Linux VM environments.
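
As an illustration only (this is not the production BharatNyay.ai code), the multi-stage retrieval and confidence-based fallback described above can be sketched in plain Python. Everything here is an assumption made for the example: the `Doc` fields, the toy bag-of-words `embed` function standing in for a real embedding model and vector database, the term-coverage re-ranker, and the `min_confidence` threshold of 0.5.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    year: int  # metadata field used for filtering


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; stands in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def term_coverage(query: str, text: str) -> float:
    # Fraction of query terms found in the document; stands in for a re-ranker.
    q_terms = set(query.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & set(text.lower().split())) / len(q_terms)


def retrieve(query: str, docs: list[Doc], year_range: tuple[int, int],
             top_k: int = 2, min_confidence: float = 0.5) -> dict:
    lo, hi = year_range
    # Stage 1: metadata filtering by year.
    candidates = [d for d in docs if lo <= d.year <= hi]
    # Stage 2: similarity search, keeping a wider first-pass pool.
    q_vec = embed(query)
    pool = sorted(candidates, key=lambda d: cosine(q_vec, embed(d.text)),
                  reverse=True)[: top_k * 2]
    # Stage 3: re-rank the pool with a second scoring function.
    ranked = sorted(pool, key=lambda d: term_coverage(query, d.text),
                    reverse=True)[:top_k]
    confidence = term_coverage(query, ranked[0].text) if ranked else 0.0
    # Fallback: return no context rather than answer from weak matches.
    if confidence < min_confidence:
        return {"status": "fallback", "docs": [], "confidence": confidence}
    return {"status": "ok", "docs": ranked, "confidence": confidence}
```

A caller would pass the returned `docs` to the LLM only when `status` is `"ok"`, and otherwise reply with a "not enough information" message. Where the threshold sits is a tuning decision that depends on the scoring function actually used.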

BharatNyay RAG Automation & Ingestion Platform (Internal AI Operations System)

• Architected an end-to-end RAG automation platform to eliminate manual extraction, cleaning, and vector DB updates.
• Built automated data ingestion pipelines (PDF/JSON/CSV) with OCR processing, embedding generation, and vector database integration.
• Implemented staging-to-production vector DB synchronization pipelines for controlled data deployment.
• Developed a RAG monitoring and log management system for tracking vector database performance.
• Engineered automated government data crawling pipelines and dynamic dataset update workflows.
• Designed backend services and a dashboard using FastAPI, with user management, subscription tracking, token monitoring, and Linux VM infrastructure control.

Agentice: LLM-Powered ITSM Automation Platform (Ivanti Integration), Enterprise AI Automation System

• Built an enterprise LLM-based chatbot automation system for Ivanti ITSM, automating the Knowledge, Incident, Problem, Change, and Request modules via structured function calling.
• Cleaned and preprocessed custom function-calling datasets and formatted them for instruction tuning.
• Fine-tuned LLaMA 3 (8B) and Mistral 7B Instruct using LoRA-based PEFT techniques on custom function-calling data.
• Trained models on an NVIDIA A100 (80 GB) GPU with optimized fine-tuning workflows and quantization-aware configurations.
• Validated and merged LoRA adapters into standalone production-ready models.
• Deployed fine-tuned models using Ollama and vLLM, integrating inference APIs into enterprise backend systems.
• Engineered structured JSON function-calling enforcement and response validation mechanisms for reliable tool execution.
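
The response-validation idea in the last point can be sketched with the Python standard library. This is a minimal illustration and not the Agentice implementation: the `TOOLS` registry, the tool names `create_incident` and `search_knowledge`, and their argument schemas are hypothetical and do not reflect the Ivanti API.

```python
import json

# Hypothetical tool registry: tool name -> {argument name: expected Python type}.
TOOLS = {
    "create_incident": {"summary": str, "priority": int},
    "search_knowledge": {"query": str},
}


def validate_call(raw: str) -> tuple[bool, str]:
    """Return (True, "") for a well-formed call to a known tool, else (False, reason)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(call, dict):
        return False, "call must be a JSON object"

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return False, f"unknown tool: {name!r}"

    args = call.get("arguments")
    if not isinstance(args, dict):
        return False, "arguments must be a JSON object"

    schema = TOOLS[name]
    missing = sorted(set(schema) - set(args))
    if missing:
        return False, f"missing arguments: {missing}"
    extra = sorted(set(args) - set(schema))
    if extra:
        return False, f"unexpected arguments: {extra}"

    for key, expected in schema.items():
        value = args[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            return False, f"argument {key!r} must be {expected.__name__}"
    return True, ""
```

A backend would run a check like this on every model output before executing a tool, and on failure either re-prompt the model with the returned reason or decline the action.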

03. Projects

Patent Filed | Final Year Project | B.E. AI & DS

Eco Bin Tracker — Smart ML & IoT-Based Waste Segregation System

Design & Utility Patents Secured (College-Sponsored)

• Designed and developed an end-to-end IoT-enabled smart waste monitoring and segregation system using NodeMCU ESP8266 and ultrasonic sensors for real-time bin fill detection

• Built and trained a Random Forest-based ML model for automated wet/dry waste classification using labeled image datasets

• Collected and preprocessed disposal data, integrating sensor inputs to improve segregation accuracy

• Developed backend services using Django, enabling real-time tracking, alert management, and analytics dashboards

• Implemented route optimization logic to assist garbage vehicles with efficient collection of full bins

• Designed an incentive-based citizen engagement mechanism to promote responsible waste disposal in collaboration with Nashik Municipal Corporation

Random Forest | IoT | NodeMCU ESP8266 | Django | Ultrasonic Sensors | ML Classification | Route Optimization | Analytics

Ongoing | Industrial AI Quality Inspection | Nash Robotics, Nashik

Offline Edge AI — Copper Tip Defect Detection System

• Designed and proposed an AI-powered quality inspection system to replace manual copper tip inspection in robotic spot-welding lines

• Built a fully offline Edge AI application using FastAPI, PyTorch (ResNet18), OpenCV, and SQLite deployable on existing factory PCs

• Developed a real-time computer vision classification model (Good / Defective) for copper tip quality verification

• Integrated the system with PLC-controlled robotic workflows, enabling automated welding or tip-dressing decisions

• Implemented a continuous learning pipeline with background retraining based on verified production feedback

• Enabled secure LAN-based access and authentication for factory monitoring while ensuring fully offline industrial reliability

PyTorch | ResNet18 | FastAPI | OpenCV | Edge AI | SQLite | PLC Integration | Computer Vision

04. Technical Skills

LLM & Generative AI

Large Language Models (LLMs) | Generative AI System Design | Retrieval-Augmented Generation (RAG) | Agentic RAG Architectures | Fine-Tuning (LoRA, PEFT) | Instruction Tuning | Prompt Engineering & Context Engineering | Function Calling & Tool Integration | Embedding Models & Vector Search | Quantization (QLoRA / 4-bit / 8-bit) | vLLM | Hybrid Search | Re-ranking | RAG Evaluation | Hallucination Mitigation | LLM Evaluation & Optimization | Open-Source LLM Deployment (Ollama / vLLM) | System Architecture Design (LLM/RAG)

Hands-on Models Worked With

Mistral 7B Instruct | LLaMA 3 8B | Vision-Language Models (VLM) | GPT-based Models (API Integration)

Databases

PostgreSQL | SQL | MongoDB | Vector Databases (Qdrant, FAISS) | Redis | Embedding Indexing

Data Collection

Web Scraping | Selenium | BeautifulSoup | Requests | API-based Data Collection | Automated Crawling Systems

Deployment & DevOps

Docker | Kubernetes | Jenkins (CI/CD Pipelines) | Model Deployment & Monitoring | Containerized ML Applications | Production LLM Serving | GPU Optimization | Linux Server Management

Machine Learning & Deep Learning

ML Model Development | Supervised & Unsupervised Learning | Feature Engineering | Model Evaluation & Optimization | CNN Architectures | Transfer Learning | ResNet (ResNet-18, ResNet-50) | EfficientNet | Image Classification | Computer Vision | OpenCV | PyTorch | TensorFlow | Scikit-learn | NumPy | Pandas

Core AI Libraries & Frameworks

LangChain | Transformers (Hugging Face) | PEFT | BitsAndBytes | TRL | PyMuPDF

Computer Vision & Multimodal AI

Convolutional Neural Networks (CNN) | Transfer Learning (ResNet-18, ResNet-50, EfficientNet) | OpenCV | Image Classification Pipelines | Vision-Language Models (VLM) | Multimodal Data Extraction (Text + Image)

OCR Pipelines

Tesseract OCR | PaddleOCR | EasyOCR | Layout-aware PDF Extraction | Unstructured.io | Docling

Data Engineering & Pipelines

End-to-End AI/ML Pipelines | Data Extraction & Cleaning | Data Preprocessing & Normalization | Chunking & Tokenization | Embedding Generation | RAG Data Ingestion Pipelines | Vector Index Creation | JSONL Dataset Generation | Large-Scale PDF Processing | ETL Workflows

Backend & Web Frameworks

FastAPI | Django | Flask | REST API Development | Async APIs | Authentication & Authorization | SSH & Secure Server Deployment

05. Certifications & Training

Full-time Training

Machine Learning Engineer

Symbiosis Skills & Professional University (SSPU), Pune

Completed an intensive full-time Machine Learning Engineer Training Program under the Future-ready Skills Training Project, conducted under the aegis of Symbiosis Open Education Society.

Jun 15, 2024 – Aug 31, 2024 | Full-time

Internship

ML Engineer Intern

HonchoMinds — Pune, Maharashtra

Worked on fine-tuning pre-trained models and Large Language Models (LLMs) for domain-specific tasks. Performed data preprocessing and feature engineering. Built prototypes using current ML frameworks.

Feb 2025 – May 2025 · 4 mos · On-site

Fine-Tuning | LLMs | Data Preprocessing | Feature Engineering | ML Frameworks

Internship

IBM SkillsBuild Summer Internship

CSRBOX × IBM SkillsBuild

Completed the six-week IBM SkillsBuild Summer Internship Program on Data Analytics, held from June 24, 2024 to August 5, 2024.

Jun 24, 2024 – Aug 5, 2024 | Data Analytics

SQL Certification

Professional Certification

Python Libraries for Data Science

Professional Certification

06. Leadership & Initiatives

First President

Student Association of Artificial Intelligence & Data Science (AISA)

2024 – 2025 | MET Institute of Engineering, Nashik | B.E. AI & DS | Batch 2025

• First President & Founding Member of the AI & DS Student Association at MET IOE
• Led the AISA committee of 40+ students from the department, coordinating all departmental activities
• Organized technical workshops, AI seminars, and hackathons for the department
• Coordinated with faculty and industry mentors for knowledge-sharing sessions
• Significantly increased student participation in AI/ML activities across the college
• Represented the department in academic and technical events

07. Get In Touch

I design and deploy production-grade AI systems across Generative AI, RAG, and Industrial Automation.

If you’re interested in collaboration, technical discussions, or innovative AI solutions, feel free to connect.

theteprasad55@gmail.com | linkedin.com/in/prasadthete | github.com/prasad7588