Available for AI/ML Engineering roles

Arsalan Shaikh

I ship production LLM and ML systems end-to-end. Currently at Adastraa AI, building an Amazon Ads LLM copilot (Azure OpenAI + vector memory + function calling) and a real-time XGBoost bidding engine processing 50K-100K events/day. Focused on cost-aware AI engineering, not weekend prototypes.

where i work now

Current Role

Machine Learning Engineer

Adastraa AI
Dec 2025 — Present
Lead AI/ML engineer at an ad-tech startup. Owned both systems below end-to-end: architecture, build, deployment, and production monitoring.
PROJECT 01

Ads AI Copilot — Streaming LLM with Vector Memory & Function-Calling

  • Built an Amazon Ads copilot on Azure OpenAI (gpt-4o-mini + o4-mini reasoning), Express, and MongoDB Atlas for natural-language querying of live campaign data.
  • Designed a 3-tier memory system (Redis for short-term context, MongoDB for episodic memory with a 30-day TTL, and Atlas Vector Search with deduplication at 0.92 cosine similarity) that bounds per-user memory to ~200 facts regardless of usage.
  • Added intent-based model routing (gpt-4o-mini vs o4-mini) and 7 LLM-callable MongoDB tools via function calling, cutting per-turn token usage from ~2,264 to ~800 tokens (roughly 65% lower).
  • Implemented SSE streaming, prompt-injection defense (17 input patterns + output sanitizer), and index-level per-user isolation via $vectorSearch pre-filters.
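
The dedup-and-bound behavior of the vector memory tier can be sketched roughly as below. This is a minimal in-memory illustration under stated assumptions, not the production code: the names `FactStore`, `add`, and `cosine_similarity` are hypothetical, and oldest-first eviction is an assumed policy. Only the 0.92 threshold and the ~200-fact bound come from the description above. In the real system the similarity lookup would go through Atlas Vector Search rather than a Python loop.

```python
import math
from collections import deque

SIMILARITY_THRESHOLD = 0.92  # dedup threshold taken from the description above
MAX_FACTS = 200              # approximate per-user bound taken from the description above


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class FactStore:
    """Hypothetical per-user store: skips near-duplicates, drops the oldest fact when full."""

    def __init__(self, threshold=SIMILARITY_THRESHOLD, max_facts=MAX_FACTS):
        self.threshold = threshold
        # deque(maxlen=...) discards the oldest entry on overflow (assumed eviction policy).
        self.facts = deque(maxlen=max_facts)

    def add(self, text, embedding):
        """Store the fact unless an existing one is at or above the threshold.

        Returns True if the fact was stored, False if it was treated as a duplicate.
        """
        for _, existing in self.facts:
            if cosine_similarity(embedding, existing) >= self.threshold:
                return False
        self.facts.append((text, embedding))
        return True
```

Because duplicates are rejected before insertion and the container has a fixed capacity, the number of stored facts stays bounded no matter how many turns a user sends.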
PROJECT 02

Autopilot — Real-time Bidding Engine for Amazon Sponsored Products

  • Replaced rule-based bidding with an XGBoost model that predicts daily base bids and hourly adjustments, retrained daily on 90 days of Amazon Ads reports and live Amazon Marketing Stream (AMS) events (50K-100K/day).
  • Built an anomaly-detection guardrail to reject outlier bid predictions before live execution, preventing runaway ad spend.
  • Deployed on AWS (EC2, Amplify) with automated daily retraining; live in production on client campaigns.
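
A guardrail of the kind described above could look something like the following. It is only a sketch under stated assumptions: the median/MAD check, the hard bid cap, the function names `is_bid_safe` and `apply_guardrail`, and every default value are hypothetical choices for illustration, not the method used in production.

```python
from statistics import median


def is_bid_safe(predicted_bid, recent_bids, max_bid=5.0, mad_multiplier=5.0, min_spread=0.01):
    """Return True if a predicted bid looks plausible given recent bids.

    All thresholds here are illustrative assumptions.
    """
    # Hard bounds first: reject non-positive, NaN, or above-cap bids outright.
    if not (0 < predicted_bid <= max_bid):
        return False
    # With very little history, rely on the hard bounds only.
    if len(recent_bids) < 3:
        return True
    center = median(recent_bids)
    # Median absolute deviation: a spread estimate that tolerates a few odd values.
    mad = median([abs(b - center) for b in recent_bids])
    spread = max(mad, min_spread)  # avoid a zero-width band when history is constant
    return abs(predicted_bid - center) <= mad_multiplier * spread


def apply_guardrail(predicted_bid, recent_bids, fallback_bid):
    """Use the model's bid when it passes the check, otherwise the fallback bid."""
    if is_bid_safe(predicted_bid, recent_bids):
        return predicted_bid
    return fallback_bid
```

For example, with recent bids around 1.00, a prediction of 1.20 would pass, while a prediction of 4.00 would be replaced by the fallback before anything reaches live execution.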
Python, Azure OpenAI, MongoDB Atlas Vector Search, Redis, XGBoost, Node.js, Express, AWS (EC2, Amplify), SSE Streaming
where i've worked

Professional Experience

Oct 2025 — Dec 2025
Gen AI Engineer (Remote)

Journalyst

  • Built a real-time voice-driven AI coach using Groq Llama 3.3, Whisper ASR, and PlayAI TTS, with a Pinecone RAG pipeline grounded on user session history.
  • Designed a custom Voice Activity Detection (VAD) pipeline for clean speech segmentation across variable speaking patterns.
  • Implemented intent-based prompting that routed user queries through different response strategies based on speech and behavior signals.
  • Shipped the full stack: Flask backend + React frontend with low-latency audio streaming.
Sep 2025 — Oct 2025
Software Engineer Intern (Remote)

Dictation Daddy

  • Fine-tuned OpenAI Whisper Large for domain-specific transcription, improving accuracy on production audio.
  • Optimized the inference pipeline, reducing latency by 40% (1 s → 600 ms) for near-real-time response.
  • Built REST APIs with monitoring, logging, and fault tolerance for reliable integration with the main product.
  • Integrated Gemini 1.5 for adaptive text formatting (formal / informal) based on transcript context.
recognition

Key Achievements

Mumbai Hacks 2025 — Finalist

Selected as a finalist from a pool of over 15,000 participants across India.

Led the team as technical decision-maker, owning architecture and problem-solving strategy across all phases.

Delivered the final pitch presentation, demonstrating a working AI solution to judges.

academic background

Academic Background

B.Tech in AI & Data Science

MIT Aurangabad, Maharashtra

2021 — 2025
CGPA: 7.5 / 10
technical stack

Technical Skills

Generative AI / LLMs 01

GPT-4o-mini, o4-mini reasoning, Llama 3.3, Whisper ASR, PlayAI TTS, Gemini 1.5, Function Calling, RAG, Fine-tuning, Prompt Engineering, SSE Streaming, Voice AI / VAD

Vector Search & Retrieval 02

Pinecone, MongoDB Atlas Vector Search, FAISS, BM25, text-embedding-3-large, Cosine Similarity, Semantic Search

Machine Learning 03

XGBoost, PyTorch, TensorFlow, Scikit-learn, HuggingFace Transformers, BERT, Pandas, NumPy, Anomaly Detection, Time Series, Feature Engineering

Backend & Languages 04

Python, JavaScript, TypeScript, FastAPI, Flask, Node.js, Express, REST APIs, React

Databases & Caching 05

MongoDB, Redis, MySQL, SQL

Cloud & MLOps 06

AWS (EC2, S3, Amplify), Azure OpenAI, Docker, MLflow, DVC, CI/CD, Git / GitHub
side projects

AI Projects

Gen AI
Hybrid Chatbot (KB + LLM)

A hybrid chatbot combining a structured knowledge base with LLM capabilities. Routes each query, based on intent, either to a fast retrieval-based answer or to a response generated by Llama 3.1 (via Groq).

React, Flask, Groq Llama 3.1, Python
NLP
Aimers — AI Interview Platform

BERT-based evaluation system for assessing candidate interview responses with sentiment analysis, answer relevance scoring, and feedback generation.

Python, BERT, FastAPI, TF-IDF
ML
VeriScore AI — Loan Risk Prediction

End-to-end loan risk prediction system using XGBoost, deployed with FastAPI and a custom UI. Versioned with DVC and tracked via MLflow for full pipeline reproducibility.

XGBoost, FastAPI, MLflow, DVC
CV
Iris Tumor Detection (CNN)

CNN model for medical image classification with an image augmentation pipeline and hyperparameter tuning. Achieved 92% accuracy on held-out validation data.

CNN, TensorFlow, OpenCV, Medical AI
NLP + ML
Movie Recommendation System

Content-based movie recommender using NLP and cosine similarity. Integrated with the IMDb API for live movie metadata and deployed via Streamlit.

NLP, Cosine Similarity, IMDb API, Streamlit
download

My Resume

The Full Engineering Story

Download my latest resume for a complete view of my work history, technical depth, and projects. Updated with my current Adastraa AI work.

get in touch

Let's Talk

Build something with AI?

Open to AI/ML Engineer roles, freelance LLM work, and collaborations on production AI systems. Drop a message or reach out directly.