X-Scrappy · Xuno
Built Xuno's internal rate-intelligence platform to scrape competitor rates, compute buffered pricing, and publish auditable exchange rates to downstream systems.

I'm Aayush Shah. Don't read a bio — read the theorem I'm proving and the repo I push to at 2 AM.

I believe the best algorithms are just mathematics made executable — and I want to keep closing that gap.
I grew up obsessed with numbers — prime gaps, Euler's identity, the unreasonable effectiveness of calculus. Today I channel that obsession into ML research and engineering: studying where learned models hit number-theoretic limits, building LLM-powered systems, and asking why gradient descent works as well as it does. I live where proof sketches meet production code.
Currently a Jr. Software Engineer & ML Researcher at Xuno, and researching prime number structure at The British College (UWE). I've published in IJRASET on post-quantum cryptography, prime number theory, and AI & creativity. Always building, always proving.
Focus on AI, ML, systems, programming, web development, databases, IoT, and project management.
Foundation in programming, internet tech, multimedia, systems, math, and data science.
Major subjects: Mathematics, Physics, Chemistry, English, Nepali, and Social Studies.
Trained ML models on a 5M-integer dataset across 4 tasks — prime classification, twin prime detection, gap regression, and factor count prediction — establishing an Identifiability Limit and Structural Ceiling that validate cryptographic hardness assumptions.
Fine-tuned Mistral 7B Instruct on MentalChat16k and MeQuAD datasets for empathetic healthcare responses. Deployed as a web app via Ngrok with custom NLP training pipelines optimized for sensitive mental health and medical conversations.
A passion-driven platform celebrating the rich cultural heritage of the Mithila region — featuring temples, festivals, art, cuisine, folklore, and literature. Integrates an AI-powered chatbot to guide users through Mithila's history and traditions in an interactive, engaging way.
Awarded Bronze Honour for exceptional performance in the final round of the International Youth Math Challenge 2023.
Successfully completed CS50's Introduction to Artificial Intelligence with Python, offered by Harvard via edX.
Successfully completed CS50's Introduction to Programming with Python, a comprehensive course offered by Harvard via edX.
Completed the Mathematics for Machine Learning and Data Science specialization — covering linear algebra, calculus, probability, and statistics.
Passed the HackerRank role certification for Software Engineer, covering problem solving, SQL, and REST APIs.
Completed HackerRank's intermediate assessment in Data Structures (HashMaps, Stacks, Queues) and Algorithms (Optimal Solutions).
Mapping prime gaps, twin primes, and factor distributions through ML — probing where learned sieves hit the Identifiability Limit imposed by the Prime Number Theorem and why local prime realizations stay cryptographically hard.
Obsessed with what algorithms can and cannot learn — PAC bounds, VC dimension, bias-variance tradeoffs, and the combinatorics of decision trees. Building models that don't just predict but expose structure: from classification boundaries in high-dimensional space to transformer attention as a learned retrieval algorithm.
Variational calculus, measure theory, and Fréchet derivatives in infinite-dimensional function spaces. Tracing how the stationarity condition ∇L = 0 links Lagrange multipliers to the Euler–Lagrange equations, and asking why Adam often converges faster than vanilla SGD on non-convex loss surfaces.
Statistical manifolds, Fisher information metric, and natural-gradient descent — where differential geometry meets Kullback–Leibler divergence. Exploring why e^(iπ) + 1 = 0 is not coincidence: deep symmetry always underlies the best loss landscapes.
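A minimal sketch of the natural-gradient idea, assuming a one-parameter Bernoulli model chosen purely for illustration (it is not taken from any project listed here, and all function names are hypothetical). The loss is KL(p || q), and dividing its gradient by the Fisher information of q turns the update into a plain step toward p:

```python
import math


def kl_bernoulli(p, q):
    """KL(Bernoulli(p) || Bernoulli(q)) in nats, for p and q strictly in (0, 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def fisher_information(q):
    """Fisher information of Bernoulli(q) with respect to the parameter q."""
    return 1.0 / (q * (1.0 - q))


def loss_grad(p, q):
    """Ordinary gradient d/dq KL(p || q) = (q - p) / (q (1 - q))."""
    return (q - p) / (q * (1.0 - q))


def natural_gradient_step(p, q, lr):
    """One natural-gradient step: precondition the gradient by 1 / Fisher."""
    return q - lr * loss_grad(p, q) / fisher_information(q)
```

In this toy case the preconditioned gradient simplifies to q - p, so a step with lr = 1 lands on the target p (up to floating-point error) no matter how close q started to 0 or 1, where the ordinary gradient blows up.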
AI-powered learning ecosystem for Nepal's secondary education — OCR handwriting digitisation, LLM grading with step-by-step feedback, RAG chatbot, real-time class heatmap, adaptive micro-quizzes, and an IoT smart desk.
Hands-on Jupyter workbook covering blind/heuristic search (BFS, DFS, A*), supervised ML, neural networks, genetic algorithms, knowledge representation, and Bayesian reasoning.
Minimal 32-bit x86 OS from scratch — multiboot bootloader, C kernel, framebuffer text driver, interrupt-driven keyboard input, circular buffer, and an interactive terminal with 15 commands, tab completion, and command history.
Multi-property hotel platform with real-time room tracking, dynamic pricing engine (seasonal, occupancy, peak-season, discounts), 3NF database with 30+ relational tables, and an admin dashboard with analytics & audit logs.
Leading migration of legacy X-Scrappy system to a modular AI-powered architecture. Designing intelligent extraction agents using LLMs to automatically interpret dynamic bank HTML/JS pages, replacing hard-coded scraping logic and reducing maintenance time by 60%. Publishing computed buffered Xuno rates via Kafka and OAuth2 REST to internal services (Goat/Xonnector), with operator controls for monitoring, overrides, and ad-hoc scrapes.
Designed and trained ML models on a 5M-integer dataset to study primality, twin primes, prime gaps, and factor counts. Demonstrated that tree-based models act as learned arithmetic sieves with precision ceilings imposed by number-theoretic limits. Empirically established an Identifiability Limit and a Structural Ceiling, showing that static feature-based ML learns global density trends (PNT, Hardy–Ramanujan) but cannot predict local prime realizations, validating cryptographic hardness assumptions.
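A minimal sketch of why a sieve-like rule has a precision ceiling. It is a hand-written stand-in, not the trained models or the feature set from this study: it assumes the only features are residues modulo a few small primes, and the function names are hypothetical. The rule "predict prime if no small prime divides n" catches every prime above the largest small prime, but it also lets through composites built only from larger factors:

```python
def sieve(limit):
    """Return a bytearray where is_prime[n] == 1 iff n is prime, for 0 <= n <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Zero out every multiple of p starting at p * p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return is_prime


def residue_rule_precision(limit, small_primes):
    """Precision of the rule 'n is prime iff no prime in small_primes divides n'.

    Evaluated on max(small_primes) < n <= limit, where the rule has perfect
    recall, so precision is the only thing that can fall short.
    """
    is_prime = sieve(limit)
    predicted = true_positives = 0
    for n in range(max(small_primes) + 1, limit + 1):
        if all(n % p for p in small_primes):  # n survives the small-prime sieve
            predicted += 1
            true_positives += is_prime[n]
    return true_positives / predicted
```

For example, `residue_rule_precision(100, (2, 3, 5))` evaluates to 0.88: 25 integers survive the rule, but 49, 77, and 91 are composite. Adding 7 to the small primes removes those three, yet new false positives such as 121 appear as soon as the range grows.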
Built high-reliability web scrapers (X-Scrappy) for USD→INR and USD→NPR corridors, integrating HDFC, SBI, ICICI, Axis, Western Union, Wise, Remitly and more. Architected async scraping pipelines using Python (aiohttp, Playwright, BeautifulSoup) and PostgreSQL, improving throughput by 300%. Implemented data-validation, error-handling, and Slack alerting, reducing failed scrapes by 40%.
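A minimal sketch of the general shape of such a pipeline: bounded concurrency, retries with backoff, and a plausibility check before a rate is accepted. It is not the X-Scrappy code. To stay runnable without network access it uses only the standard-library `asyncio`, and `fake_fetch`, the provider names, the canned rates, and the validation bounds are all made-up placeholders; in a real pipeline the `fetch` callable would wrap an aiohttp or Playwright request.

```python
import asyncio


async def fetch_with_retry(fetch, source, retries=3, base_delay=0.01):
    """Await fetch(source); on failure, retry with exponential backoff."""
    for attempt in range(retries):
        try:
            return await fetch(source)
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: let the caller record the failure
            await asyncio.sleep(base_delay * 2 ** attempt)


def is_plausible(rate, low, high):
    """Reject non-numeric values and rates outside the expected band."""
    return isinstance(rate, (int, float)) and low <= rate <= high


async def scrape_all(fetch, sources, max_concurrency=5, bounds=(50.0, 150.0)):
    """Scrape all sources concurrently; return (valid rates, errors) as dicts."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(source):
        async with semaphore:  # cap the number of in-flight requests
            try:
                rate = await fetch_with_retry(fetch, source)
            except Exception as exc:
                return source, None, f"fetch failed: {exc}"
        if not is_plausible(rate, *bounds):
            return source, None, f"implausible rate: {rate!r}"
        return source, rate, None

    results = await asyncio.gather(*(worker(s) for s in sources))
    rates = {s: r for s, r, err in results if err is None}
    errors = {s: err for s, r, err in results if err is not None}
    return rates, errors


async def fake_fetch(source):
    """Stand-in for a real HTTP call; names and numbers are invented."""
    canned = {"bank_a": 83.1, "bank_b": 82.9, "bank_c": -1.0}
    if source == "bank_d":
        raise ConnectionError("timeout")
    return canned[source]
```

Running `asyncio.run(scrape_all(fake_fetch, ["bank_a", "bank_b", "bank_c", "bank_d"]))` returns the two valid rates in the first dict and one error each for `bank_c` (implausible value) and `bank_d` (fetch failure) in the second; the errors dict is the natural place to hang alerting.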
Reviewed 50+ video lectures, notebooks, and programming assignments during testing phase. Verified 30+ resource citations, reported 30+ issues via Git, and suggested 10+ additional labs — directly influencing course content improvements and learner experience.
Managed technical operations for a 2-month TED-Ed event with 200+ attendees, resolving all issues for a seamless experience. Collaborated with designers to produce 10+ promotional posters, boosting event visibility and participant engagement by 30%.
Built 5+ AI apps and APIs using Next.js, OpenAI, Pinecone, and Stripe, achieving 98% accuracy across 1000+ users. Led 4+ engineering fellows from design to deployment using MVC design patterns. Mentored by Amazon, Bloomberg, and Capital One engineers on Agile, CI/CD, and microservices.
Designed 15+ posters and led marketing and social media campaigns, significantly enhancing national visibility. Increased attendee engagement by 20% and made mathematics accessible to youth across diverse geographic and socioeconomic backgrounds.
A living playground — equations I love, a tiny neural net learning in your browser, and a terminal pretending to do real work.
aayush@manifold:~/research$ python train.py --model aether-1.3b --opt natural-grad
Learning Prime Number Structure with ML — classification, regression & multi-class factor analysis on a 5M-integer dataset.
Tree-based models act as learned arithmetic sieves — precision-capped by number-theoretic limits, validating cryptographic hardness assumptions.
Identifiability Limit & Structural Ceiling: ML learns the 'climate' of the number line (density trends) but can't predict the 'weather' (specific primes).
Open to research collaborations, ML & software engineering projects, internships, and open-source work. Whether it's a paper, a product, or a problem worth solving — reach out. I read every message and usually reply within a week.