Portfolio of Aditya Singh Khichi, a Full Stack Engineer building data platforms and AI systems (React, Node, Go, AWS). View case studies, including Conqr AI, Human Archive and Thunder Forms.
Building data platforms and AI-powered systems at a YC W26 startup. I ship production-grade pipelines, RAG engines, and full-stack products — from React to PostgreSQL to AWS.
Projects
Thunder Forms — State-driven form builder with drag-and-drop composition, AI-assisted creation, and a deep analytics pipeline, shipped as a production SaaS with 10K+ visits. Existing form builders force you to choose: either a clean drag-and-drop UI with shallow analytics (Typeform), or deep analytics with no first-class form composition (PostHog). Builders shipping a quick survey or lead-gen form had to glue together 3+ tools to get session-aware metrics. Tech Stack: Next.js, TypeScript, SSE, PostgreSQL, NextAuth.js, Shadcn UI.
Fantastic Robo — High-throughput, multi-format ingestion pipeline with adaptive extraction, OCR, semantic chunking, and a production-grade LLM load balancer for resilient RAG service levels. RAG pipelines that work on PDFs fall apart the moment users upload DOCX, PPTX, XLSX, scanned images, or email exports. Each format needs a different extractor, a different chunking strategy, and a different OCR fallback. Most teams ship a PDF-only MVP and accumulate technical debt every time a new format is requested. Tech Stack: Docker, Sentry, Vector Embeddings, Mistral OCR, CI/CD, DigitalOcean.
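The per-format routing described above can be sketched as a small extractor registry with an OCR fallback. This is a minimal illustration, not the project's actual code: `EXTRACTORS`, `extract_plain`, `ocr_fallback`, and `extract` are hypothetical names, and the OCR step is a placeholder rather than a real Mistral OCR call.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical extractor signature: raw file bytes in, extracted text out.
Extractor = Callable[[bytes], str]


def extract_plain(data: bytes) -> str:
    """Stand-in extractor that decodes the bytes as UTF-8 text."""
    return data.decode("utf-8", errors="replace")


def ocr_fallback(data: bytes) -> str:
    """Placeholder for an OCR call; a real pipeline would send the bytes to an OCR service."""
    return "[ocr placeholder: %d bytes]" % len(data)


# Registry keyed by lowercase file suffix. Real DOCX, PPTX, or XLSX
# extractors would be registered here alongside the plain-text one.
EXTRACTORS: Dict[str, Extractor] = {
    ".txt": extract_plain,
    ".md": extract_plain,
}


def extract(filename: str, data: bytes) -> str:
    """Route a file to its extractor; fall back to OCR if none fits or it yields no text."""
    suffix = Path(filename).suffix.lower()
    extractor = EXTRACTORS.get(suffix)
    if extractor is not None:
        text = extractor(data)
        if text.strip():
            return text
    return ocr_fallback(data)
```

Adding a format then means registering one more function, which is one way to avoid special-casing each new upload type inside the pipeline itself.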
IOSD MAIT Website — Official website for IOSD MAIT, the largest technical society at Maharaja Agrasen Institute of Technology. Central hub for events (IMPULSE), member projects, and recruitment. IOSD MAIT didn't have a single canonical web presence. Information about events like IMPULSE, the projects members shipped, and how new students could get involved was scattered across WhatsApp threads, Notion pages, and Google Docs. Prospective members would ask 'what does IOSD actually do?' and there wasn't a single link to send them. Tech Stack: Next.js, TypeScript, Tailwind CSS.
Wave Linux — Minimal Linux distribution built from torvalds/linux + uutils/coreutils. Build script fetches the kernel, compiles BusyBox and ~100 Rust coreutils with musl static linking, and packages a bootable image. 'Build your own minimal Linux' tutorials skip the actually-hard parts: how do you pin a kernel version, fetch the source, configure it for x86_64-musl, and link a Rust-rewritten coreutils into a bootable image without inheriting a distro's assumptions? The knowledge is scattered across 17 stale blog posts. There wasn't a single repo that did it end-to-end. Tech Stack: Rust, Linux Kernel, BusyBox, musl, uutils, Shell.
RaghavOS — Bootloader written from scratch in C and x86 assembly. Boots from BIOS, transitions into a custom runtime, and serves as a forcing function for understanding what the OS layer actually does for you. Modern engineers spend their careers above 5+ layers of abstraction (frameworks, runtimes, kernels, libc) and rarely see the moment where the BIOS hands the CPU to your code. That's the moment everything else is built on top of, and treating it as a black box leaves a gap in the mental model of what 'the computer is doing right now' actually means. Tech Stack: C, x86 Assembly, BIOS, QEMU.
Never Remember — Personal password manager. NextAuth for OAuth, Upstash Redis for storage, search + copy-paste UX. 'You don't have to remember 235 passwords.' The average person has dozens of accounts with weak or recycled passwords. Every commercial password manager wants a subscription. I wanted a self-hostable, free-tier-friendly password manager I'd actually use myself, where the architecture was simple enough to audit in an afternoon. Tech Stack: Next.js, TypeScript, NextAuth.js, Upstash Redis, Tailwind CSS.
SkyCast — Full-stack weather + COVID dashboard. Next.js frontend consumes a Python backend on a DigitalOcean droplet. Location-aware via browser geolocation. Most weather dashboards call public APIs directly from the browser, which leaks API keys and ties the UX latency to the upstream provider's response time. I wanted a dashboard where the frontend never sees an upstream API key, and the backend can swap providers, batch requests, or cache responses without redeploying the frontend. Tech Stack: Next.js, TypeScript, Python, DigitalOcean, REST.
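The backend-side caching idea above can be illustrated with a small TTL cache wrapped around an upstream fetch function. This is a sketch under assumptions: `CachedWeatherProxy`, its `fetch` callable, and the injectable `clock` are hypothetical and do not reflect SkyCast's actual backend. The upstream API key would live inside whatever `fetch` does, so it never reaches the frontend.

```python
import time
from typing import Callable, Dict, Tuple


class CachedWeatherProxy:
    """Serve weather data through a TTL cache so the browser never calls the provider."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # hypothetical upstream call; holds the API key server-side
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def get(self, location: str) -> dict:
        """Return a cached response while it is fresh, otherwise refetch and store it."""
        now = self._clock()
        hit = self._cache.get(location)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        data = self._fetch(location)
        self._cache[location] = (now, data)
        return data
```

Because the provider is just a callable, swapping upstream services means passing a different `fetch` on the backend, with no change to what the frontend requests.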
Krishi Sahayak AI — Multilingual AI assistant for Indian smallholder farmers. Hyperlocal weather, image-based pest/disease diagnosis (CNN), market price forecasting, alternative credit scoring, parametric crop insurance. Voice-first for low-literacy users. Indian smallholder farmers face a stack of compounding problems: no timely hyperlocal weather forecasts, no quick pest/disease diagnosis, opaque mandi prices, financial exclusion from formal credit due to lack of land collateral, slow insurance payouts. Existing apps either solve one of these in isolation, or assume an English-literate user comfortable navigating a smartphone-first menu UI. Neither matches the actual user. Tech Stack: Next.js, TypeScript, Python, Gemini 1.5 Pro, Groq, CNN, PWA.
Encephalon Lab — MCP-aware AI agent playground. Chat with Gemini-backed agents that have plug-and-play tool access — Alpaca for trading, GitHub for repos, Slack/Notion for work. SSE streaming, model picker, MCP server registry. ChatGPT-style chat UIs don't naturally compose with the Model Context Protocol. You either hardcode tool integrations into the prompt scaffolding, or you spin up separate agents per tool. Neither matches how engineers actually want to use agents in practice: pick a model, pick which tools it has access to this session, then chat. Tech Stack: Next.js, TypeScript, LangChain, Gemini 2.5, MCP, Prisma.
Lexa AI — OpenAI variant of Encephalon Lab. Same MCP-aware agent shell, GPT-4-class models behind the chat. Built specifically to compare whether MCP agent UX feels different across model providers. Two big questions when picking a model for an MCP agent platform: does the model's native tool-calling matter more than the surrounding orchestration, and does swapping providers break user behavior in subtle ways? You can't answer either by reading benchmarks — you have to run the same UX on both and compare. Tech Stack: Next.js, TypeScript, LangChain, OpenAI, MCP, Prisma.
Human Archive Data Platform — Built Human Archive's (YC W26) enterprise data platform — TB-scale robotics dataset delivery with Cognito multi-tenant auth, S3 signed URLs, and recursive folder resolution across AWS. Human Archive (YC W26) needed an enterprise platform to deliver multimodal robotics datasets at terabyte scale. Customers wanted role-gated access to specific dataset slices, recursive S3 folder structure preserved end-to-end, and auth that supported multiple organizations without leaking data across tenants. Off-the-shelf data portals couldn't handle the storage scale or the multi-tenant boundary. Tech Stack: React, TanStack Router, Express, PostgreSQL, AWS.
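One way to picture recursive folder resolution is rebuilding a nested tree from the flat object keys that S3-style storage returns. The sketch below is illustrative only: `build_tree` and `list_files` are hypothetical helpers, the keys are made up, and it assumes no key is both a file and a folder prefix. Listing, auth, and URL signing are left out.

```python
from typing import List


def build_tree(keys: List[str]) -> dict:
    """Turn flat keys like 'runs/a/cam.mp4' into nested dicts; files map to None."""
    root: dict = {}
    for key in keys:
        parts = [p for p in key.split("/") if p]
        if not parts:
            continue
        node = root
        # Walk (or create) every intermediate folder.
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        if key.endswith("/"):
            # A trailing slash marks an (possibly empty) folder placeholder.
            node.setdefault(parts[-1], {})
        else:
            node[parts[-1]] = None
    return root


def list_files(tree: dict, prefix: str = "") -> List[str]:
    """Recursively collect full file paths under a tree, in sorted order."""
    files: List[str] = []
    for name, child in sorted(tree.items()):
        path = prefix + name
        if child is None:
            files.append(path)
        else:
            files.extend(list_files(child, path + "/"))
    return files
```

With the tree in hand, a delivery layer could resolve a selected folder to its files via `list_files` and then issue one signed URL per file, keeping the original folder layout intact on the customer side.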
Conqr AI Legal Chatbot — RAG-powered legal chatbot with end-to-end document pipeline — scan detection, OCR, chunking, and Go-powered parallel processing. Legal teams onboard with batches of contracts, briefs, and exhibits — typically 20-30 PDFs at once. The naive single-threaded pipeline took 5+ minutes per batch, which felt broken to a user who'd just dragged files in and was watching a spinner. Worse, mixed batches (some text-layer PDFs, some scans) had to be processed at the slower rate of the scanned ones because OCR ran inline. Tech Stack: Go, RAG, PDF.js, OCR, Vector Embeddings.
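The bottleneck described above, where OCR runs inline and slows every document in a mixed batch, can be avoided by routing scans and text-layer PDFs to separate worker pools. The production pipeline is described as Go-powered; the sketch below uses Python threads only to illustrate the routing idea. `process_batch`, `has_text_layer`, `extract_text`, and `run_ocr` are hypothetical names, and document names are assumed to be unique within a batch.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


def process_batch(
    docs: List[str],
    has_text_layer: Callable[[str], bool],
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
    text_workers: int = 8,
    ocr_workers: int = 2,
) -> Dict[str, str]:
    """Process a batch with separate pools so slow OCR jobs do not block fast text extraction."""
    with ThreadPoolExecutor(max_workers=text_workers) as text_pool, \
            ThreadPoolExecutor(max_workers=ocr_workers) as ocr_pool:
        futures: Dict[str, Future] = {}
        for doc in docs:
            # Scan detection decides which lane a document takes.
            if has_text_layer(doc):
                futures[doc] = text_pool.submit(extract_text, doc)
            else:
                futures[doc] = ocr_pool.submit(run_ocr, doc)
        # Collect results in submission order; result() re-raises any worker error.
        return {doc: fut.result() for doc, fut in futures.items()}
```

The pool sizes are arbitrary defaults for illustration; in practice they would be tuned to the OCR provider's rate limits and the host's CPU count.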
Bynd — AI News Intelligence — News aggregator and summarization pipeline tracking the major AI labs (OpenAI, Anthropic, DeepMind, Microsoft, Meta). Six-step Python pipeline running fully on local LLMs. Bynd needed to track news coverage of the major AI labs (OpenAI, Anthropic, DeepMind, Microsoft, Meta) at scale. The off-the-shelf options were expensive per-seat news intelligence platforms or hand-rolled Google Alerts that produce noise, not signal. They wanted structured, summarized output they could feed into downstream analysis without paying per-article fees forever. Tech Stack: Python, Ollama, trafilatura, SQLite, RSS.
Experience
Full Stack Engineer at Human Archive (YC W26) (Feb 2026 - Present) — Building the company's data platform for delivering TB-scale robotics datasets to enterprise customers.
Full Stack Engineer at Conqr AI (May 2025 - Jan 2026) — Built a legal chatbot using RAG and engineered an end-to-end document processing pipeline.
Open Source Contributor at Spacedrive (Aug 2023 - May 2025) — Implemented 9 key features across the Spacedrive open-source file manager.