5+ years of full-stack experience, with 2+ years building LLM/AI applications in production.
Proven experience architecting RAG systems: chunking, embeddings, vector stores, retrieval strategies.
Strong proficiency in Python for AI/ML workloads and TypeScript/Node.js for full-stack development.
Deep understanding of distributed systems, state management, and workflow orchestration.
Strong REST API design skills: versioning, pagination, idempotency, streaming (SSE), and schema-first documentation (OpenAPI).
Hands-on experience serving LLMs in production with vLLM, LiteLLM, or comparable stacks (TGI, SGLang, Ray Serve).
Experience with production ML systems: monitoring, evaluation, versioning, and deployment.
Proficiency with AI-assisted development tools (Cursor, Claude Code, GitHub Copilot, or similar).