About the Team

Razorpay's agentic products are scaling fast. Agent-powered onboarding, dashboard experiences, and automation workflows are live and growing, and the engineering teams building them are moving quickly. As we scale, we need shared infrastructure that lets every product team access the best model for each task without having to build routing, observability, or resilience logic themselves. We are building an AI Inference Platform: a centralized layer that handles intelligent model routing, cost optimization, and quality observability across all of Razorpay's agentic products. The goal is simple. Product teams declare what kind of task they need done. The platform takes care of picking the right model, managing fallbacks, and tracking cost and quality. Teams stay focused on shipping features.

About the Role

As Lead AI Engineer on the platform side, you build the inference orchestration layer that sits between our product teams and their model providers. Routing, fallbacks, cost tracking, A/B testing for model swaps, and observability are all yours. Your customers are internal engineering teams, and your job is to give them a single reliable interface to every model the org uses while keeping the complexity of that routing layer off their plates. You also own the observability and eval-in-production infrastructure: standardized tracing and cost dashboards across all agentic products, and the shadow-testing infrastructure that lets us validate model swaps safely before they reach production traffic.

What You’ll Build

Build and operate a unified model gateway that abstracts provider complexity for product teams. Teams work with a clean interface; the platform handles routing, provider selection, and fallback logic under the hood.
Design and implement intelligent routing that matches each request to the right model based on task complexity, latency requirements, and cost targets. Not every call needs the same model.
Build resilience into the platform so provider outages, rate limits, and latency spikes are handled transparently. Agentic workflows stay up regardless of what happens upstream.
Own the observability layer across all AI-powered products: cost per call, latency distributions, token usage, and quality signals. Give product teams and leadership a clear view of how AI is performing and what it costs.
Build the infrastructure for safe model transitions: run new models alongside production, compare outputs, and roll out changes gradually with automated quality checks at every stage.
Drive continuous cost efficiency through caching strategies, request optimization, and per-team spend attribution so the org can scale AI usage without costs growing linearly with traffic.

What We’re Looking For

5 to 8 years as a backend or platform engineer, with a track record of building API gateways, middleware, or developer platform services at scale. Strong in Go or Python.
Experience building high-availability, low-latency distributed systems: load balancing, circuit breakers, graceful degradation, retry logic, and observability using Prometheus, Grafana, OpenTelemetry, or equivalent.
Solid understanding of LLM APIs and token economics. You can design routing rules based on input/output token pricing, streaming vs. batch tradeoffs, and how prompt length affects both cost and latency.
You think in platform terms. You know the difference between building for end users and building for engineers, and you know that internal platform quality shows up in other teams’ velocity.
Familiarity with LLM orchestration and observability tooling: LiteLLM, Portkey, Langfuse, LangChain, or similar. You do not need to have used all of them, but you need to understand the landscape well enough to make good choices.
Experience with Kubernetes and distributed systems. GPU workload scheduling or ML serving infrastructure is a meaningful bonus.

Why This Role is Different

The platform you build becomes the backbone of every AI-powered product at Razorpay. Good infrastructure decisions here compound across every team and every workflow that ships on top of it.
You work on real scale from day one. The problems are concrete, the feedback loop is tight, and the impact of what you build shows up in production metrics quickly.
This role combines deep platform engineering with the emerging discipline of LLM infrastructure. It is a rare combination that puts you at the leading edge of how AI systems are built in production.

Lead AI Engineer
