About the Role
As a Senior Engineer, you’ll play a critical role in shaping both the product experience and the foundational infrastructure of Model Serving at Databricks. You will design and build systems that enable high-throughput, low-latency inference across CPU and GPU workloads, influence architectural direction, and collaborate closely across platform, product, infrastructure, and research teams to deliver a world-class serving platform.
The impact you will have:
- Design and implement core systems and APIs that power Databricks Model Serving, ensuring scalability, reliability, and operational excellence.
- Drive architectural decisions and trade-offs to optimize performance, throughput, autoscaling, and operational efficiency for CPU and GPU serving workloads.
- Contribute directly to key components across the serving infrastructure, from model container builds and deployment workflows to runtime systems such as routing, caching, observability, and intelligent autoscaling, to keep operations smooth and efficient at scale.
- Collaborate cross-functionally with product, platform, and research teams to translate customer needs into reliable and performant systems.
- Lead technical initiatives that improve latency, availability, and cost-effectiveness across both customer-facing and foundational serving layers.
- Establish best practices for code quality, testing, and operational readiness, and mentor other engineers through design reviews and technical guidance.
What we look for:
- 5+ years of experience building and operating large-scale distributed systems.
- Experience in model serving, inference systems, or related infrastructure (e.g., routing, scheduling, autoscaling, observability).
- Strong foundation in algorithms, data structures, and system design as applied to large-scale, low-latency serving systems.
- Proven ability to deliver technically complex, high-impact initiatives that create measurable customer or business value.
- Experience architecting large-scale, performance-sensitive CPU/GPU inference systems.
- Strong communication skills and ability to collaborate across teams in fast-moving environments.
- Customer-focused mindset with the ability to align implementation details with product goals.
- Passion for mentoring, growing engineers, and fostering technical excellence.