
Databricks

Senior Engineering Manager, Model Serving

Department
Engineering
Job Type / Location
San Francisco
Experience Required
5+ years

About the Role

As a Senior Engineering Manager, you will lead the team that owns both the product experience and the foundational infrastructure of Model Serving. You will shape customer-facing capabilities while designing for scalability, extensibility, and performance across CPU and GPU inference, and you will collaborate closely with the platform, product, infrastructure, and research organizations.

The impact you will have:

  • Lead, mentor, and grow a high-performing engineering team responsible for both the customer-facing Model Serving product and its foundational infrastructure — covering runtime, APIs, scaling, reliability, and integrations.
  • Define and own the product and technical roadmap for Model Serving, balancing customer experience, functionality, and foundational investments across deployment, inference, monitoring, and scaling.
  • Collaborate closely with product, research, platform, and infrastructure teams to drive end-to-end delivery — from ideation and prioritization to launch and operation.
  • Ensure Model Serving meets stringent SLAs, SLOs, and performance and reliability goals, continuously improving operational efficiency and customer experience.
  • Drive architectural decisions and product design around latency, throughput, autoscaling, GPU/CPU placement, and cost optimization.
  • Advocate for customer needs through direct engagement, ensuring engineering decisions translate to clear product impact.
  • Promote best practices in code quality, testing, observability, and operational readiness.
  • Foster a culture of excellence, inclusion, and continuous improvement across the team.
  • Partner with recruiting to attract, hire, and develop top-tier engineering talent.

What we look for:

  • 5+ years of experience in technical leadership or management.
  • Proven track record building and operating large-scale distributed systems, preferably real-time or low-latency APIs.
  • Deep understanding of real-time serving systems.
  • Experience driving architectural design and operational excellence for production systems with measurable SLAs and SLOs.
  • Familiarity with CPU/GPU performance optimization, concurrency, caching, and scalability concepts.
  • Excellent collaboration and communication skills across engineering, product, and research organizations.
  • Ability to lead teams through ambiguity and deliver complex, cross-functional projects.
  • BS in Computer Science (Master's or PhD preferred).
