About the Role
We are looking for a Senior Machine Learning Engineer to lead the architectural evolution of our safety systems. You will move our ML stack from siloed, end-to-end models toward a unified Perception Platform Layer. Your mission is to build the robust infrastructure that translates raw sensor data into real-time, high-stakes decisions, ensuring our models perform reliably across both cloud and edge environments.
This is a remote position for candidates based in the US.
In this role, you will work on the following:
1. Platform Architecture & Unification
- Architect a Unified Perception Layer: Lead the transition from fragmented, task-specific models to a modular perception platform that supports reusable components and downstream safety applications.
- System Design: Design and implement real-time ML systems, from sensor ingestion and tracking to risk reasoning and actuation, ensuring clear interfaces and predictable system behavior.
- Hybrid Deployment: Orchestrate model integration across edge and cloud environments, managing versioning, rollouts, and mission-critical fallback mechanisms.
2. Performance & Reliability Engineering
- Latency Ownership: Own end-to-end latency and reliability for safety-critical pipelines. You will profile the stack, tune scheduling, and optimize messaging and backpressure handling across it.
- Observability & Feedback Loops: Build sophisticated monitoring for deployed models to detect drift, false positives/negatives, and latency regressions. You will "close the loop" to ensure production data informs the next iteration of training.
3. Rigorous Evaluation & Safety
- Safety Cases: Develop evaluation frameworks specifically for rare "long-tail" safety events. You will define metrics and build targeted test sets that form the basis for principled ship/no-ship decisions.
- Explainability: Partner with Applied Scientists to ensure research outputs are translated into production code that is not only performant but also debuggable and explainable.
4. Technical Leadership
- Strategic Influence: Shape the system abstractions early in the platform transition to minimize technical debt and maximize future scalability.
- Mentorship: Set the engineering standard for correctness and performance. You will mentor junior and mid-level engineers, fostering a culture of rigorous ML engineering.
Minimum requirements for the role:
- Experience: 6+ years of experience in ML Engineering, with a proven track record of shipping models to production (ideally in safety-critical domains like robotics, automotive, or industrial AI).
- Systems Mastery: Deep understanding of distributed systems, performance profiling, and computer vision.
- Infrastructure Fluency: Experience with Cloud ML workflows (AWS/GCP/Azure) and containerization, paired with an understanding of the constraints of edge hardware.
- Architectural Mindset: You don't just write code; you design systems. You understand the trade-offs between model complexity and operational reliability.
An ideal candidate also has:
- Ph.D. in Computer Science or a quantitative discipline (e.g., Applied Math, Physics, Statistics)
- Experience with containerization technologies (e.g., Docker, Kubernetes), continuous integration/continuous deployment (CI/CD) pipelines, and infrastructure-as-code (IaC) frameworks
- Familiarity with deploying and managing ML applications in cloud environments, as well as leveraging cloud-based services for data storage, processing, and inference
- Experience building end-to-end ML applications from scratch