About the Role
At NVIDIA, we pride ourselves on data-driven decision-making, and the data science platform team is at the heart of this initiative. We are looking for an excellent Sr. ML Platform Engineer with expertise in AI, MLOps, cloud computing, and GPU acceleration! Our platform serves as the basis for advanced real-time data analytics, streaming, data lakes, and sophisticated ML/AI training with offline and online inferencing for NVIDIA's cloud services, such as cloud gaming, cloud deep learning, autonomous vehicles, and Omniverse. As an ML Platform Engineer, you'll design and build enterprise-level AI solutions using groundbreaking NVIDIA technology. You'll work with internal engineering teams to deploy and operationalize AI at scale by driving adoption of end-to-end Machine Learning and Deep Learning solutions in the cloud!
What You'll Be Doing
- Building and deploying AI/ML solutions at scale using NVIDIA's AI software on cloud-based GPU platforms
- Using your skills in AI, MLOps, ML engineering, DevOps, Kubernetes, and orchestration to deploy serverless solutions
- Creating microservices for task-specific AI cloud services
- Improving service reliability and observability, and developing UIs and APIs to improve the user experience
What We Need To See
- 5+ years of foundational expertise in Engineering, Computer Science, Data Science, or a related field
- BS or MS in Engineering, Mathematics, Physics, Computer Science, or equivalent experience
- Basic understanding of ML/DL training and inferencing concepts
- Established track record working with GPU-accelerated AI/MLOps solutions in cloud computing environments, including AWS, GCP, and Azure
- Experience with virtualization and cluster management tools, including Docker containers and Kubernetes
- Strong analytical and problem-solving skills
- Ability to multitask efficiently in a wide-ranging environment
- Clear written and oral communication skills with a strong desire to share knowledge with clients, partners, and co-workers
Ways To Stand Out From The Crowd
- Strong coding and debugging skills, including experience with Python, Java, Go
- Proven expertise through projects or Open Source contributions in cloud-based GPU workloads, Kubernetes, or other related areas
- Experience with AI frameworks and tools on GPUs
- Background with serverless computing