NVIDIA

Senior Staff Engineer, Platform & Infrastructure

Department
Engineering
Job Type / Location
Santa Clara
Experience Required
5+ years

About the Role

NVIDIA is seeking a Senior Staff Engineer for our Platform & Infrastructure team. In this role, you will design, build, and maintain our next-generation cloud infrastructure and platform services. You will work on highly scalable, reliable, and performant systems that support a wide range of NVIDIA's products and services.

What You'll Be Doing

  • Design, develop, and maintain highly scalable, reliable, and secure cloud infrastructure and platform services (IaaS, PaaS, SaaS).
  • Lead the adoption and implementation of cloud-native technologies such as Kubernetes, Docker, and serverless architectures across various cloud providers (AWS, Azure, GCP).
  • Develop and implement automation tools and frameworks for infrastructure provisioning (e.g., Terraform), configuration management (e.g., Ansible), and application deployment.
  • Optimize system performance, scalability, and cost-efficiency through continuous monitoring, analysis, and tuning.
  • Collaborate closely with product and engineering teams to understand their requirements and provide robust infrastructure solutions.
  • Establish and promote best practices for security, reliability, and operational excellence.
  • Mentor junior engineers and contribute to the overall growth and technical excellence of the team.
  • Troubleshoot complex issues across the entire stack, from infrastructure to application level.
  • Participate in the on-call rotation to ensure the continuous availability and performance of critical systems.

What We Need to See

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
  • 10+ years of experience in software development, with at least 5 years focused on platform engineering, infrastructure development, or site reliability engineering roles.
  • Deep expertise in at least one major cloud platform (AWS, Azure, or GCP) and familiarity with others.
  • Proficiency in containerization and orchestration technologies (Docker, Kubernetes).
  • Extensive experience with Infrastructure as Code (Terraform, CloudFormation) and configuration management tools (Ansible, Chef, Puppet).
  • Strong programming skills in one or more languages: Python, Go, Rust, C++, Java, JavaScript.
  • Solid understanding of distributed systems, microservices architectures, and their challenges.
  • Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, GitHub Actions).
  • Familiarity with monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack).
  • Excellent problem-solving skills and a strong ability to diagnose and resolve complex technical issues.
  • Strong communication and collaboration skills, with the ability to work effectively in a fast-paced, dynamic environment.
  • Proven leadership experience, including mentoring engineers and leading technical projects.

Ways to Stand Out From the Crowd

  • Experience with database technologies (SQL and NoSQL).
  • Knowledge of networking concepts and security best practices in cloud environments.
  • Contributions to open-source projects.
  • Experience with large-scale data processing or machine learning infrastructure.
