About the Role
NVIDIA is looking for a Senior Software Engineer to join our ML Performance team. As a member of this team, you'll work with the latest NVIDIA GPUs and SW to deliver the best performance for AI/ML workloads. This includes end-to-end optimizations spanning applications (LLMs, image processing, Vision AI, etc.), foundational SW (DL frameworks, compilers, runtimes, OS), and HW. We are passionate about groundbreaking performance and driving world records on industry benchmarks and products. Come join our diverse and dynamic team!
What you'll be doing:
- Analyze performance bottlenecks and develop optimizations across the entire SW/HW stack for ML workloads on NVIDIA GPUs.
- Profile, debug, and optimize deep learning (DL) frameworks, libraries, and applications to achieve maximum performance.
- Collaborate with cross-functional teams (HW, Architecture, Driver, Compiler, DL Frameworks) to define and implement performance features and improvements.
- Design and implement new performance benchmarks and tools to evaluate and track ML performance.
- Contribute to the development of novel techniques and algorithms for ML performance optimization.
- Stay up to date with the latest advancements in ML, deep learning, and hardware architectures.
What we need to see:
- BS or MS degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
- Strong software engineering background with 5+ years of experience.
- Excellent programming skills in C++ and Python.
- Strong understanding of computer architecture, data structures, and algorithms.
- Experience with performance analysis, profiling, and debugging tools.
- Familiarity with machine learning frameworks (e.g., TensorFlow, PyTorch, JAX).
- Solid understanding of deep learning concepts and algorithms.
Ways to stand out from the crowd:
- Experience with large language models, image processing, or other specific ML domains.
- Familiarity with compiler technologies (e.g., LLVM, CUDA compilers).
- Experience with CUDA or other GPU programming models.
- Knowledge of distributed systems and cloud computing platforms.
- Contributions to open-source ML projects.
- Publications in top-tier ML or systems conferences.