Key Responsibilities
- Drive innovation in the ROCm software stack for next-generation AI products across AMD’s portfolio, including Instinct, Radeon, Ryzen, and Embedded systems
- Lead hardware-software co-design efforts to maximize performance, efficiency, and scalability for diverse AI workloads
- Optimize large-scale AI models (LLMs, Diffusion, Multimodal, MoE) for out-of-the-box performance on AMD hardware
- Develop advanced tools for performance estimation, modeling, and automated reporting
- Collaborate with top customers and hyperscalers to deliver tailored architectural wins and software optimizations
- Mentor engineers and contribute to ROCm open-source initiatives
Requirements
- Deep expertise in the AI/ML software stack (compilers, kernels, runtime, libraries, frameworks)
- Strong background in GPU programming (ROCm, CUDA, OpenCL) and hardware-software co-design
- Experience with GPU/CPU architectures and performance optimization
- Undergraduate degree in Computer Science, Electrical Engineering, or a related field; advanced degree preferred
- Proven track record of technical leadership and cross-organizational collaboration