Responsibilities

Collaborate with Research teams to understand new technologies, then adapt and integrate them into the codebase.
Develop and implement systems that support the lifecycle of machine learning models, especially foundation models, including data preprocessing, pre-training, post-training, and evaluation.
Participate in or lead design reviews with peers and stakeholders to decide among available technologies.
Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.
Triage product or system issues, then debug, track, and resolve them by analyzing their sources and their impact on hardware, network, or service operations and quality.
Contribute to research papers and represent MBZUAI at industry conferences and events, showcasing the institution’s cutting-edge HPC and deep learning capabilities and establishing MBZUAI as a global leader in AI research and innovation.
Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives.

Software Engineer, HPC / Deep Learning