As a Sr. Principal Reliability Engineer, you will lead the development and implementation of reliability engineering strategies and practices that keep complex systems highly available and performant. You will work closely with cross-functional teams to identify and mitigate reliability risks and to deliver solutions that improve system reliability. You will also develop and maintain reliability models and provide technical guidance and support to other teams.
Key Responsibilities
- Develop and implement reliability engineering strategies and practices to ensure high availability and performance of complex systems.
- Work closely with cross-functional teams to identify and mitigate potential reliability risks.
- Develop and maintain reliability models to predict and prevent system failures.
- Provide technical guidance and support to other teams on reliability engineering best practices.
- Collaborate with other teams to develop and implement solutions that improve system reliability.
Requirements
- 10+ years of experience in reliability engineering or a related field.
- Strong understanding of reliability engineering principles and practices.
- Experience with Python, Node.js, and AWS.
- Ability to work in a fast-paced environment and prioritize multiple tasks and projects.
- Strong communication and collaboration skills.