We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for the reliability and scalability of our technical systems, working closely with our engineering team to design, implement, and operate highly available systems.
Key Responsibilities
- Design, implement, and operate scalable and highly available systems
- Collaborate with the engineering team to identify and prioritize technical projects
- Develop and maintain monitoring and alerting to ensure system reliability
- Work with cross-functional teams to resolve production issues and implement changes
- Contribute to the development of our SRE team's best practices and standards
Requirements
- 5+ years of experience in SRE or a related field
- Strong understanding of system design, scalability, and reliability
- Proficiency in Python and Node.js, and experience with AWS
- Experience with machine learning and data analysis
- Excellent communication and collaboration skills