Site Reliability Engineer (SRE)

Key Responsibilities

Design and implement full-stack observability by evaluating and improving monitoring and metrics solutions
Lead blameless incident response and post-mortems to enhance system reliability
Mentor engineers in logging, monitoring, and reliability best practices
Work with engineering leadership to define and track KPIs for platform reliability and performance
Deploy infrastructure updates using Terraform on AWS
Build proofs of concept for logging and metrics across frameworks and languages

Requirements

Bachelor’s degree required
Five years of experience as a Site Reliability Engineer or DevOps Engineer, or in a similar role
Strong experience with Infrastructure as Code, specifically Terraform
Hands-on experience managing cloud infrastructure in AWS
Knowledge of monitoring, logging, and observability tools
