Site Reliability Engineer (SRE)

Department: Engineering
Job Type / Location: Remote
Experience Required: 5+ years

Key Responsibilities

  • Define and operationalize Service Level Objectives (SLOs) and error budgets in production environments
  • Design and implement reliability-focused monitoring, alerting, and incident response systems
  • Conduct chaos engineering experiments to identify and mitigate system vulnerabilities
  • Collaborate with development teams to improve system resilience and observability
  • Optimize infrastructure performance and cost-efficiency through automation and best practices
  • Participate in on-call rotations and post-mortem analysis to drive continuous improvement

Requirements

  • 3+ years of experience in site reliability engineering or production operations
  • Hands-on experience with SLOs, error budgets, and reliability metrics
  • Familiarity with chaos engineering tools and methodologies
  • Strong scripting and automation skills (Python, Bash, etc.)
  • Experience with cloud platforms (AWS, GCP, Azure) and container orchestration
