Key Responsibilities
- Design and maintain scalable cloud infrastructure on AWS
- Implement and optimize Kubernetes clusters for AI workloads
- Develop infrastructure-as-code using Terraform and related tools
- Establish monitoring and alerting systems for infrastructure health
- Automate deployment pipelines and infrastructure provisioning
- Collaborate with security teams to maintain compliance standards
Requirements
- 5+ years of experience in cloud infrastructure or DevOps engineering
- Expertise in Kubernetes, Docker, and container orchestration
- Strong AWS experience with EC2, EKS, and related services
- Proficiency in Terraform, Ansible, or similar IaC tools
- Experience with monitoring tools such as Prometheus and Grafana