Key Responsibilities
- Design and maintain scalable data pipelines and infrastructure
- Optimize system reliability, performance, and cost efficiency
- Implement monitoring, alerting, and incident response systems
- Collaborate with data teams to ensure data integrity and availability
- Automate deployment and operational workflows
- Troubleshoot and resolve complex system issues

Requirements
- 5+ years of experience in data engineering or site reliability engineering
- Proficiency in Python and SQL
- Experience with cloud platforms (AWS/GCP/Azure)
- Knowledge of containerization and orchestration (Docker, Kubernetes)
- Strong problem-solving and debugging skills