Key Responsibilities
- Design, build, and maintain enterprise-scale AI platforms with a focus on scalability and reliability
- Optimize platform performance for high-throughput, low-latency AI workloads
- Collaborate with cross-functional teams to integrate AI capabilities into core products
- Implement robust monitoring, logging, and alerting systems for platform health
- Drive best practices for AI infrastructure, including security and compliance
- Mentor engineers and contribute to architectural decisions for AI platforms
Requirements
- 6+ years of relevant work experience in AI or platform engineering
- 5+ years building and maintaining enterprise-scale platforms with proven scalability and reliability
- Deep expertise in AI infrastructure, including model serving, data pipelines, and compute optimization
- Proficiency in cloud platforms (AWS/GCP/Azure) and containerization technologies (Docker, Kubernetes)
- Strong problem-solving skills and experience with distributed systems