Key Responsibilities
- Design and implement scalable AI systems for real-time inference and training
- Optimize model performance and deployment pipelines for low-latency applications
- Collaborate with cross-functional teams to integrate AI components into production systems
- Develop monitoring and logging frameworks for AI model health and performance
- Research and implement state-of-the-art model quantization and pruning techniques
- Ensure system reliability and scalability under high-throughput workloads

Requirements
- 3+ years of experience building production-grade AI systems
- Proficiency in Python and deep learning frameworks (PyTorch/TensorFlow)
- Experience with distributed computing and cloud infrastructure (AWS/GCP)
- Strong understanding of model optimization techniques such as quantization and pruning
- Familiarity with CI/CD pipelines and MLOps practices