Key Responsibilities
- Design and optimize scalable deployment pipelines for large language models (LLMs) and AI/ML workloads
- Implement performance testing frameworks using JMeter, LoadRunner, and Gatling to validate system scalability and reliability
- Develop and maintain observability stacks with Splunk, Kibana, Prometheus, and Grafana for real-time monitoring and debugging
- Collaborate with cross-functional teams to integrate AI/ML models into production environments
- Automate CI/CD workflows using Jenkins, GitHub Actions, and GitLab CI to enable reliable, repeatable deployments
- Optimize cloud infrastructure on AWS, Azure, or GCP for cost-efficiency and performance
Requirements
- 10+ years of experience in performance engineering, testing, and AI/ML deployment
- Hands-on expertise with LLM tooling (OpenAI APIs, LangChain, Hugging Face Transformers) and performance testing tools
- Strong programming skills in Python, Java, or Go
- Proven experience with major cloud platforms (AWS, Azure, or GCP) and observability tools
- Deep understanding of CI/CD pipelines and automation