Key Responsibilities
- Design and optimize scalable deployment pipelines for large language models (LLMs) and AI/ML workloads
- Implement performance testing frameworks using JMeter, LoadRunner, and Gatling to validate system scalability and reliability
- Develop and maintain observability stacks with Splunk, Kibana, Prometheus, and Grafana for real-time monitoring and debugging
- Collaborate with cross-functional teams to integrate AI/ML models into production environments
- Automate CI/CD workflows using Jenkins, GitHub Actions, and GitLab CI to enable reliable, repeatable deployments
- Optimize cloud infrastructure on AWS, Azure, or GCP for cost-efficiency and performance
Requirements
- 10+ years of experience in performance engineering, testing, and AI/ML deployment
- Hands-on expertise with LLM tooling (OpenAI APIs, LangChain, Hugging Face Transformers) and performance testing tools
- Strong programming skills in Python, Java, or Go
- Proven experience with major cloud platforms (AWS, Azure, or GCP) and observability tools
- Deep understanding of CI/CD pipelines and automation