About the Role
We are seeking a highly experienced Senior AI/ML Engineer to design, build, and scale enterprise-grade Generative AI solutions on the LightSpeed Enterprise platform. The ideal candidate will have deep hands-on experience with Large Language Models (LLMs), orchestration frameworks, and production deployment of AI systems. This role spans the full lifecycle, from model experimentation and fine-tuning to building scalable APIs and deploying robust, monitored systems in production. You will collaborate closely with cross-functional teams to architect intelligent systems powered by advanced retrieval techniques, multi-agent workflows, and modern ML infrastructure.
Responsibilities
- Design, develop, and deploy scalable applications using LLMs (Gemini, Claude, OpenAI models)
- Build and manage orchestration pipelines and multi-agent systems using LangChain, LlamaIndex, and LangGraph
- Develop and optimize Retrieval-Augmented Generation (RAG) pipelines using vector databases and PostgreSQL
- Fine-tune and evaluate ML/DL models using PyTorch and TensorFlow
- Architect end-to-end AI systems, including prompt engineering, evaluation, and monitoring frameworks
- Develop and maintain production-grade APIs and microservices for AI applications
- Containerize and deploy applications using Docker and OpenShift
- Implement monitoring, logging, and performance optimization for production AI systems
- Collaborate with product, engineering, and data teams to translate business requirements into AI solutions
- Stay up to date with advancements in Generative AI and recommend best practices
Requirements
Minimum Requirements
- 7–15 years of experience in software engineering, machine learning, or AI-related roles
- Strong hands-on experience with LLMs such as Gemini, Claude, and OpenAI models
- Proven experience with orchestration frameworks like LangChain, LlamaIndex, LangGraph, or similar
- Solid understanding of ML/DL concepts with practical experience in PyTorch and/or TensorFlow
- Experience building and deploying RAG-based systems using vector databases and PostgreSQL
- Expertise in designing and deploying scalable APIs and microservices
- Hands-on experience with Docker and with OpenShift or a similar container orchestration platform
- Strong programming skills in Python
- Experience working in production environments with monitoring and logging tools
- Excellent problem-solving and system design skills
Would Be a Plus
- Experience with multi-agent AI systems and autonomous workflows
- Knowledge of advanced retrieval techniques (hybrid search, reranking, semantic search)
- Familiarity with enterprise AI platforms like LightSpeed Enterprise
- Experience with cloud platforms (AWS, Azure, or GCP)
- Exposure to MLOps tools and CI/CD pipelines for ML systems
- Experience with real-time inference systems and streaming pipelines
- Understanding of AI safety, governance, and responsible AI practices
- Contributions to open-source AI/ML projects or research publications