Role Overview
We are seeking a highly skilled AI Engineer – Generative AI & Cloud Infrastructure to design, build, and optimize large-scale LLM-based systems and cloud-native architectures. The ideal candidate will bring a strong background in Generative AI workflows (RAG, multi-agent systems), AWS cloud engineering, and secure, production-grade deployment of AI applications.
This is a hands-on engineering role that blends AI model orchestration, API development, and cloud infrastructure management, ensuring enterprise-grade scalability, security, and reliability.
Key Responsibilities
- Implement and manage LLM workflows such as Retrieval-Augmented Generation (RAG) and multi-agent pipelines.
- Orchestrate model interactions using frameworks like LangChain, LangGraph, or custom-built orchestration layers.
- Develop and maintain APIs for LLM endpoints, vector search, and tool-augmented responses.
- Integrate external tools (search engines, calculators, APIs) and manage agent memory (episodic and long-term).
- Build robust data ingestion pipelines and document loaders for diverse formats (PDF, HTML, slides, etc.).
- Implement efficient chunking and embedding strategies for retrieval optimization.
- Set up and maintain vector databases with metadata indexing and optimized search strategies.
- Ensure low-latency inference and high concurrency via caching, batching, and streaming.
- Monitor and optimize model performance for project-specific use cases.
- Use observability tools such as Langfuse, LangSmith, Arize, and PromptLayer to track LLM behavior and quality.
- Securely log prompts and responses, and monitor model quality and performance metrics.
- Implement security best practices, including authentication, authorization, rate limiting, and API usage tracking across services such as OpenAI, Bedrock, and API Gateway.
- Apply enterprise safety frameworks for responsible AI deployment.
- Design and manage scalable, secure, and highly available cloud architectures using AWS services (EC2, S3, VPC, RDS, Lambda, SageMaker, OpenSearch, Bedrock, DynamoDB).
- Write and maintain application code integrating AWS SDKs and APIs.
- Configure networking and scaling components such as ALBs, Auto Scaling Groups, and Route 53.
- Deploy and scale workloads on ECS/EKS, with full observability and cost optimization.
- Troubleshoot infrastructure issues, optimize cloud performance, and ensure compliance with security standards.
Required Skills & Experience
- 10+ years of experience in AI Engineering, MLOps, or Cloud Infrastructure roles.
- Strong programming skills in Python (preferred) or similar languages.
- Hands-on experience with LLM frameworks and GenAI orchestration tools.
- Deep expertise in AWS Cloud services and architecture design.
- Familiarity with observability tools for AI systems (Langfuse, LangSmith, etc.).
- Strong understanding of secure software development and enterprise-grade deployment.
- Proven ability to scale AI/ML workloads in production environments.