About Us

Cloud202 Limited is a leading technology consulting company dedicated to helping businesses transform and innovate through cutting-edge technology solutions. We specialize in cloud migration, AI/ML, and application development, providing our clients with the expertise they need to stay ahead in a rapidly evolving digital landscape.

Position Overview

We are seeking an innovative AI Engineer to lead the development and implementation of enterprise-grade agentic AI solutions. This role requires deep expertise in the Gen-AI ecosystem, including Amazon Bedrock, Amazon Bedrock AgentCore, SageMaker AI, and emerging AI agent frameworks. The ideal candidate will drive enterprise AI transformation initiatives and build next-generation intelligent applications using cutting-edge agentic platforms and protocols.

Required Qualifications

Experience

3+ years of hands-on experience with AWS cloud services and machine learning infrastructure
2+ years of experience working specifically with generative AI, large language models (LLMs), and foundation models
Proven track record of building and deploying production-scale AI/ML applications on AWS

Certifications

Preferred: AWS Certified AI Practitioner or AWS Certified Machine Learning - Specialty

Core Technical Skills

Amazon Bedrock AgentCore Platform (Critical)

AgentCore Runtime: Deploy and operate AI agents securely at scale with serverless infrastructure, session isolation, and support for execution windows of up to 8 hours
AgentCore Memory: Implement intelligent session and long-term memory with episodic learning capabilities for context-aware agent interactions
AgentCore Gateway: Build secure, centralized access to tools and APIs with minimal code transformation
AgentCore Identity: Implement seamless agent authentication across AWS services and third-party applications (Slack, Zoom, GitHub, Salesforce) using OAuth, Okta, Microsoft Entra ID, or Amazon Cognito
AgentCore Tools: Utilize Code Interpreter for secure code execution and Browser Tool for enterprise-grade web automation within managed sandbox environments
AgentCore Observability: Implement end-to-end tracing, debugging, and monitoring through unified CloudWatch dashboards with OpenTelemetry (OTEL) compatibility
AgentCore Policy: Set fine-grained boundaries on agent actions with real-time deterministic controls
AgentCore Evaluations: Continuously assess agent quality and behavior for production readiness

Gen-AI Services & Foundation Models

Amazon Bedrock: Comprehensive experience with foundation model access, fine-tuning, and deployment
SageMaker AI: Model hosting, endpoints, auto-scaling, A/B testing, and deployment pipelines
Amazon Q Developer: AI-powered development automation and code transformation capabilities
Foundation Models: Hands-on experience with Claude (Anthropic), Llama (Meta), GPT models (OpenAI), Mistral, and Amazon Nova models

AI Agents Development & Frameworks

Strands Agents SDK: Build production-ready AI agents with a model-driven approach, supporting single agents, multi-agent systems, and swarm architectures
Framework Expertise: Experience with CrewAI, LangGraph, LlamaIndex, Google ADK, OpenAI Agents SDK, or custom agent frameworks
Multi-Agent Orchestration: Design complex workflows with hierarchical delegation, agents-as-tools patterns, and dynamic capability discovery
Agentic Workflows: Build autonomous agents that reason, plan, use tools, and maintain context across long-running tasks
Tool Integration: Develop custom tools using Python decorators and integrate external APIs and services

Agent Protocols & Interoperability (Essential)

Model Context Protocol (MCP): Implement MCP servers and clients to provide standardized context and tool access to AI agents. Deploy MCP servers in AgentCore Runtime with OAuth authentication
Agent-to-Agent (A2A) Protocol: Build inter-agent communication systems using the A2A protocol for peer-to-peer agent collaboration, capability negotiation, and task coordination
Agent Discovery: Implement agent cards and capability manifests for dynamic agent discovery and routing
Protocol Integration: Deploy agents supporting both MCP and A2A protocols for maximum interoperability across enterprise systems

Advanced Technical Skills

Vector Databases: Amazon OpenSearch, Pinecone, or similar for RAG implementations
Programming: Expert-level Python and JavaScript/TypeScript, with focus on AI/ML libraries and async programming
APIs & Integration: RESTful APIs, GraphQL, JSON-RPC 2.0, Server-Sent Events (SSE), real-time streaming, webhook integration
Prompt Engineering: Advanced prompt flows, few-shot learning, chain-of-thought reasoning, and structured output generation
Knowledge Bases: RAG implementation with enterprise data integration and semantic search
Guardrails & Safety: Bedrock Guardrails, content filtering, bias detection, and responsible AI practices
Custom Model Fine-tuning: Adapting foundation models for domain-specific use cases

Advanced GenAI Applications

Retrieval-Augmented Generation (RAG): Enterprise search, document Q&A, knowledge management
Content Generation: Text, image, code, and multimedia content creation
Conversational AI: Chatbots, virtual assistants, customer service automation with memory retention
Code Generation & Analysis: Automated code review, documentation, refactoring, and software modernization
Data Analysis & Insights: Natural language to SQL, automated reporting, business intelligence

Key Responsibilities

Solution Architecture & Design

Design end-to-end generative AI solutions using Amazon Bedrock AgentCore as the primary agentic platform
Architect scalable, cost-effective AI pipelines leveraging AgentCore Runtime for serverless deployment
Implement MCP and A2A protocols for agent interoperability and tool integration
Design multi-agent architectures with proper orchestration, memory management, and observability
Create technical documentation and best practices for AgentCore implementations

Development & Implementation

Build production-ready agentic applications using Amazon Bedrock AgentCore services (Runtime, Memory, Gateway, Identity, Observability)
Develop AI agents using the Strands Agents SDK and other framework-agnostic approaches
Implement MCP servers for tool and data access across enterprise systems
Deploy A2A-compliant agents for cross-platform agent collaboration
Implement RAG systems with vector databases and AgentCore Gateway for secure data access
Create automated workflows for model deployment, monitoring, and evaluation
Integrate AI capabilities into existing enterprise applications with proper authentication and governance

Model & Agent Management

Evaluate and select appropriate foundation models for specific use cases
Implement AgentCore Policy for fine-grained control over agent actions and permissions
Use AgentCore Evaluations for continuous quality assessment and optimization
Optimize agent performance, cost, and latency using AgentCore Observability insights
Ensure compliance with data privacy, security requirements, and responsible AI practices

Innovation & Research

Stay current with the latest AWS AI service releases, AgentCore capabilities, and agentic AI protocols
Experiment with emerging AI techniques, multi-agent patterns, and protocol enhancements
Prototype new use cases and proofs of concept using the AgentCore platform
