BOLD

Senior Engineer - DevOps, Machine Learning Operations

Department
Engineering
Job Type / Location
Noida
Experience Required
4+ years

About this Team

The Infrastructure team provides services that include automation, observability, cloud/server/network architecture, CI/CD, infrastructure as code, database administration, incident management, vendor management, and security and compliance. These services improve efficiency, reduce errors, and keep application releases fast and reliable while maintaining security and compliance. TechOps helps teams monitor applications and infrastructure, build resilient infrastructure, identify and resolve IT service issues, manage vendors, and ensure cloud security and compliance. The team also invests in continuous learning and adopts new technologies to deliver better value to the organization.

What You’ll Do

  • Design and maintain end-to-end MLOps pipelines for data ingestion, feature engineering, model training, deployment, and monitoring.
  • Productionize ML/GenAI services, collaborating with data scientists on model serving and workflow optimization.
  • Implement monitoring, alerting, and observability to reduce MTTR and ensure production reliability.
  • Manage data/feature stores and search infrastructure for scalable ML inference.
  • Automate CI/CD for ML models and infrastructure with governance and security compliance.
  • Handle security patching, cost optimization, and 24x7 on-call rotations for critical services.
  • Coordinate cross-functionally with development, QA, ops, and data teams to improve build and deployment processes.

What You’ll Need

  • 4.5+ years (Senior Engineer) or 7+ years (Module Lead) of experience in AWS MLOps, with hands-on use of SageMaker (Pipelines, Model Registry, Studio), EMR, and OpenSearch (k-NN/vector search).
  • Python/Bash scripting for CI/CD, provisioning, and monitoring of FastAPI/Spring Boot web services and Linux servers (Solr/OpenSearch).
  • Experience with AWS services (S3, DynamoDB, Lambda, Step Functions), including cost control and reporting, and with databases (MySQL, MongoDB).
  • Strong Linux and networking fundamentals.
  • Hands-on expertise in ML tools (MLflow, Airflow, Metaflow, ONNX).
