Machine Learning Infrastructure Engineer

Department
Engineering
Job Type / Location
Remote
Experience Required
5+ years

Key Responsibilities

  • Build and maintain scalable infrastructure for machine learning model training and deployment
  • Develop CI/CD pipelines for ML model versioning and testing
  • Optimize GPU/TPU resource allocation for training workloads
  • Collaborate with data scientists to streamline model deployment workflows
  • Monitor and troubleshoot infrastructure performance and reliability

Requirements

  • 3+ years of experience in ML infrastructure or related roles
  • Proficiency in containerization (Docker) and orchestration (Kubernetes)
  • Experience with MLOps tools (MLflow, Kubeflow, or similar)
  • Strong scripting skills (Python, Bash) and cloud platform knowledge (AWS/GCP)
  • Familiarity with distributed computing and GPU acceleration
