
Deepinfra Inc.

Inference Engineer

Department
Engineering
Job Type / Location
Remote
Experience Required
4+ years
Key Responsibilities

  • Develop high-performance inference engines for AI models
  • Optimize model architectures for low-latency and high-throughput inference
  • Implement GPU-accelerated computing solutions
  • Collaborate with ML teams to integrate optimized models into production systems
  • Profile and benchmark inference performance
  • Ensure compatibility across diverse hardware platforms

Requirements

  • 4+ years of experience in systems programming or AI inference
  • Strong proficiency in C++ and Python
  • Experience with GPU computing (CUDA/OpenCL) and model optimization
  • Knowledge of neural network architectures and performance tuning
  • Familiarity with Linux and performance profiling tools

