Deepinfra Inc.

Inference Engineer

Department: Engineering
Job Type / Location: Remote
Experience Required: 4+ years
Posted On: May 13, 2026

Key Responsibilities

Develop high-performance inference engines for AI models
Optimize model architectures for low-latency and high-throughput inference
Implement GPU-accelerated computing solutions
Collaborate with ML teams to integrate optimized models into production systems
Profile and benchmark inference performance
Ensure compatibility across diverse hardware platforms

Requirements

2+ years of experience in systems programming or AI inference
Strong proficiency in C++ and Python
Experience with GPU computing (CUDA/OpenCL) and model optimization
Knowledge of neural network architectures and performance tuning
Familiarity with Linux and performance profiling tools
