About the Role

As a Senior Machine Learning Engineer (Model Training & Evaluation) at ABBYY, you will own the end-to-end training and evaluation cycle for our document AI models. Working closely with the Principal Machine Learning Engineer, you will transform research direction into reliable, reproducible, and scalable experimentation pipelines, ensuring model improvements are measurable and production-ready. This role is ideal for engineers who thrive at the intersection of applied ML research and production-grade engineering, combining deep technical expertise with strong experimental rigor.

Key Responsibilities

Training Pipeline & Experimentation

Own the end-to-end training pipeline, including data ingestion, orchestration, checkpointing, and result logging.
Execute large-scale experiments with a strong emphasis on reproducibility and traceability.
Investigate training instabilities, loss anomalies, and performance gaps, providing structured analysis and hypotheses.
Implement and validate new optimization techniques and training objectives in collaboration with senior ML leadership.
Continuously improve pipeline efficiency to reduce iteration time while maintaining experiment quality.
Manage compute resources across parallel experiments, balancing throughput and cost efficiency.

Evaluation & Benchmarking

Design and maintain comprehensive evaluation and benchmarking frameworks.
Define clear success metrics across accuracy, latency, memory usage, and domain coverage.
Build automated evaluation pipelines to detect regressions across model checkpoints.
Analyze results to identify patterns in model performance and quality trade-offs.
Partner with Data teams to ensure improvements in training data translate to measurable gains.
Maintain and evolve benchmarking methodologies aligned with industry best practices.

Infrastructure & Collaboration

Partner with Platform Engineering on distributed training infrastructure and experiment tracking systems.
Develop internal tooling to support model analysis and research workflows.
Contribute to team standards around reproducibility, experiment tracking, and documentation.
Collaborate with Platform teams to support model deployment, optimization, and serving.

Qualifications

Education & Experience

MS or PhD in Computer Science, Engineering, Mathematics, or a related field.
5+ years of experience in Machine Learning, Applied AI, or related areas.
Proven experience training and evaluating large-scale language and/or vision-language models.
Strong background in building evaluation frameworks and benchmarking systems.
Experience with model optimization or efficient training techniques.

Technical Expertise

Deep understanding of model optimization and compression (e.g., quantization, pruning).
Strong proficiency in Python and PyTorch, including distributed training frameworks (e.g., DeepSpeed, FSDP).
Experience managing large-scale training runs (job scheduling, checkpointing, fault tolerance).
Expertise in evaluation methodology and benchmark design.
Experience with experiment tracking and reproducibility practices.
Familiarity with vision-language model architectures and document AI challenges.

Leadership & Communication

Proven ability to independently own complex technical workstreams.
Strong collaboration skills in cross-functional research and engineering environments.
Rigorous problem-solving approach with a focus on root-cause analysis.
Clear and concise communication of technical findings and experimental results.
