About the Role

We are seeking a Principal Machine Learning Engineer (Tech Lead Manager) to lead ABBYY’s Document AI Data team, one of the company’s most strategic and high-impact engineering groups.

This role combines hands-on technical leadership with people management. You will own both the architecture and roadmap for how ABBYY builds high-quality training data at scale and the growth and performance of the team delivering it.

You will operate at the center of ABBYY’s document AI strategy, defining how training data is created, validated, and scaled to power next-generation large language and vision-language models.

Key Responsibilities

Technical Leadership & Strategy

Own the end-to-end technical strategy for the Document AI data platform, spanning:
- AI-assisted annotation
- Synthetic data generation
- Document understanding pipelines
Define architectural principles that unify multiple data workflows into a scalable, cohesive platform
Establish and operationalize standards for high-quality training data in collaboration with Modeling teams
Drive the development of data quality evaluation frameworks, including metrics for coverage, fidelity, and performance
Identify and evaluate emerging AI technologies to maintain ABBYY’s competitive edge
Make hands-on technical contributions to critical architectural and pipeline decisions

Team & People Leadership

Lead, mentor, and grow a team of Senior Machine Learning Engineers
Own hiring strategy and execution, including role definition, interview processes, and offer decisions
Drive performance management, career development, and growth planning
Foster a culture of technical rigor, curiosity, and collaboration
Represent team priorities, roadmap, and resourcing needs to senior leadership
Build strong partnerships with peer leaders across Modeling, Platform, and Data Operations teams

Cross-Functional Alignment & Delivery

Partner with Platform teams on model hosting and inference requirements for large-scale data workflows
Collaborate with Modeling teams to translate model training needs into data strategies and priorities
Work with Data Operations to build feedback loops between automated annotation and human validation
Own delivery accountability, including roadmap planning, milestone tracking, and escalation management
Champion best practices for data privacy, compliance, and responsible AI across all data processes

Qualifications

Education & Experience

MS or PhD in Computer Science, Engineering, Mathematics, or a related field
10+ years of experience in Machine Learning / AI, with a focus on:
- Large Language Models (LLMs)
- Vision-Language Models (VLMs)
- Large-scale data systems
Proven track record as both a technical leader and people manager
Experience building and scaling AI-driven data pipelines in production
Demonstrated success hiring and developing senior engineering talent

Technical Expertise

Deep expertise in LLMs and VLMs, including prompting, fine-tuning, and evaluation for structured tasks
Strong understanding of training data quality principles (distribution, diversity, and validation)
Proven ability to architect large-scale data platforms processing millions of documents
Strong programming skills in Python with experience in PyTorch or similar frameworks
Experience with cloud platforms, MLOps tooling, and pipeline orchestration
Familiarity with document AI systems, layout analysis, and real-world document variability

Leadership & Communication

Proven ability to lead and inspire high-performing engineering teams
Strong track record of making long-term architectural decisions
Excellent cross-functional collaboration with Engineering, Product, and Operations
Ability to translate complex technical tradeoffs into clear strategic direction
Experience building teams in ambiguous, fast-scaling environments

Principal Machine Learning Engineer, Document AI Data (Tech Lead Manager)