About the Role
We are seeking a Principal Machine Learning Engineer (Tech Lead Manager) to lead ABBYY’s Document AI Data team, one of the company’s most strategic and high-impact engineering groups.
This role combines hands-on technical leadership with people management. You will own the architecture and roadmap for how ABBYY builds high-quality training data at scale, as well as the growth and performance of the team delivering it.
You will operate at the center of ABBYY’s document AI strategy, defining how training data is created, validated, and scaled to power next-generation large language and vision-language models.
Key Responsibilities
Technical Leadership & Strategy
- Own the end-to-end technical strategy for the Document AI data platform, spanning:
  - AI-assisted annotation
  - Synthetic data generation
  - Document understanding pipelines
- Define architectural principles that unify multiple data workflows into a scalable, cohesive platform
- Establish and operationalize standards for high-quality training data in collaboration with Modeling teams
- Drive the development of data quality evaluation frameworks, including metrics for coverage, fidelity, and performance
- Identify and evaluate emerging AI technologies to maintain ABBYY’s competitive edge
- Make hands-on technical contributions to critical architectural and pipeline decisions
Team & People Leadership
- Lead, mentor, and grow a team of Senior Machine Learning Engineers
- Own hiring strategy and execution, including role definition, interview processes, and offer decisions
- Drive performance management, career development, and growth planning
- Foster a culture of technical rigor, curiosity, and collaboration
- Represent team priorities, roadmap, and resourcing needs to senior leadership
- Build strong partnerships with peer leaders across Modeling, Platform, and Data Operations teams
Cross-Functional Alignment & Delivery
- Partner with Platform teams on model hosting and inference requirements for large-scale data workflows
- Collaborate with Modeling teams to translate model training needs into data strategies and priorities
- Work with Data Operations to build feedback loops between automated annotation and human validation
- Own delivery accountability, including roadmap planning, milestone tracking, and escalation management
- Champion best practices for data privacy, compliance, and responsible AI across all data processes
Qualifications
Education & Experience
- MS or PhD in Computer Science, Engineering, Mathematics, or a related field
- 10+ years of experience in Machine Learning / AI, with a focus on:
  - Large Language Models (LLMs)
  - Vision-Language Models (VLMs)
  - Large-scale data systems
- Proven track record as both a technical leader and people manager
- Experience building and scaling AI-driven data pipelines in production
- Demonstrated success hiring and developing senior engineering talent
Technical Expertise
- Deep expertise in LLMs and VLMs, including prompting, fine-tuning, and evaluation for structured tasks
- Strong understanding of training data quality principles (distribution, diversity, and validation)
- Proven ability to architect large-scale data platforms processing millions of documents
- Strong programming skills in Python with experience in PyTorch or similar frameworks
- Experience with cloud platforms, MLOps tooling, and pipeline orchestration
- Familiarity with document AI systems, layout analysis, and real-world document variability
Leadership & Communication
- Proven ability to lead and inspire high-performing engineering teams
- Strong track record of making long-term architectural decisions
- Excellent cross-functional collaboration skills, working with Engineering, Product, and Operations
- Ability to translate complex technical tradeoffs into clear strategic direction
- Experience building teams in ambiguous, fast-scaling environments