ai71

LLM Engineer

Department
Engineering
Job Type / Location
Abu Dhabi
Experience Required
5+ years

About Us

AI71 is an applied research team dedicated to creating helpful and responsible AI agents for knowledge workers. Working closely with our industry partners, our cross-functional teams of AI experts build products grounded in the cutting-edge research of our colleagues from the Technology Innovation Institute (TII).

About the Role

As an LLM Engineer, you will be responsible for the end-to-end development, optimization, and deployment of large language models. You'll work on challenging problems at the intersection of deep learning, natural language processing, and distributed computing.

What You'll Do

  • Analyze large and complex datasets to extract meaningful insights and inform data-driven decision-making.
  • Develop, train, and deploy predictive models to enhance the capabilities of our AI solutions.
  • Collaborate with cross-functional teams to understand business objectives and translate them into actionable data science tasks.
  • Design and implement advanced LLM architectures, including transformer-based models and their variants.
  • Develop novel attention mechanisms and positional encoding schemes.
  • Experiment with model scaling techniques and efficient architectures (e.g., MoE, sparse transformers).
  • Continuously evaluate and improve existing models based on real-world performance and evolving business needs.
  • Implement and optimize distributed training pipelines for large-scale models.
  • Develop strategies for efficient fine-tuning, including parameter-efficient techniques (e.g., LoRA, prefix tuning).
  • Apply advanced optimization techniques such as mixed-precision training and gradient accumulation.
  • Optimize models for inference, including quantization and pruning techniques.
  • Implement efficient serving solutions for real-time inference.
  • Develop strategies for model compression and knowledge distillation.
  • Develop task-specific algorithms for applications such as text classification, named entity recognition, and question answering.
  • Work with MLOps teams to design and maintain training and serving infrastructure.

What You'll Bring

  • 5+ years of experience in deep learning and NLP, with a focus on large language models.
  • Master's or Ph.D. in Data Science, Statistics, Computer Science, or a related field.
  • Expert-level proficiency in Python and at least one deep learning framework (PyTorch, TensorFlow, or JAX).
  • Strong understanding of transformer architectures, attention mechanisms, and recent advancements in LLMs.
  • Experience with distributed training frameworks (e.g., DeepSpeed, Megatron-LM).
  • Proficiency in optimizing training throughput and memory use with techniques such as mixed-precision training, gradient checkpointing, and model parallelism.
  • Understanding of core NLP techniques such as tokenization, parsing, and semantic analysis.
  • Experience with sequence-to-sequence models and self-supervised learning techniques.
  • Experience with both SQL and NoSQL databases for managing training data and model artifacts.
