ai71

Data Engineer GenAI

Department
Engineering
Job Type / Location
London
Experience Required
3+ years

Role Overview [UAE Based]

We are seeking a skilled Data Engineer to join our Generative AI team. You will play a critical role in designing, building, and maintaining robust data pipelines and infrastructure to support the development, training, and deployment of cutting-edge Generative AI models. This role requires a blend of technical expertise, problem-solving skills, and a passion for working with data at scale.

Key Responsibilities

  • Data Infrastructure Development: Design, implement, and maintain scalable data pipelines and ETL processes to support Generative AI applications.
  • Data Preparation: Collaborate with AI researchers and data scientists to preprocess, clean, and transform large datasets for training and evaluation of AI models.
  • Model Integration: Support the deployment and monitoring of Generative AI models, ensuring efficient data flow and integration with production systems.
  • Database Management: Optimize and manage data storage solutions, ensuring high availability, security, and performance.
  • Automation: Develop tools and scripts to automate data workflows, monitoring, and reporting processes.
  • Collaboration: Work closely with cross-functional teams, including AI researchers, software engineers, and product managers, to meet project goals.
  • Data Governance: Ensure compliance with data privacy and security regulations, and implement best practices for data quality and lineage.

Qualifications

Must-Have:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
  • Proven experience in building and maintaining large-scale data pipelines and infrastructure.
  • Proficiency in programming languages such as Python, Scala, or Java.
  • Hands-on experience with big data technologies (e.g., Hadoop, Spark, Kafka).
  • Strong knowledge of SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).
  • Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and data-related services (e.g., S3, Redshift, BigQuery).
  • Understanding of Generative AI concepts and familiarity with frameworks and libraries such as TensorFlow, PyTorch, or Hugging Face.

Nice-to-Have:

  • Experience working with unstructured data (e.g., text, images, audio) for AI applications.
  • Knowledge of MLOps practices and tools (e.g., MLflow, Kubeflow, Docker).
  • Familiarity with version control systems (e.g., Git) and CI/CD pipelines.
  • Experience with real-time data processing and streaming technologies.
  • Contributions to open-source projects in AI or data engineering.
