
Dataiku

Data Engineer

Department: Engineering
Job Type / Location: Berlin
Experience Required: 2+ years

About the Role

Dataiku is looking for a Data Engineer to join our Enterprise Data and Analytics (EDA) team. As a member of the EDA team, you will play a central role in delivering data to fuel analytics and data-driven insights to various stakeholders and teams within the company. You will also be a key technical member contributing to the data platform that fuels centralized analytics, embedded analytics teams, Generative AI engineering, and self-service users across the organization.

This role is about 50% Data Operations, Support & Troubleshooting, and 50% new development. The data engineering day-to-day will primarily be within the data platform built using Snowflake, Dataiku, and GitHub. Primary development will focus on Python & SQL, DataOps processes built within GitHub Actions & Dataiku, and data platform processes built within Snowflake & Dataiku.

Non-technical skills and learning are also critical, as you will collaborate with engineers from various teams and help deliver solutions across a wide variety of technical domains. The ideal candidate is naturally curious, has excellent verbal and written communication skills, a sharp analytical mind, a positive attitude towards work, and thrives when collaborating towards a shared goal.

This is an internal, non-client-facing role.

What you’ll do:

Dataiku is unique in that every Dataiker is encouraged to use our own product within our Enterprise Data Platform. That means this is a unique opportunity to deliver a scalable platform with governed data to fuel an entire company of current or potential Data Analysts & Data Consumers! Your responsibilities within the team include but are not limited to:

  • Develop engineering expertise within the Dataiku Platform to help maintain and develop system integrations, platform automations, and platform configurations.
  • Develop engineering expertise within Snowflake for data engineering and security/governance features.
  • Build & maintain Python & SQL data replication & data pipelines on large & often complex data sets.
  • Build & maintain data quality metrics & observability to help drive data quality standards.
  • Learn about existing systems and processes across Data Platforms, Data Engineering and Data Governance.
  • Troubleshoot data pipelines, platform automations, and data access systems.
  • Help field and troubleshoot various community questions and challenges.
  • Own, maintain and enhance data operation processes, monitoring & data quality systems.
  • Design data models for both short-term and long-term use cases to support data warehouse scalability.
  • Build & maintain administration systems and applications for monitoring, alerting, data observability, access management, platform metrics, and end user transparency.
  • Identify opportunities for improvements & optimization for greater scalability & delivery velocity.
  • Collaborate closely with Analytics Engineers to provide data & data models for analytical deliverables.
  • Perform root cause analysis on often complex errors to help ensure data pipeline availability.
  • Help test new features in Dataiku and partner tools, both to provide feedback internally and to determine their value for internal analytics & data platform integration.
  • Work closely with key stakeholders across the organization including Infra, embedded analytics teams, Product and Engineering to help foster both technical implementations & requirements gathering.
  • Proactively drive innovation internally by bringing ideas for platform and process improvements.
  • Help contribute to the ongoing documentation of internal systems and processes.

Requirements:

  • 2+ years of relevant experience in Data Engineering / Data Platform Engineering.
  • Strong technical skills in SQL & Python are a must. Experience in Dataiku DSS is a big plus.
  • Prior experience with Snowflake is a plus.
  • Prior experience with DevOps technologies such as GitHub Actions, Azure DevOps, or Jenkins.
  • Experience in building data models.
  • Prior experience building and maintaining replication & data pipelines in a cloud data warehouse or data lake environment.
  • Excellent analytical and creative problem-solving skills - exhibit confidence to ask questions to bring clarity, share ideas, and challenge the norm.
  • Passion for continuous learning and for teaching others new technologies & implementation strategies.
  • Experience working with complex stakeholders; dissecting vague asks and helping to define tangible requirements.
  • Ability to manage multiple projects and time constraints simultaneously in a high-trust remote environment.
  • Ability to wear multiple hats depending on the project with the focus on accomplishing end goals while inspiring colleagues to do the same.
  • Excellent written and verbal communication skills (especially with senior-level stakeholders) with the ability to speak to the business value, data products, & technical capabilities of a platform. Ability to create clear and concise documentation with a high degree of precision.

