Key Responsibilities
- Design, develop, and maintain scalable ETL processes using Talend, Informatica, and scripting languages such as Python and Bash
- Build and manage robust data pipelines with Hadoop, Spark, Apache Hive, Azure Data Lake, and AWS services to process large volumes of structured and unstructured data (see the PySpark sketch after this list)
- Develop and optimize complex SQL queries for data extraction, transformation, and loading across multiple relational databases including Microsoft SQL Server and Oracle
- Architect and implement efficient data models for data warehouses to support analytics and reporting initiatives (see the star-schema sketch after this list)
- Collaborate with data scientists to prepare clean datasets and integrate machine learning workflows
- Monitor system performance, troubleshoot issues, and implement improvements to ensure high availability of data services
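To make the ETL and pipeline responsibilities above concrete, here is a minimal PySpark sketch of the extract-transform-load pattern this role works in: pull a table from SQL Server over JDBC, aggregate it, and land Parquet in a data lake. Every connection string, table name, column, and path is a hypothetical placeholder, not a reference to any actual system.

```python
# Minimal PySpark ETL sketch. All connection details, table names,
# and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read a SQL Server table over JDBC (driver jar must be on the classpath).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example-host:1433;databaseName=sales")  # placeholder
    .option("dbtable", "dbo.orders")  # placeholder table
    .option("user", "etl_user")       # placeholder; use a secrets manager in practice
    .option("password", "...")
    .load()
)

# Transform: keep completed orders and roll amounts up to a daily total.
daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETE")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Load: write date-partitioned Parquet to the lake
# (s3a:// for AWS S3, abfss:// for Azure Data Lake).
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/warehouse/daily_revenue"  # placeholder path
)

spark.stop()
```

The same extract/transform/load structure applies whether the job is orchestrated by Talend, Informatica, or a scheduled script; only the source and sink options change.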
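The data-modeling responsibility is easiest to picture as a small star schema. The sketch below defines a dimension and a fact table through Spark SQL and runs a typical reporting join; the table names, columns, and query are illustrative assumptions only.

```python
# Self-contained star-schema sketch using Spark SQL.
# Schema and query are illustrative, not an actual warehouse design.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("warehouse-model-demo").getOrCreate()

# Dimension table: one row per customer, keyed by a surrogate key.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key BIGINT,
        customer_id  STRING,
        region       STRING
    ) USING parquet
""")

# Fact table: one row per order, partitioned by date for partition pruning.
spark.sql("""
    CREATE TABLE IF NOT EXISTS fact_orders (
        customer_key BIGINT,
        amount       DECIMAL(12,2),
        order_date   DATE
    ) USING parquet
    PARTITIONED BY (order_date)
""")

# A typical reporting query: revenue by region and day.
spark.sql("""
    SELECT d.region, f.order_date, SUM(f.amount) AS revenue
    FROM fact_orders f
    JOIN dim_customer d ON f.customer_key = d.customer_key
    GROUP BY d.region, f.order_date
""").show()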
Requirements
- Extensive experience with cloud platforms such as AWS (including S3) and Azure (including Azure Data Lake)
- Strong programming skills in Java, Python, and VBA, plus shell scripting (Bash) for automation tasks
- Proficiency with big data technologies including Hadoop ecosystem, Spark (PySpark), and Apache Hive
- Expertise in ETL development using Talend, Informatica, or similar tools; strong SQL skills for complex query development
- Experience with RESTful API integration and modern data architecture concepts, including data warehouse design and database modeling (a minimal Python sketch follows this list)
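For the REST integration requirement, a common pattern is pulling paginated records from an HTTP API and landing the raw payload for downstream ETL. The endpoint, pagination parameters, and token below are hypothetical assumptions used only to illustrate the pattern.

```python
# Sketch of paginated REST extraction into a raw staging file.
# Endpoint, parameters, and token are hypothetical placeholders.
import json
import requests

BASE_URL = "https://api.example.com/v1/records"  # placeholder endpoint
TOKEN = "..."  # placeholder; load from a secrets store in practice

def fetch_all(page_size: int = 100) -> list[dict]:
    """Follow page-numbered results until the API returns an empty page."""
    records, page = [], 1
    while True:
        resp = requests.get(
            BASE_URL,
            headers={"Authorization": f"Bearer {TOKEN}"},
            params={"page": page, "per_page": page_size},
            timeout=30,
        )
        resp.raise_for_status()  # surface HTTP errors instead of looping silently
        batch = resp.json()
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records

if __name__ == "__main__":
    rows = fetch_all()
    # Land raw JSON for a downstream ETL job to pick up.
    with open("records_raw.json", "w") as f:
        json.dump(rows, f)
```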