Key Responsibilities
- Design, build, and maintain scalable data pipelines to support real-time analytics and reporting
- Optimize ETL processes for performance and reliability across distributed systems
- Collaborate with cross-functional teams to define data requirements and deliver actionable insights
- Implement data governance and quality frameworks to ensure accuracy and compliance
- Troubleshoot and resolve data infrastructure issues with minimal downtime
- Develop and maintain documentation for data models, pipelines, and APIs
Requirements
- 5+ years of experience in data engineering or a related field
- Proficiency in Python and SQL with experience in large-scale data processing
- Hands-on experience with ETL tools, data warehousing, and cloud platforms (AWS, GCP, or Azure)
- Strong understanding of data modeling, schema design, and API development
- Experience with containerization and orchestration tools (Docker, Kubernetes)