Key Responsibilities
- Design, build, and maintain scalable data pipelines to support real-time and batch processing
- Optimize data storage and retrieval systems for performance and cost efficiency
- Collaborate with cross-functional teams to define data requirements and deliver actionable insights
- Implement robust data governance and quality control measures
- Develop and deploy machine learning models for predictive analytics
- Monitor system health and troubleshoot data-related issues
Requirements
- 5+ years of experience in data engineering or a related field
- Proficiency in Python and SQL with experience in ETL processes
- Strong understanding of data modeling and database design
- Experience with at least one major cloud platform (AWS, GCP, or Azure) and big data tools such as Spark and Kafka
- Familiarity with CI/CD pipelines and infrastructure as code