Key Responsibilities
- Design and implement scalable data pipelines for large datasets
- Develop ETL processes that transform raw data into clean, analysis-ready datasets
- Optimize database schemas and query performance for analytical workloads
- Implement data quality checks and monitoring systems
- Collaborate with data scientists to support machine learning initiatives
- Maintain data infrastructure and ensure system reliability
Requirements
- 5+ years of Python development experience
- Strong SQL skills with experience in database optimization
- Knowledge of data pipeline orchestration tools such as Airflow or Luigi
- Experience with cloud data services and big data technologies
- Understanding of data modeling and warehouse design principles