Key Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines that ingest, transform, and load data from multiple sources
- Build and optimize highly available data workflows that support both batch and real-time processing
- Leverage AWS services (S3, Glue, Lambda, Redshift, EMR, Kinesis) to design secure, scalable, and cost-efficient cloud data architectures
- Optimize data storage, retrieval, and query performance through indexing, partitioning, and lifecycle management
- Ensure data accuracy, consistency, and integrity via validation, monitoring, and error-handling mechanisms
- Collaborate with data scientists, engineers, and stakeholders to integrate solutions with enterprise systems and APIs
Requirements
- Active TS/SCI clearance with polygraph required
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience)
- Strong experience with AWS data services and big data technologies
- Proficiency in SQL and Python for data processing and transformation
- Experience working with large-scale datasets and optimizing data pipeline performance