Mindshift Analytics

Data Scientist

Department: Engineering
Job Type / Location: Onsite
Experience Required: 0+ years

About the Role

Mindshift Analytics is looking for a Data Scientist to join our team. This is an entry-level, full-time position in Engineering and Information Technology within the software development industry.

Responsibilities and Skills

The ideal candidate will possess a strong set of skills related to data management, analysis, and visualization. Key responsibilities and required skills include:

  • Data Ingestion: Implement and manage data ingestion from IoT sensors, ensuring efficient and reliable data flow.
  • ETL Pipeline Management: Design and maintain ETL pipelines using MQTT, REST API, and TCP socket programming, ensuring data integrity.
  • Data Munging: Perform data transformation and preparation using either R or Python as the primary language, with a working knowledge of the other.
  • UI/UX Development for Data Visualization: Develop web server-based UI/UX for data visualization, preferably using R Shiny, creating interactive visual tools.
  • Inter-working of Python and R: Integrate and leverage functionalities between Python and R environments for seamless data processing.
  • Time Series Data Analysis: Conduct time series data classification to identify patterns and anomalies. Develop algorithms for classification problems in temporal data.
  • API Endpoint Creation: Develop API endpoints for data access and integration, ensuring secure and efficient data exchange with partners.
  • Custom Report Development: Generate custom reports from diverse raw data sources, tailored to specific client needs. Experience with interactive report generation.
  • Data Cleaning: Implement data cleaning techniques, including spike removal and noise reduction, to ensure data quality.
  • Data Pipeline Management: Oversee the data pipeline lifecycle, from ingestion to visualization, focusing on efficiency and scalability. Proficiency in MariaDB / MySQL for database management and querying.
  • Docker / Kubernetes: Deploy and manage containerized applications at scale in cloud environments.

Preferred Experience

Experience with at least one of the following big data technologies is highly preferred:

  • Databricks Interface with R: Databricks' native integration with Posit Workbench, including access to Databricks clusters and Unity Catalog via the sparklyr and pysparklyr packages.
  • RHIPE: R with Hadoop for big data analytics using MapReduce.
  • Apache Spark with SparkR: Scalable data processing framework with an R interface for large-scale data analysis.
  • DBI (Database Interface) with R: Communication between R and various relational database management systems.
  • Google BigQuery
