Overview of the Role
As a Machine Learning Engineer at Kloud9 LLC, you will design and develop machine learning systems and refine and update existing ones. You will bring software development best practices to the data science team to accelerate its work, and you will push machine learning libraries to their limits, often adding new functionality. The role involves enabling production deployment of code, testability, and accuracy metric tracking. You will continually look for performance improvements and decide which ML technologies are used in production.
Key Responsibilities
- Design and develop machine learning algorithms and deep learning applications and systems for Kloud9.
- Solve complex problems with multilayered data sets, and optimize existing machine learning libraries and frameworks.
- Collaborate with data scientists, administrators, data analysts, data engineers, and data architects on production systems and applications.
- Assess, analyse, and organize large amounts of data, applying strong statistics and programming skills.
- Identify differences in data distribution that could potentially affect model performance in real-world applications.
- Ensure algorithms generate accurate user recommendations.
- Stay up to date with developments in the machine learning industry.
- Implement ML platform capabilities to streamline all phases of data-centric innovation, including data access and exploration, model development, productionisation, testing, and monitoring of machine learning pipelines.
- Design and own end-to-end ML platforms that enable ML Applied Scientists with model and feature pipeline development, deployment, monitoring, and maintenance.
- Build and maintain machine learning and big data production pipelines to support advanced analytics, data science, and AI/ML solutions.
- Identify valuable internal and external data.
- Collaborate closely with data and ML scientists to define data for the design, development, and deployment of new solutions that support strategic business priorities.
- Develop large-scale data structures and pipelines to collect, organize, and standardize data, generating insights and intelligence that support business needs.
Attributes & Competencies
- Proficiency with Python and machine learning libraries such as scikit-learn, matplotlib, seaborn and pandas.
- Knowledge of Big Data frameworks such as Hadoop, Spark, Pig, Hive, and Flume.
- Experience working with ML frameworks and libraries such as TensorFlow, Keras, and OpenCV.
- Expertise in visualizing and manipulating big datasets.
- Familiarity with Linux.
- Ability to select hardware to run an ML model with the required latency.
- Robust data modelling and data architecture skills.
- Advanced Math and Statistics skills (linear algebra, calculus, Bayesian statistics, mean, median, variance, etc.).
- Ability to explore and visualize data to understand it, and to identify differences in data distribution that could affect performance when the model is deployed in the real world.
- Ability to verify data quality and, where needed, ensure it through data cleaning.
- Ability to supervise the data acquisition process when more data is needed.
- Ability to find publicly available datasets that could be used for training.
Required Skills
- Python, Flask, PySpark (Spark Core, Spark SQL, Spark ML).
- Scikit-learn, matplotlib, seaborn, pandas, OpenCV, Keras, TensorFlow, Scala.
- Jupyter Notebook, Machine Learning, Deep Learning.
- GCP components: Cloud Functions, Cloud Storage, Dataproc, Google Kubernetes Engine (GKE), Vertex AI, Compute Engine.
- Kubeflow and Kubernetes.
Experience & Education
- Experience: 3+ years.
- Education: Bachelor's or Master's in Computer Science, Data Science, Machine Learning, or a related field.