About The Role

Rapid7 is seeking a Principal AI Engineer to join our team as we expand and evolve our AI and MLOps efforts. You should have a strong foundation in applied AI R&D, software engineering, and MLOps and DevOps systems and tooling. You’ll also have a demonstrable track record of taking models from AI R&D to production with repeatable deployment, monitoring, and observability patterns. In this intersectional role, you will combine your expertise in AI/ML deployment, cloud systems, and software engineering to enhance our product offerings and streamline our platform's functionality.

In This Role, You Will

Interdisciplinary Collaboration

Collaborate closely with engineers and researchers to refine key product and platform components, aligning with both user needs and internal objectives.
Actively contribute to cross-functional teams, focusing on the successful building and deployment of AI applications.

Data Pipeline Construction and Lifecycle Management

Develop and maintain data pipelines, manage the data lifecycle, and ensure data quality and consistency throughout.

Feature Engineering and Resource Management

Oversee feature engineering processes and optimize resources for both offline and online inference requests.

Model Development, Validation, and Maintenance

Build, validate, and continuously improve machine learning models, manage concept drift, and ensure the reliability of deployed systems.

System Design and Project Management

Architect and manage the end-to-end design of ML production systems, including project scoping, data requirements, modeling strategies, and deployment.

Knowledge and Expertise Sharing

Thoroughly document research findings, methodologies, and implementation details.
Share expertise and knowledge consistently with internal and external stakeholders, nurturing a collaborative environment.

ML Deployment

Implement, monitor, and manage ML services and pipelines within an AWS environment, employing tools such as SageMaker and Terraform.
Ensure robust implementation of ML guardrails, using frameworks such as NVIDIA NeMo, and manage all aspects of service monitoring.
Develop and deploy accessible endpoints, including web applications and REST APIs, while maintaining data privacy and adhering to security best practices and regulations.

Software Engineering

Lead the development of core API components to enable interactions with LLMs.
Craft and optimize conversational interfaces, capitalizing on the capabilities of LLMs.
Optimize APIs and interfaces with a product-focused approach, prioritizing performance, robustness, and user accessibility.

Continuous Improvement

Embrace agile development practices, valuing constant iteration, improvement, and effective problem-solving in complex and ambiguous scenarios.

The Skills You’ll Bring Include

Expertise in both ML deployment (especially in AWS) and software engineering.
Experience as a software engineer, notably in building APIs and/or interfaces, paired with strong coding skills in Python and TypeScript.
Adeptness in containerization and DevOps.
Exemplary problem-solving capabilities, particularly in decomposing complex problems into manageable parts and devising innovative solutions.
Proficiency with CI/CD tooling, Docker, and Kubernetes, plus prior experience developing APIs with Flask or FastAPI.
Experience deploying LLMs, managing advanced compute resources such as GPUs, and collecting data from LLM-based systems for metrics and fine-tuning.
Robust analytical, problem-solving, and communication skills, with the capacity to convey intricate ideas effectively.
High standards of engineering hygiene, embracing best practices and an agile development mindset.
A positive, can-do, solution-oriented mindset that welcomes a challenge.

Principal AI Engineer - MLOps