About The Role
Rapid7 is seeking a Principal AI Engineer to join our team as we expand and evolve our growing AI and MLOps efforts. You should have a strong foundation in applied AI R&D, software engineering, and MLOps and DevOps systems and tools. You will also have a demonstrable track record of taking models created in the AI R&D process to production with repeatable deployment, monitoring, and observability patterns. In this intersectional role, you will combine your expertise in AI/ML deployments, cloud systems, and software engineering to enhance our product offerings and streamline our platform's functionality.
In This Role, You Will
Interdisciplinary Collaboration
- Collaborate closely with engineers and researchers to refine key product and platform components, aligning with both user needs and internal objectives.
- Actively contribute to cross-functional teams, focusing on the successful building and deployment of AI applications.
Data Pipeline Construction and Lifecycle Management
- Develop and maintain data pipelines, manage the data lifecycle, and ensure data quality and consistency throughout.
Feature Engineering and Resource Management
- Oversee feature engineering processes and optimize resources for both offline and online inference requests.
Model Development, Validation, and Maintenance
- Build, validate, and continuously improve machine learning models, manage concept drift, and ensure the reliability of deployed systems.
System Design and Project Management
- Architect and manage the end-to-end design of ML production systems, including project scoping, data requirements, modeling strategies, and deployment.
Knowledge and Expertise Sharing
- Thoroughly document research findings, methodologies, and implementation details.
- Share expertise and knowledge consistently with internal and external stakeholders, nurturing a collaborative environment.
ML Deployment
- Implement, monitor, and manage ML services and pipelines within an AWS environment, employing tools such as SageMaker and Terraform.
- Ensure robust implementation of ML guardrails, leveraging frameworks like NVIDIA NeMo, and manage all aspects of service monitoring.
- Develop and deploy accessible endpoints, including web applications and REST APIs, while maintaining data privacy and adhering to security best practices and regulations.
Software Engineering
- Lead the development of core API components to enable interactions with LLMs.
- Craft and optimize conversational interfaces, capitalizing on the capabilities of LLMs.
- Conduct API and interface optimization with a product-focused approach, ensuring performance, robustness, and user accessibility are paramount.
Continuous Improvement
- Embrace agile development practices, valuing constant iteration, improvement, and effective problem-solving in complex and ambiguous scenarios.
The Skills You’ll Bring Include
- Expertise in both ML deployment (especially in AWS) and software engineering.
- Experience as a software engineer, notably in building APIs and/or interfaces, paired with strong coding skills in Python and TypeScript.
- Adeptness in containerization and DevOps.
- Exemplary problem-solving capabilities, particularly in decomposing complex problems into manageable parts and devising innovative solutions.
- Proficiency with CI/CD tooling, Docker, and Kubernetes, plus prior experience developing APIs with Flask or FastAPI.
- Experience deploying LLMs, managing advanced compute resources like GPUs, and navigating data collection for metrics and fine-tuning from LLM-based systems.
- Robust analytical, problem-solving, and communication skills, with the capacity to convey intricate ideas effectively.
- High standards of engineering hygiene, embracing best practices and an agile development mindset.
- A positive, can-do, solution-oriented mindset that welcomes challenges.