Key Responsibilities
- Design, test, and optimize system prompts and feature-specific prompts to shape AI behavior across products
- Develop and maintain comprehensive evaluation suites to ensure model quality and consistency
- Collaborate with product and research teams to align new features with quality and safety standards
- Support model releases by identifying regressions and ensuring smooth rollouts
- Build frameworks and tools that let teams develop and test prompts with confidence
- Mentor engineers on prompt engineering best practices and evaluation methodologies
Requirements
- 5+ years of software engineering experience in Python or a similar language
- Demonstrated experience with LLMs and prompt engineering through work, research, or projects
- Strong understanding of evaluation methodologies and metrics for AI systems
- Excellent written and verbal communication skills, including the ability to explain complex model behaviors clearly
- Ability to manage multiple projects and prioritize effectively in a fast-paced environment