Opportunity
As a Senior AI QA Engineer for our Precision Patient Care Pipeline, you will go beyond traditional functional testing. You will build the framework that ensures our clinical insights are accurate, safe, and reliable. This role blends advanced software testing with data engineering to validate complex, non-deterministic medical outputs using a combination of automated grading methods.
Key Responsibilities
- Architect Multi-Layered Validation Frameworks: Design and implement structured testing strategies that combine deterministic checks, semantic similarity metrics, and model-based evaluations.
- Automated Model Grading: Develop systems that evaluate clinical pipeline outputs for faithfulness and safety and that detect hallucinations, using automated scoring techniques (e.g., BERTScore, ROUGE, or custom heuristics).
- Vibe-Driven Development: Leverage agentic AI tools to rapidly prototype complex test harnesses, "red-team" clinical logic, and build internal validation utilities at high velocity.
- Data Pipeline Integrity: Execute integration and regression tests for data-heavy backend processes, ensuring medical data remains consistent from ingestion to insight generation.
- Collaborative Strategy: Work closely with Data Scientists and Product Managers to define "Ground Truth" datasets and clinical evaluation rubrics.
- Root Cause Analysis: Deep-dive into complex system failures to identify whether issues stem from code logic, data drift, or model behavior.
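
To give candidates a feel for the multi-layered validation described above, here is a minimal sketch in Python. It is an illustration under stated assumptions, not our production framework: every name in it (`LayerResult`, `deterministic_check`, `similarity_check`, `model_check`, `evaluate`) is hypothetical, the thresholds are arbitrary, and `difflib.SequenceMatcher` is only a lexical stand-in for a real semantic metric such as BERTScore.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Optional


@dataclass
class LayerResult:
    layer: str
    passed: bool
    score: float


def deterministic_check(output: str, required_terms: List[str]) -> LayerResult:
    """Layer 1: every required term must appear (case-insensitive)."""
    text = output.lower()
    found = sum(1 for term in required_terms if term.lower() in text)
    score = found / len(required_terms) if required_terms else 1.0
    return LayerResult("deterministic", found == len(required_terms), score)


def similarity_check(output: str, reference: str, threshold: float = 0.6) -> LayerResult:
    """Layer 2: lexical similarity, a stand-in for a semantic metric."""
    score = SequenceMatcher(None, output.lower(), reference.lower()).ratio()
    return LayerResult("similarity", score >= threshold, score)


def model_check(output: str, grader: Callable[[str], float],
                threshold: float = 0.5) -> LayerResult:
    """Layer 3: a pluggable grader (assumed to return a score in [0, 1])."""
    score = grader(output)
    return LayerResult("model", score >= threshold, score)


def evaluate(output: str, reference: str, required_terms: List[str],
             grader: Optional[Callable[[str], float]] = None) -> List[LayerResult]:
    """Run each layer in order and collect the results."""
    results = [
        deterministic_check(output, required_terms),
        similarity_check(output, reference),
    ]
    if grader is not None:
        results.append(model_check(output, grader))
    return results


if __name__ == "__main__":
    # Made-up example text; the lambda stands in for a model-based grader.
    summary = "Patient has type 2 diabetes; start metformin."
    for result in evaluate(summary, summary, ["diabetes", "metformin"],
                           grader=lambda text: 0.9):
        print(result)
```

Keeping each layer behind the same result type makes it straightforward to swap the lexical stand-in for an embedding-based metric, or to plug in a different grader, without changing the calling code.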
Requirements
- 6+ years of technical experience in Quality Assurance, with a strong focus on system architecture and backend data validation.
- Advanced Python Proficiency: Expert-level skills in Python for building custom test scripts and working within AI/ML ecosystems.
- AI/ML Validation Experience: Proven experience testing model outputs using diverse metrics (e.g., semantic similarity, NLP metrics, and custom scoring algorithms). Familiarity with tools like LangSmith, Arize Phoenix, or MLflow.
- Vibe Coding Mastery: Proficiency with AI-native development tools such as Cursor, Windsurf, or Claude Code.
- Modern Automation Stack: Proficiency with test frameworks like pytest, plus experience with traditional tools (Cypress, Playwright, or Selenium) for end-to-end flows.
- API & Data Testing: Solid understanding of API testing (Postman/REST Assured) and database validation (SQL/NoSQL).
- Strategic Thinking: Experience serving as a Subject Matter Expert (SME) on test process development for large-scale systems.
Preferred (Good to Have)
- Clinical Domain Knowledge: Familiarity with healthcare data standards such as FHIR, HL7, or SNOMED CT.
- Healthcare Compliance: Understanding of HIPAA regulations and the nuances of handling ePHI/PHI.
- Statistical Foundation: Understanding of sensitivity, specificity, and F1 score in a clinical context.
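
For reference, the statistics named above can be computed from binary labels as in the short sketch below. The function name `clinical_metrics` and the sample labels are hypothetical and purely illustrative; the sketch assumes 1 marks a condition that is present and 0 one that is absent.

```python
from typing import Dict, Sequence


def clinical_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard each ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


if __name__ == "__main__":
    # Made-up labels: 4 positive cases and 4 negative cases.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    print(clinical_metrics(truth, preds))
```

In a clinical setting the two error types rarely cost the same: low sensitivity means missed conditions, while low specificity means false alarms, so the metrics are usually read together instead of being collapsed into a single number.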