
Anthropic

Research Engineer, Alignment Science

Department: Engineering
Job Type / Location: London
Experience Required: 3+ years

About the Role

As a Research Engineer on Alignment Science, you'll contribute to exploratory experimental research on AI safety, with a focus on risks from powerful future systems (like those we would designate as ASL-3 or ASL-4 under our Responsible Scaling Policy). You'll often collaborate with other teams, including Interpretability, Fine-Tuning, and the Frontier Red Team. You will build and run elegant and thorough machine learning experiments to help us understand and steer the behavior of powerful AI systems, with a strong interest in making AI helpful, honest, and harmless, especially in the context of human-level capabilities. This role is ideal for people who see themselves as both scientists and engineers.

Representative Projects

  • Testing the robustness of safety techniques by training language models to subvert them and evaluating intervention effectiveness.
  • Running multi-agent reinforcement learning experiments to test techniques like AI Debate.
  • Building tooling to efficiently evaluate the effectiveness of novel LLM-generated jailbreaks.
  • Writing scripts and prompts to efficiently produce evaluation questions for testing models’ reasoning abilities in safety-relevant contexts.
  • Contributing ideas, figures, and writing to research papers, blog posts, and talks.
  • Running experiments that feed into key AI safety efforts at Anthropic, such as the design and implementation of the Responsible Scaling Policy.

Requirements

You may be a good fit if you:

  • Have significant software, ML, or research engineering experience.
  • Have some experience contributing to empirical AI research projects.
  • Have some familiarity with technical AI safety research.
  • Prefer fast-moving collaborative projects to extensive solo efforts.
  • Pick up slack, even if it goes outside your job description.
  • Care about the impacts of AI.

Strong candidates may also:

  • Have experience authoring research papers in machine learning, NLP, or AI safety.
  • Have experience with LLMs.
  • Have experience with reinforcement learning.
  • Have experience with Kubernetes clusters and complex shared codebases.

Candidates need not have:

  • 100% of the skills needed to perform the job.
  • Formal certifications or education credentials.
