About the Role
At JetBrains, we build developer tools used by millions of engineers. The AI for Code team works on the next generation of coding agents and agentic workflows: systems that can understand codebases, plan and execute multi-step tasks, collaborate with developers, and ship reliable results inside real development environments.
We are looking for a Staff/Senior AI Engineer to join the team and support these efforts. This role is for someone who can take our internal coding models, such as Mellum2, as well as open-weight models, and turn them into production-ready coding agents for our users. You’ll work on model training and fine-tuning, context engineering, tool use, evaluation, feedback loops, and product integration. This is not research in isolation – you’ll build systems that are used by tens of thousands of developers.
What you’ll do:
- Build production-ready coding agents and agentic workflows for real developer tasks inside JetBrains products.
- Turn promising model capabilities into dependable product behavior through prompt design, context construction, fine-tuning, instruction-tuning, or other post-training techniques where appropriate.
- Design and improve the agent loop itself, including tool use, execution strategy, safeguards, and task completion quality.
- Create evaluation suites and quality infrastructure for agent behavior, including online and offline evaluations, regression checks, failure analysis, and release criteria.
- Build feedback loops from real usage, using logs, user signals, and edge cases to improve data, evaluations, and agent behavior.
- Work with both hosted frontier APIs and self-hosted or open-weight models, making pragmatic decisions about where each model belongs based on capability, latency, reliability, privacy, and cost.
- Collaborate closely with product managers, software engineers, ML engineers, and researchers to ship features end to end.
- Help define the technical direction for future work, especially in ambiguous areas where we need strong judgment rather than a prewritten playbook.
What we’re looking for:
- Strong software engineering fundamentals and a track record of shipping complex systems to production.
- Hands-on experience building LLM-powered products, coding agents, or other AI systems.
- Experience improving model behavior through systematic iteration, whether via prompting, context engineering, fine-tuning, preference optimization, or broader post-training methods.
- Practical experience with evaluation and benchmarking for LLM systems, including defining task-grounded success metrics and catching regressions.
- Experience working from noisy real-world signals rather than only from clean benchmark datasets.
- Good judgment about trade-offs between model quality, latency, reliability, privacy, and cost.
- Comfort working with ambiguity and taking ownership of a direction over multiple iterations.
- Strong communication skills and the ability to align engineering and product decisions.
What success looks like in the first year:
- You ship one or more agent capabilities that users can rely on for meaningful work, not just demos.
- You establish better evaluation coverage and clearer release criteria for agent behavior.
- You help the team build a repeatable loop from idea to shipped capability: prototype, evaluate, learn from usage, improve, and scale.
Why join us?
You’ll help define what practical, trustworthy AI for software development looks like in real products. You’ll work on challenging problems at the boundary of model capability and product reality, with the freedom to stay hands-on and the scope to influence how the next generation of JetBrains AI systems is built.