Role Overview
We are hiring a RAG Engineer to build the retrieval-augmented generation systems that ground Joveo's LLM-powered features in real, verifiable data. You will design retrieval pipelines, optimize vector indexes, and engineer the prompt orchestration that makes AI responses accurate, fast, and trusted at scale.
Key Responsibilities
- Design and build RAG pipelines — chunking, embedding, indexing, and retrieval
- Optimize vector databases for recall, latency, and cost at production scale
- Implement hybrid retrieval combining semantic search with metadata filtering
- Engineer prompt orchestration and context assembly for LLM inputs
- Build evaluation frameworks for retrieval quality and grounding accuracy
- Monitor and improve RAG systems in production based on real usage signals
Required Skills & Qualifications
- Hands-on experience building production RAG systems end-to-end
- Deep knowledge of embedding models and vector databases (Pinecone, Weaviate, pgvector, Qdrant)
- Experience with LLM orchestration frameworks (LangChain, LlamaIndex, DSPy)
- Strong Python skills and familiarity with modern LLM APIs (OpenAI, Anthropic, open-source models)
- Understanding of evaluation techniques such as RAGAS, retrieval precision, and hallucination detection
- Ability to debug retrieval quality issues through systematic experimentation