Key Responsibilities
- Build and optimize the LLM abstraction layer using LiteLLM or AWS Bedrock, enabling models to be swapped without application code changes
- Design and implement high-performance RAG pipelines including ingestion, chunking, embedding, indexing, and retrieval
- Develop reusable prompt libraries and structured output parsing with schema validation for LLM responses
- Optimize context window usage and token budgeting for efficient LLM interactions
- Implement hybrid retrieval strategies (dense + sparse/BM25) and reranking for improved accuracy
- Run prompt and retrieval evaluation experiments using frameworks like Ragas, DeepEval, or Langfuse
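The chunking step in the ingestion pipeline above can be sketched as a minimal fixed-size baseline with overlap (the function name, chunk size, and overlap defaults here are illustrative, not part of the role description):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    A common baseline for RAG ingestion: overlap preserves context
    across chunk boundaries so retrieval doesn't miss sentences
    that straddle a split point.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Step forward by (chunk_size - overlap) so consecutive chunks share
    # `overlap` characters; drop any empty trailing slice.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)
            if text[i:i + chunk_size]]

chunks = chunk_text("a" * 500)  # 4 chunks: 200, 200, 200, 50 chars
```

Production pipelines typically chunk on token or sentence boundaries rather than raw characters, but the sliding-window structure is the same.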
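The hybrid retrieval bullet above (dense + sparse/BM25) is often combined with a fusion step; one standard technique is reciprocal rank fusion (RRF). A minimal sketch, assuming each retriever returns a best-first list of document IDs (the IDs and lists below are illustrative):

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked result lists into one via RRF.

    Each document scores 1 / (k + rank) summed across the lists it
    appears in; k=60 is the constant from the original RRF paper and
    dampens the dominance of any single retriever's top hit.
    """
    scores: dict[str, float] = defaultdict(float)
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Best fused score first.
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]    # e.g. vector-similarity results
sparse = ["doc_b", "doc_d", "doc_a"]   # e.g. BM25 results
fused = reciprocal_rank_fusion([dense, sparse])
```

RRF needs no score normalization across retrievers, which is why it pairs well with mixing dense cosine scores and BM25 scores; a learned reranker can then reorder the fused top-k.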
Requirements
- Strong knowledge of modern LLMs including Claude, GPT-4, Llama, and Mistral
- Hands-on experience with LiteLLM, AWS Bedrock, or Azure OpenAI
- Production experience implementing RAG systems using LlamaIndex or LangChain
- Experience with vector databases such as OpenSearch, Qdrant, Weaviate, or Milvus
- Strong Python programming skills and familiarity with embedding models