Job Summary
The role focuses on designing, developing, and deploying LLM-powered Generative AI applications for diverse use cases such as chatbots, content generation, summarization, and code automation.
Key Responsibilities
- Design, develop, and deploy high-performance LLM-based Generative AI applications using Python and deep learning frameworks, and optimize them for use cases such as chatbots, content generation, summarization, and code generation.
- Write clean, modular, and scalable code that follows software development best practices.
- Fine-tune and deploy large-scale open-source and proprietary language models (e.g., OpenAI GPT, LLaMA, Mistral, Falcon, Claude).
- Implement prompt engineering, retrieval-augmented generation (RAG), and model optimization techniques to improve AI performance.
- Work with vector databases and similarity-search libraries (e.g., FAISS, ChromaDB, Weaviate) for efficient indexing and retrieval over large datasets.
- Collaborate with cross-functional teams, including data scientists, MLOps engineers, and product managers, to integrate AI models into production systems.
- Optimize model performance and reduce latency using quantization, distillation, and pruning techniques.
- Stay up to date with cutting-edge research in AI, NLP, and deep learning to continuously improve solutions.
Core Competencies
- Knowledge of cloud platforms (AWS, GCP, or Azure) and containerization (Docker, Kubernetes) for model deployment.
- Strong problem-solving skills and experience working in an agile development environment.
- 2-4 years of experience in AI/ML, deep learning, or software development.