ML Engineer - LLM & RAG
Location: Jakarta, Indonesia
Type: Full-time
Department: AI Engineering
Salary: $160K - $220K
Position Description:
We’re seeking an exceptional ML Engineer specializing in Large Language Models and Retrieval-Augmented Generation to join our AI Engineering team. You’ll architect and productionize RAG systems, fine-tune foundation models, and build intelligent AI features that directly impact client products. This is a technical, hands-on role where you’ll work with cutting-edge models (GPT-5, Claude 3, Llama 2) and deploy systems that process millions of queries daily.
Responsibilities:
- Design and implement production RAG systems using vector databases (Pinecone, Weaviate) and LLM APIs
- Fine-tune open-source models (Llama 2, Mistral) using LoRA/QLoRA for domain-specific applications
- Optimize LLM performance through prompt engineering, context management, and evaluation frameworks
- Build evaluation pipelines to measure retrieval quality, answer accuracy, and model hallucination rates
- Collaborate with backend engineers to integrate AI features into FastAPI/Node.js services
- Research and implement state-of-the-art techniques in document chunking, embedding strategies, and retrieval optimization
Qualifications:
- MS or PhD in Computer Science, ML, AI, or a related field (or equivalent experience)
- 3+ years of hands-on experience deploying ML models to production
- Strong Python skills; experience with PyTorch, TensorFlow, or JAX
- Deep understanding of transformer architectures and LLM limitations
- Production experience with vector databases and embedding models
- Familiarity with LangChain, LlamaIndex, or similar RAG frameworks
- Experience with MLOps tools: MLflow, Weights & Biases, DVC, or Kubeflow
- Published research or open-source contributions in NLP/LLMs are a plus