ML/AI Engineer (LLM Production)

We are looking for a forward-thinking ML / AI Engineer to join our team and contribute to a confidential LLM project in the financial services (FS) domain. You will work on advanced AI implementation, covering the full model development lifecycle from data preparation and fine-tuning through to production deployment. This role is hands-on and delivery-focused. You will collaborate closely with a cross-functional development team and directly shape the AI capabilities of a live, mission-critical system.

Gravitas Recruitment Group - Hong Kong - Full time

Salary: HKD 30,000 - HKD 45,000 per month

Key Responsibilities

Core Technical Areas

LLM Fine-Tuning & RAG: Fine-tune open-weight language models on domain-specific data using supervised fine-tuning (SFT) and quantization techniques. Build and optimize Retrieval-Augmented Generation (RAG) pipelines integrating vector databases and policy document ingestion.
Inference Optimization: Serve models efficiently using production inference engines (e.g., vLLM, SGLang). Apply quantization and batching strategies to meet strict latency and throughput SLAs.
MLOps & Deployment: Manage model deployment pipelines across DEV, UAT, PRE-PRD, and PRD environments on enterprise cloud infrastructure (IBM watsonx.ai / OpenShift).

AI-Assisted Development

Use modern LLMs to accelerate development, including Python code generation, prompt engineering, and pipeline optimization.
Engage in prompt engineering to refine how systems interact with complex, multilingual datasets.

Research & Prototyping

Evaluate emerging open-source models, inference frameworks, and AI libraries for production feasibility.
Produce written validation reports and contribute to technical design and test documentation.

Talent Cultivation & Mentorship (What You Will Learn)

Broad Exposure: You will understand how LLM fine-tuning, RAG, inference optimization, and deployment pipelines interact in a real-world production system.
Technical Guidance: Work directly with senior engineers to learn how to move AI models from notebook experiments to production-ready, enterprise-grade code.
Impactful Work: Your contributions will directly power a live AI system handling real financial data at scale.

Requirements

Technical Requirements:

Degree in Computer Science, Data Science, AI, or a related field.
Hands-on experience with LLM fine-tuning, RAG pipelines, or model serving.
Strong proficiency in Python.
Solid understanding of machine learning fundamentals and deep learning frameworks (PyTorch, TensorFlow).
Familiarity with relevant libraries such as Hugging Face Transformers (with models such as Qwen or DeepSeek), LangChain, LlamaIndex, or vLLM.
Ability to read and write technical documentation in English.
Proficiency in using cutting-edge AI tools (e.g. Claude Code, GPT-codex) to accelerate development cycles and conduct rapid feasibility studies (PoCs).

Nice-to-Haves:

Experience with fine-tuning techniques and tools such as LoRA, QLoRA, Unsloth, DPO, or RLHF.
Familiarity with quantization (INT4, INT8) and production inference optimization.
Experience with vector databases (e.g., Milvus, pgvector).
Exposure to IBM watsonx.ai, OpenShift, or Kubernetes.
Experience with multilingual NLP, particularly CJK (Chinese, Japanese, Korean) datasets.
Prior hands-on experience with AI projects in a financial services or regulated industry context.

What We Offer

Flexible working hours and work-from-home policy.
Subsidized access to premium AI development tools to empower your workflow.
On-the-job training and technical guidance.
Opportunity to work on a high-impact, production LLM system in the financial services sector.
Exposure to a cutting-edge open-weight model stack and enterprise-grade deployment practices.
