Responsibilities
- Operate, maintain, and continuously enhance the enterprise AI Platform and Data Lakehouse, focusing on automation, scalability, and operational stability.
- Build and refine end-to-end DataOps and MLOps pipelines across the full ML lifecycle.
- Engineer core platform capabilities including containerisation, orchestration, deployment, logging, and observability.
- Create reusable platform services and seamlessly integrate internal and external APIs / MCPs.
- Enable large-scale distributed data and ML workloads on Docker, Kubernetes, Spark, Airflow, and Databricks.
- Establish robust monitoring and alerting for platform health, data pipelines, and ML services.
- Drive innovation through technology evaluation, best-practice adoption, and platform optimisation.
- Lead and support PoCs, platform validation, and acceptance testing.
- Partner closely with Architects, Platform Engineers, and Product Owners in a CI/CD environment.
- Collaborate with AI Engineers, Data Engineers, and business teams to deliver production-ready AI solutions.
Requirements
- Bachelor's degree or above in Computer Science, Software Engineering, or a related field.
- 4+ years of hands-on experience in platform, infrastructure, or large-scale systems engineering.
- Proven track record of delivering enterprise AI or data platforms.
- Strong practical experience in DataOps and MLOps, including model lifecycle automation, deployment, monitoring, and governance.
- Proficiency in Python, Bash, and SQL.
- Hands-on experience with Docker, Kubernetes, Helm, and Terraform.
- Experience with CI/CD and automation tools such as Jenkins, Git, and Ansible.
- Practical exposure to Airflow, Spark, Databricks, Jupyter, and MLflow.
- Experience building monitoring and observability solutions using Grafana, Prometheus, and Loki.
If you're interested in this role, please forward your latest resume to cheryl.NG@hays.com.hk or contact Cheryl Ng at +852 2101 0081.