Lead Support Analyst, Shared Services & Production Management, IT

CITIC CLSA - Hong Kong - Full time

Salary: Competitive

Position Description

We are seeking an experienced Support Analyst who will be responsible and accountable for both build and shared services operations, including monitoring, site reliability engineering (SRE), and ensuring the stability and performance of critical systems.

The ideal candidate is a strong technical problem‑solver who is self‑motivated, eager to learn cross‑functional technologies, and capable of delivering end‑to‑end monitoring solutions while diagnosing complex issues during critical incidents. This role requires someone who can maintain a positive, composed, and solutions‑focused attitude in challenging situations and is passionate about continuous learning and improvement.

Key Areas of Responsibility

  1. Support monitoring and SRE operations, ensuring system reliability, availability, and performance.
  2. Build, enhance, and maintain monitoring solutions using ITRS Geneos, Prometheus, VictoriaMetrics, Elasticsearch, and Grafana.
  3. Develop, optimize, and maintain alerting rules, dashboards, and observability pipelines.
  4. Administer and troubleshoot Linux servers (RHEL 7/8/9), including upgrades, configuration, patching, and maintenance, and determine appropriate monitoring requirements for system changes.
  5. Analyze logs, investigate issues, and perform fault finding to identify performance exceptions.
  6. Collaborate with engineering, application, and infrastructure teams to improve system resilience, stability, security, efficiency, and scalability.
  7. Participate in on‑call rotations, including off‑hours and scheduled weekend support.
  8. Participate in Disaster Recovery (DR) and Business Continuity Planning (BCP) drills.
  9. Continuously research and adopt modern monitoring and SRE tools and practices.

Requirements

  1. Bachelor's degree in Computer Science or Engineering.
  2. 8-10 years' experience in IT / investment banking.
  3. Strong experience with monitoring and observability platforms, including ITRS Geneos, Prometheus, VictoriaMetrics, Elasticsearch, Grafana, and Kibana.
  4. Hands-on experience building and implementing Prometheus pipelines, including exporters, scraping configurations, relabelling, metric routing, and integrations with long‑term storage (e.g., VictoriaMetrics).
  5. Experience building and maintaining Logstash pipelines, including ingestion, parsing, filtering, enrichment, and routing of logs into Elasticsearch.
  6. Ability to design, build, and maintain Grafana and Kibana dashboards for metrics, logs, and performance analytics across distributed systems.
  7. Understanding of metrics, logging, alerting, dashboards, and observability pipelines.
  8. Strong Linux administration skills (RHEL 7/8/9), including troubleshooting, upgrades, configuration, patching, and performance optimization.
  9. Good understanding of SRE principles, high availability, scalability, incident management, and Disaster Recovery (DR) / Business Continuity Planning (BCP) activities.
  10. Experience with automation (e.g., Bash, Python, Ansible, CI/CD tools) is an advantage.
  11. Understanding of networking fundamentals, performance tuning, and troubleshooting distributed systems.
  12. Prior experience in Production Support, SRE, Monitoring Engineering, or Shared Services Operations with participation in on‑call rotations, including after-hours and weekend support.
  13. Self-motivated and adaptable, with the ability to prioritize, learn continuously, and manage multiple responsibilities effectively.
  14. Fluent in English and Chinese.

24152524