HSBC Digital Business Services provides essential operational and technical support to global businesses, focusing on customer service and efficiency improvements. The role involves implementing SRE principles, managing reliability and performance, capacity management, vulnerability management, and leading disaster recovery efforts.
Job Responsibilities:
Implement SRE principles to ensure the reliability, availability, and performance of our applications
Develop and maintain monitoring and alerting systems to proactively identify and address potential issues
Manage tollgate processes to ensure that changes to the application are thoroughly tested and validated before deployment
Oversee capacity management to ensure that our applications can handle current and future loads
Lead vulnerability management efforts to identify and address security vulnerabilities in our applications
Requirements:
Strong understanding of SRE principles and practices, including service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs)
Proficiency in resilience engineering concepts, including chaos engineering and fault tolerance
Excellent communication skills (written and verbal) in both Mandarin and English
Familiarity with cloud platforms and services, particularly Alicloud
Knowledge of monitoring and observability tools, such as Prometheus, Grafana, or ELK Stack
Bachelor’s degree in Computer Science, Information Technology, or a related field
Minimum of 5 years’ experience in a site reliability engineering or resilience engineering role
Experience in the financial industry is strongly preferred
Experience with incident management and root cause analysis processes
Familiarity with load testing and performance tuning
Experience with automation tools and scripting languages (e.g., Python, Bash)
Experience implementing and managing disaster recovery and business continuity plans
Experience in architecting resilient systems in areas such as digitization, core banking, big data, or regulatory reporting is preferred.
What we offer:
Continuous professional development
Flexible working
Opportunities to grow within an inclusive and diverse environment.