Senior Recovery Lead and Head of Service Reliability

at HSBC

Investment Banking

Tech Lead | No visa sponsorship | AWS/GCP/Azure DevOps

Posted 17 days ago

Compensation: Not specified
Currency: Not specified
City: Sheffield
Country: United Kingdom

Senior technology leader responsible for leading a global, follow-the-sun technical escalation team to accelerate incident diagnosis and reduce time to recover (TTR). The role combines hands-on incident recovery leadership with ownership of the strategic roadmap for resilience, automation, and self-healing capabilities. It requires partnering across Incident Management, SRE, Platform, Architecture, Risk, and Service Management to drive long-term remediation and preparedness for complex failures.

Join a digital-first bank that’s powered by people.
Our technology team builds innovative digital solutions rapidly and at scale to deliver the next generation of banking services for our customers around the world.
We are seeking a senior technology leader to take on the dual role of Senior Recovery Lead and Global Head of Service Reliability. This is a highly visible, high-impact position reporting to the Global Head of Service Management, with a mandate to transform how we recover from incidents and build long-term service resilience.
This individual will lead a global team of technical experts who act as technical escalation partners during major incidents, helping reduce time to recover (TTR) through deep technical engagement, coordination, and engineering-driven solutions. Beyond recovery, this leader will also own the strategic and tactical roadmap for building reliable, self-healing systems through collaboration with Problem Management, SRE, and Platform teams.
Job Requirements:
  • Lead a global, follow-the-sun team that acts as technical escalation partners during major incidents.
  • Partner with Incident Managers and Service Owners to accelerate incident diagnosis and resolution, reducing TTR and restoring services quickly and safely.
  • Bring calm, coordination, and engineering clarity to high-pressure recovery efforts.
  • Own and drive long-term remediation plans, including automation, reliability engineering, and platform guardrails to reduce future risk.
  • Track and govern follow-up actions to ensure completeness, accountability, and measurable reduction in incident recurrence.
  • Define and implement strategies for resilience engineering, including self-healing capabilities, automation of recovery workflows, and risk mitigation patterns.
  • Partner with Architecture and Engineering leaders to influence system design with reliability in mind.
  • Own the global incident scenario planning framework, ensuring that Technology is prepared to recover from widespread, complex failures.
  • Build, scale, and lead a high-performing global team with deep technical skills and a culture of urgency, ownership, and collaboration.
  • Act as a trusted partner and thought leader across Engineering, Infrastructure, Risk, and Service Management functions.
Qualifications and Skills:
  • Proven experience in Site Reliability Engineering, Infrastructure, DevOps, or Technical Operations.
  • Demonstrated experience leading global technical teams in complex, high-scale environments.
  • Deep expertise in incident recovery, automation, systems design, and platform reliability.
  • Strong working knowledge of problem management, root cause analysis frameworks, and resilience engineering principles.
  • Experience designing and running resilience exercises, chaos engineering, or incident scenario testing at scale.
  • Comfortable operating in regulated environments and partnering with Risk and Compliance functions.
  • Excellent stakeholder management and communication skills, with the ability to lead through influence at senior levels.
This role is based in Sheffield.

Being open to different points of view is important for our business and the communities we serve. At HSBC, we’re dedicated to creating diverse and inclusive workplaces for everyone, regardless of gender, ethnicity, disability, religion, sexual orientation, or age. We are committed to removing barriers and ensuring careers at HSBC are inclusive and accessible for everyone to be at their best.

If you have a need that requires accommodations or changes during the recruitment process, please get in touch with our Recruitment Helpdesk:

Email: hsbc.recruitment@hsbc.com

Telephone: +44 207 832 8500
