Site Reliability Engineer

at Barclays

Mid Level | No visa sponsorship | Python

Posted 6 hours ago

Compensation: Not specified
City: Praha
Country: Czech Republic

Join Barclays as a Site Reliability Engineer and play a key role in building a new, high-impact SRE capability within Markets Post-Trade. As part of a cross-cutting team, you will expand application stability and reliability measurement by automating reliability tooling, closing telemetry gaps, and addressing reliability findings across multiple mission-critical systems. You will help extend and scale an SRE solution across Markets Post-Trade, driving full-stack observability for cash settlements, securities settlement, and liquidity management flows. Through centralised dashboards and end-to-end transaction tracing, you will deliver greater transparency, faster issue resolution, and enable the adoption of AI-driven observability, anomaly detection, and advanced analytics. This role focuses on pre-emptive monitoring, optimisation, and non-functional architecture design to ensure resilient, high-performing systems in a fast-paced environment.

To be successful in this role, you will need the following:

Experience with observability and APM tools such as OpenTelemetry, Elastic, AppDynamics, or Prometheus.
Experience designing and implementing resilience patterns, including Retry, Timeout, Circuit Breaker, Bulkhead, Throttling, and Saga.
Proficiency with load-testing tools such as HP Performance Center, LoadRunner, k6, or JMeter.
Solid knowledge of networking and security fundamentals, including VPC design, IAM, encryption, and secrets management.
Operational experience with scripting and/or programming languages such as Java, Python, Ruby, or Bash.

Some other highly valued skills may include:

Experience in the financial services industry.
Experience with infrastructure-as-code tools such as Chef and Ansible.
Working knowledge of CI/CD tools including GitLab, Jenkins, Nolio, and TeamCity.
Experience operating in Red Hat, Windows, and Kubernetes environments.
Familiarity with alerting and monitoring tools such as Geneos ITRS.

You may be assessed on the key critical skills relevant for success in the role, such as risk and controls, change and transformation, business acumen, strategic thinking, and digital and technology, as well as job-specific technical skills.

The successful candidate will be based in Prague.

Purpose of the role

To apply software engineering techniques, automation, and incident-response best practices to ensure the reliability, availability, and scalability of the organisation's systems, platforms, and technology.

Accountabilities

Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
Response to, analysis of, and resolution of system outages and disruptions, and implementation of measures to prevent similar incidents from recurring.
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning.
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close work with other teams to ensure smooth and efficient operations.
Staying informed of industry technology trends and innovations, and actively contributing to the organisation's technology communities to foster a culture of technical excellence and growth.

Assistant Vice President Expectations

To advise and influence decision making, contribute to policy development and take responsibility for operational effectiveness. Collaborate closely with other functions/business divisions.
Lead a team performing complex tasks, using well-developed professional knowledge and skills to deliver work that impacts the whole business function. Set objectives, coach employees in pursuit of those objectives, appraise performance against them, and determine reward outcomes.
If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment for colleagues to thrive and deliver to a consistently excellent standard. The four LEAD behaviours are: L – Listen and be authentic, E – Energise and inspire, A – Align across the enterprise, D – Develop others.
OR, for an individual contributor, they will lead collaborative assignments, guide team members through structured assignments, and identify the need to include other areas of specialisation to complete assignments. They will identify new directions for assignments and/or projects, combining cross-functional methodologies or practices to meet required outcomes.
Consult on complex issues, providing advice to People Leaders to support the resolution of escalated issues.
Identify ways to mitigate risk and develop new policies/procedures in support of the control and governance agenda.
Take ownership for managing risk and strengthening controls in relation to the work done.
Perform work that is closely related to that of other areas, which requires understanding of how areas coordinate and contribute to the achievement of the objectives of the organisation sub-function.
Collaborate with other areas of work and with business-aligned support areas to keep up to speed with business activity and the business strategy.
Engage in complex analysis of data from multiple internal and external sources, such as procedures and practices (in other areas, teams, companies, etc.), to solve problems creatively and effectively.
Communicate complex information. 'Complex' information could include sensitive information or information that is difficult to communicate because of its content or its audience.
Influence or convince stakeholders to achieve outcomes.

All colleagues will be expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence and Stewardship – our moral compass, helping us do what we believe is right. They will also be expected to demonstrate the Barclays Mindset – to Empower, Challenge and Drive – the operating manual for how we behave.
