Senior Site Reliability Engineer (SRE)

at Qube

Hedge Funds

Mid Level | No visa sponsorship | AWS/GCP/Azure DevOps

Posted 20 days ago

Compensation: Not specified
Currency: Not specified
City: Not specified
Country: Not specified

Join the Platform team to improve reliability, observability, and operability for a growing engineering platform at Qube Research & Technologies. You will own the observability platform, build low-noise dashboards and alerts, improve incident detection and response, and define SLIs/SLOs to drive operational decisions. The role involves hands-on engineering to improve scalability and automation, applying Infrastructure as Code and developing tooling (Go preferred, Python acceptable). You will partner with service teams to deliver measurable reliability improvements while keeping long-term service ownership with those teams.

Qube Research & Technologies (QRT) is a global quantitative and systematic investment manager, operating in all liquid asset classes across the world. We are a technology- and data-driven group implementing a scientific approach to investing. Combining data, research, technology, and trading expertise has shaped our collaborative mindset, which enables us to solve the most complex challenges. QRT’s culture of innovation continuously drives our ambition to deliver high-quality returns for our investors.

You will join the Platform team focused on improving reliability and day-to-day operability for an actively used and growing engineering platform. The team works closely with software engineers and platform owners to improve observability, incident response, and reliability outcomes, while keeping long-term service ownership with the teams that build and run the services.

Your Future Role within QRT

You will:

  • Own the effectiveness of the observability platform, ensuring high-quality signals, alert fidelity, and ongoing suitability as the platform scales
  • Build and maintain actionable, low-noise dashboards and alerting across metrics and logs
  • Improve incident detection, response, and follow-up, ensuring corrective actions are implemented in systems, configuration, or automation
  • Define and apply SLIs and SLOs where they support operational decision-making
  • Improve reliability, scalability, and operability of core services through hands-on engineering changes
  • Identify recurring failure patterns and reduce manual operational work through automation and improved defaults
  • Apply Infrastructure as Code across observability and supporting systems
  • Develop tooling and automation in Go (preferred) or Python
  • Introduce shared patterns, defaults, and documentation that reduce repeated bespoke work
  • Partner with service-owning teams to deliver measurable reliability improvements without transferring long-term service ownership to SRE

Your Present Skillset

  • Strong practical experience applying Site Reliability Engineering principles in production environments
  • Strong Linux systems knowledge
  • Experience building and operating containerised workloads (Docker or Podman)
  • Strong development experience in Go (preferred) or Python
  • Strong experience querying and reasoning about metrics using PromQL
  • Hands-on experience with Grafana, including dashboarding and alerting
  • Experience deploying and operating centralised logging systems
  • Strong Infrastructure as Code experience
  • OpenTelemetry experience (metrics, logs, traces)
  • Terraform and/or Ansible experience, plus familiarity with CI/CD pipelines
  • Kubernetes and cloud-native platform experience
  • Exposure to datacentre services and compute/hardware-backed platforms
  • AWS infrastructure configuration and deployment experience
  • Evidence of reducing operational load and recurring incidents in growing systems

QRT is an equal opportunity employer. We welcome diversity as essential to our success. QRT empowers employees to work openly and respectfully to achieve collective success. In addition to professional achievement, we offer initiatives and programs to enable employees to achieve a healthy work-life balance.
