Tech Job Finder - Find Software, Technology Sales and Product Manager Jobs.

Data Reliability Engineer

at Vanguard

Mid Level | No visa sponsorship | Data Engineering

Posted 18 hours ago

Compensation: Not specified (USD)
City: Not specified
Country: United States

Join Vanguard's Personal Investor Data & Analytics - Data Reliability team to ensure data accuracy, availability, performance, and resilience from data entry into the lake to user consumption. You will help define observability best practices, establish SLIs/SLOs, track toil, and conduct blameless post-mortems, while collaborating with Data Engineers, Analysts, and product teams to resolve issues, optimize systems, and promote automation. The role requires proactive data pipeline analysis, reliability leadership, and designing strategies to localize failures, with experience in AWS, Python, SQL, and observability tools.

This position is on the Personal Investor Data & Analytics - Data Reliability team, which is responsible for data reliability (accurate, available, performant, and resilient data) from the point of entry into the lake until consumption by the user (PI business units). The role includes helping define best practices for observability, establishing and maintaining service level indicators (SLIs) and service level objectives (SLOs), tracking and addressing toil, conducting blameless root-cause post-mortems, and incorporating preventative and proactive Reliability practices, among other items. This individual will partner with Data Engineers, Data Analysts, and source Product Team Engineers to identify root causes, resolve issues, optimize existing systems, enhance infrastructure, and promote automation to reduce effort and increase reliability.

Responsibilities:

  • Proactively analyzes data pipeline & platform logs and metrics to identify trends and potential issues. Participates in special projects and performs other duties as assigned.

  • Gains insights into PI Data & Analytics operations, demonstrates and champions Reliability culture and practices, builds relationships, and promotes Reliability as a way of thinking.

  • Exhibits proficiency in data reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other Reliability best practices.

  • Communicates progress, issues, trends, and solutions to management and partner organizations. Maintains proactive knowledge and understanding of pending elevations, enhancements, and infrastructure changes.

  • Proactively identifies potential failure points and designs strategies to ensure that failures remain localized, preventing widespread disruption and contagion.

  • Collaborates with internal teams to evaluate the health, stability, and reliability of systems/platforms.

  • Collaborates with product teams in triage and troubleshooting during client-impacting incidents.

  • Participates in and/or facilitates post-incident reviews for any client-impacting events local to the Personal Investor Data & Analytics products.

  • Maintains a centralized incident response playbook, in collaboration with DRE Champions on each product team.

  • Collaborates with DRE Champions and/or product team points of contact to ensure adherence to the common operating model and standard development playbooks.

Qualifications:

  • Minimum of five years of related experience, with at least two years of development experience.

  • Undergraduate degree or equivalent combination of training and experience.

  • 1-3 years of Reliability Engineering experience.

  • 2 years of DevOps experience.

  • Strong analytic and problem-solving skills.

  • Self-motivated individual with the ability to prioritize and manage changing priorities.

  • Experience with and understanding of AWS data engineering products, Python, and SQL.

  • Proficiency and experience in observability and telemetry tools such as Splunk, CloudWatch, Grafana, and Datadog.

Special Factors

Sponsorship

Vanguard is not offering visa sponsorship for this position.

About Vanguard

At Vanguard, we don't just have a mission—we're on a mission.

To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.

How We Work

Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.
