
Lead Site Reliability Engineer
at J.P. Morgan
Posted 18 hours ago
- Compensation: Not specified (USD)
- City: Jersey City
- Country: United States
Lead Site Reliability Engineer at JPMorgan Chase within Consumer and Community Bank. The role involves leading resiliency design reviews, breaking down complex problems for engineers, serving as a technical lead for medium to large products, and mentoring teammates. You will design and drive observability initiatives, implement logging/monitoring/tracing/alerting, and collaborate with cross-functional teams to embed SRE practices throughout the software development lifecycle.
Location: Jersey City, NJ, United States
Assume a critical role in shaping the future of a globally recognized firm, and make a direct and significant impact in a space designed for top achievers in site reliability.
Job responsibilities
- Lead the design, implementation, and maintenance of observability solutions across the organization.
- Develop and enforce best practices for observability, including logging, monitoring, tracing, and alerting.
- Write and maintain code in Java or a similar language, Python, and Angular or similar frameworks to build and enhance observability tools and platforms. Automate repetitive tasks to improve system reliability and developer productivity.
- Implement and drive adoption of SRE principles to improve system reliability, availability, and performance.
- Design and implement monitoring and alerting strategies to proactively identify and resolve issues.
- Ensure that observability tools provide actionable insights and are aligned with business objectives.
- Work closely with cross-functional teams to integrate observability practices into the software development lifecycle.
- Mentor and guide junior engineers, fostering a culture of learning and continuous improvement.
- Lead projects related to observability initiatives, ensuring timely delivery and alignment with strategic goals. Communicate effectively with stakeholders to provide updates and gather requirements.
- Function effectively in an agile environment, managing or contributing to the backlog and velocity, and reporting on project landings.
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts with 5+ years of applied experience.
- Strong understanding of SRE principles and practices.
- Advanced knowledge of observability tools and platforms (e.g., Dynatrace, Splunk, Grafana).
- Extensive experience in a similar SRE or observability role.
- Proven track record of implementing and managing observability solutions in complex environments.
- Excellent communication skills, both verbal and written, with the ability to convey complex technical concepts to non-technical stakeholders.
- Collaborative mindset, with the ability to work effectively with diverse teams and stakeholders.
- Strong analytical and problem-solving skills, with the ability to troubleshoot complex systems and drive root cause analysis.
- Ability to communicate data-based solutions with complex reporting and visualization methods.
- Drive to self-educate and evaluate new technology.
- Proficiency in languages and frameworks such as Java, Angular, and Python; Terraform is a nice to have.

