Site Reliability Engineer [Multiple Positions Available]

at J.P. Morgan

Mid Level | No visa sponsorship | AWS/GCP/Azure DevOps

Posted 7 days ago

Compensation: Not specified
City: Columbus, OH
Country: United States

Columbus, OH-based Site Reliability Engineer responsible for monitoring system performance and reliability using Dynatrace, Prometheus, and Grafana, proactively resolving bottlenecks and outages. Create alerts that route to ticketing systems and build Grafana visualizations to improve monitoring; collaborate with development teams to streamline CI/CD pipelines using Jenkins. Integrate monitoring tools with Splunk application logs to enhance visibility into system health and user behavior, and automate metrics flow to the Satori time-series database using Python. Lead post-incident reviews to identify root causes and implement preventive measures; configure AWS accounts using Terraform and set up ECS clusters.

Location: Columbus, OH, United States

DESCRIPTION:

Duties: Monitor system performance and reliability using Dynatrace, Prometheus, and Grafana, proactively resolving bottlenecks and outages. Create alerts that route to the ticketing system, ensuring timely detection of and response to critical issues. Create visualizations in Grafana for detailed metrics, enhancing monitoring effectiveness. Collaborate with development teams to streamline CI/CD pipelines using Jenkins. Integrate monitoring tools with Splunk application logs, improving visibility into system health and user behavior. Automate metrics flow to the time-series database (Satori) using Python. Lead post-incident reviews to identify root causes and implement preventive measures. Configure AWS accounts using Terraform and set up ECS clusters.

QUALIFICATIONS:

Minimum education and experience required: Bachelor's degree in Software Engineering, Computer Science, Computer Engineering, or related field of study plus 5 years (60 months) of experience in the job offered or as Site Reliability Engineer, Technology Lead, IT Consultant, Software Engineer, or related occupation. The employer will alternatively accept a Master's degree in Software Engineering, Computer Science, Computer Engineering, or related field of study plus 3 years (36 months) of experience in the job offered or as Site Reliability Engineer, Technology Lead, IT Consultant, Software Engineer, or related occupation.

Skills Required: This position requires experience with the following: using scripting languages, including but not limited to PowerShell, Python, and Shell, to develop telemetry and observability and to maintain in-place tools; Splunk data management and scripting to deliver telemetry and observability as a critical part of a company's overall infrastructure; visualization tools, including Grafana and Dynatrace, for the telemetry, observability, and performance of a company's overall VDI estate; data management using AWS storage services such as S3 for the storage, maintenance, updating, and improvement of the underlying data, so that the data is available and optimized for telemetry use; and infrastructure troubleshooting, including enterprise infrastructure architecture.

Job Location: 1111 Polaris Parkway, Columbus, OH 43240.
