AWS Java Microservices Lead Software Engineer

at J.P. Morgan

Mid Level | No visa sponsorship | Java

Posted a day ago

Compensation: Not specified
City: Not specified
Country: United States

Lead Software Engineer at JPMorganChase focusing on AWS-based Java microservices, delivering reliable, scalable services with strong SRE and observability practices. You will drive automation, testing, release management, incident response, and performance engineering across agile sprints. The role emphasizes cloud-native design, CI/CD, and close collaboration with product teams to mature testing and readiness for production.

Location: Wilmington, DE, United States

We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.

As a Lead Software Engineer at JPMorganChase within the [insert LOB or sub LOB], you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.

Job responsibilities

Engage with the development team throughout agile sprints to develop software for reliability and scale, minimizing later refactoring or changes.
Identify application patterns and analytics in support of better service level objectives. Design automated software and product upgrades, change management, and release management solutions.
Deep experience operating services in public cloud; strong grasp of SRE principles, including SLIs/SLOs, error budgets, incident management, observability, and resilience patterns. Hands-on with observability and incident tooling; proficiency with CI/CD and deployment strategies.
Perform year-over-year analysis of production issues (e.g., P1–P3) to identify top failure modes, recurrence patterns, and control gaps
Drive prioritized remediation programs across change/configuration, capacity/performance, dependency resilience, and code quality.
Troubleshoot priority and escalation incidents, facilitate blameless post-mortems, and ensure permanent closure of incidents and subsequent problem tasks.
Establish comprehensive automated functional testing with dependable regression suites integrated into CI/CD to gate releases; improve reliability and speed through robust test data, and include non-functional checks (performance, resilience, accessibility) in pre-prod and readiness reviews.
Implement demand forecasting, load testing, and performance engineering in pre-prod; validate scale assumptions before peak events.
Run game days and chaos experiments to validate failover, degraded-mode operation, and dependency timeouts.
Embed shift‑left quality and partner with Product to mature testing practices: co‑define clear acceptance criteria and Definition of Ready/Done, align coverage to critical user journeys, and track quality KPIs (defect escape rate, automated coverage on key paths, change failure rate) tied to service objectives and release readiness.
Cloud platform and automation

Required qualifications, capabilities, and skills

Formal training or certification on software engineering concepts and 5+ years applied experience
Experience in AWS-based API development using Java, with proficiency in RESTful API development and related tools such as Postman and Swagger/OpenAPI.
Proficient in using AWS services (e.g., Lambda, API Gateway, S3, EC2, IAM, EventBridge) to design and deploy API-driven solutions, with a focus on cost-efficiency, scalability, and performance.
Skilled in implementing Infrastructure as Code (IaC) using tools like Terraform.
Proficiency and experience in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
Experience monitoring AWS workloads with tools like Datadog and CloudWatch.
An advanced understanding of site reliability culture and principles, with a track record of implementing site reliability within an application or platform and applying key SRE concepts such as SLOs and error budgets.
Advanced knowledge of and experience with observability capabilities across applications (metrics, tracing, SLOs), alerting, and telemetry collection, and the ability to design critical and golden-signal monitoring and dashboards.
Solid understanding of agile methodologies, including CI/CD, application resiliency, and security, with experience in developing, debugging, and maintaining code in a large corporate environment using modern programming and database querying languages.

Preferred qualifications, capabilities, and skills

Experience instituting production readiness standards and error-budget policies across multiple product teams.
Background in performance engineering and capacity planning for high-traffic, customer-facing systems
Thorough understanding of automated functional testing and regression testing, and their integration into TrueCD
Cloud / SRE certifications
