
Cloud Infrastructure Engineer
at McKinsey & Company
Posted 6 days ago
- Compensation: Not specified
- City: Not specified
- Country: Costa Rica
- Currency: Not specified
Drive end-to-end migrations, own problem management and RCA discussions, and lead cost optimization and infrastructure redesign to ensure product reliability and scalability. Partner with cross-functional teams to identify automation opportunities, define SLIs/SLOs/SLAs, and troubleshoot incidents while assisting users. Provide DevOps/SRE support, including on-call coverage, mentorship, and documentation to strengthen observability, logging, and CI/CD foundations across cloud-native and containerized platforms. The role is located in San Jose, Costa Rica, within the Product Tooling Services Group in Secure Foundations, and supports strategic platform initiatives.
Your Impact
You will drive migrations end-to-end and collaborate with stakeholders. You will own Problem Management and drive root cause analysis (RCA) discussions, with accountability for cost optimization efforts, infrastructure redesign where needed, and product reliability and scalability. You will partner with other team members to identify cost optimization and automation opportunities, help define and maintain service level indicators, objectives, and agreements (SLIs, SLOs, and SLAs) along with error budgets, and troubleshoot incidents while helping users with issues and requests. You will define standards, guidelines, and best practices, bring in new tools, conduct proofs of concept (POCs) with development teams, and define product readiness for SREs. You will create documentation covering configuration, operations, and troubleshooting procedures, identify new patterns, and participate in architectural discussions to improve product scalability and stability. You will also provide DevOps/SRE support for planned and unplanned work, including team on-call support, and provide technical guidance and mentorship to the team.
Your work will help maintain and improve the Developer, Observability, and Logging Platforms by strengthening observability, logging, monitoring, and alerting capabilities, as well as secrets management and CI/CD foundations. By leading root cause analysis for support escalation issues and supporting cloud-native and containerized platforms, your role directly contributes to reliable, well-monitored, and well-operated systems used by product and support teams.
You will be located in San Jose, Costa Rica, as part of our Product Tooling Services Group within Secure Foundations.
Your Growth
- Continuous learning: Our learning and apprenticeship culture, backed by structured programs, is all about helping you grow while creating an environment where feedback is clear, actionable, and focused on your development. The real magic happens when you take the input from others to heart and embrace the fast-paced learning experience, owning your journey.
- A voice that matters: From day one, we value your ideas and contributions. You’ll make a tangible impact by offering innovative ideas and practical solutions, all while upholding our unwavering commitment to ethics and integrity. We not only encourage diverse perspectives; we consider them critical in driving us toward the best possible outcomes.
- Global community: With colleagues across 65+ countries and over 100 different nationalities, our firm’s diversity fuels creativity and helps us come up with the best solutions. Plus, you’ll have the opportunity to learn from exceptional colleagues with diverse backgrounds and experiences.
- Exceptional benefits: On top of a competitive salary (based on your location, experience, and skills), we provide a comprehensive benefits package to enable holistic well-being for you and your family.
Your qualifications and skills
- 3+ years of industry experience with Software Engineering best practices
- Programming languages: Python, Golang, JavaScript and others (preferred)
- IaC languages: Terraform
- CI/CD tooling: CircleCI, GoCD, Jenkins, GitHub, JFrog
- Secrets management: AWS Secrets Manager, Vault
- Test automation frameworks: Test Kitchen, awspec, InSpec and others
- Public cloud platforms: AWS, Azure, GCP
- Containerization: Kubernetes, Docker, Helm
- Web front-end and database products: Nginx, Postgres, MongoDB, Redis, AWS RDS
- Observability and data analytics tools: Dynatrace, Cribl and others
- Networking experience: load balancing, network security, standard network protocols (HTTP/HTTPS, DNS, etc.)
- Operating system experience: Linux, Windows
- Experience in Project Management
- Ability to lead the building and implementation of services and tools that make product and support teams better at their jobs
- Expertise in leading root cause analysis to help fix support escalation issues
- Familiarity with agile concepts and working experience in an agile team culture
FOR U.S. APPLICANTS: McKinsey & Company is an Equal Opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by applicable law.
FOR NON-U.S. APPLICANTS: McKinsey & Company is an Equal Opportunity employer. For additional details regarding our global EEO policy and diversity initiatives, please visit our McKinsey Careers and Diversity & Inclusion sites.
Job Skill Code - PRSE - Product Reliability Engineer II
Function - Technology
Industry - High Tech
Posted to LinkedIn Date - Wed Feb 04 00:00:00 GMT 2026
LinkedIn Posting City - San Jose
LinkedIn Posting Country - Costa Rica
LinkedIn Job Title - Cloud Infrastructure Engineer
LinkedIn Function - Information Technology
LinkedIn Industry - Management Consulting
LinkedIn Seniority Level - Not Applicable

