
Platform Engineer (Cloud Infrastructure, AI Platform)
at Allianz
Posted 6 hours ago
- Compensation: Not specified
- City: Not specified
- Country: Not specified
- Currency: Not specified
As a Platform Engineer within Advanced Analytics (DA3) at Allianz Partners, you will build and operate the cloud infrastructure powering AI-enabled solutions at global scale. You will implement, automate, and maintain platform foundations (Kubernetes, CI/CD, observability, security) that enable teams to deploy AI services reliably. You will collaborate with Backend Engineers, ML Engineers, AI Architects, and Platform Architects, translating platform architecture into production infrastructure and reducing toil through automation. The focus is on reliability, security, and scalable operations across clusters, pipelines, and cloud services.
Description
Key responsibilities
As a Platform Engineer within Advanced Analytics (DA3) in the Chief Data & AI Office area at Allianz Partners, you will join our central AI team to build and operate the cloud infrastructure that powers AI-enabled solutions at global scale.
We are looking for an engineer with deep Kubernetes and cloud expertise to implement, automate, and maintain the platform foundations that enable teams to deploy and operate AI services reliably.
You will work in a cross-functional environment with Backend Engineers, ML Engineers, AI Architects, and Platform Architects, taking hands-on ownership of the infrastructure layer, from Kubernetes clusters and CI/CD pipelines to observability systems and security controls.
In this role, you will translate platform architecture into working infrastructure, reduce operational toil through automation, and ensure production systems meet reliability and security standards.
In this role, your main responsibilities will be:
- Implement and operate Kubernetes infrastructure (AKS): cluster lifecycle, networking, resource management, auto-scaling, and multi-tenancy patterns.
- Build and maintain CI/CD pipelines using GitHub Actions and ArgoCD for automated testing, container builds, and GitOps deployments.
- Develop Infrastructure as Code (Terraform, Bicep) to provision and manage Azure resources with consistency and auditability.
- Operate container registries (ACR), artifact management, and image security scanning workflows.
- Implement and maintain observability infrastructure (Azure Monitor, Application Insights, Prometheus, Grafana), including dashboards, alerting, and distributed tracing.
- Manage async processing infrastructure: Celery workers, Redis queues, and workflow orchestration patterns supporting AI agent execution.
- Implement platform security controls: network policies, pod security standards, Key Vault integration, RBAC, and private endpoint configurations.
- Support database infrastructure: PostgreSQL management, backup/recovery, connection pooling, and performance tuning.
- Create self-service tooling and templates that enable development teams to deploy and operate services with minimal friction.
- Diagnose and resolve infrastructure issues across clusters, pipelines, and cloud services; perform root-cause analysis and implement preventative improvements.
- Collaborate with Platform Architects, Backend Engineers, and ML Engineers to translate architecture designs into reliable infrastructure.
What you bring
- 5+ years of professional experience in platform engineering, SRE, or DevOps roles; experience supporting AI/ML workloads is a strong plus.
- Strong Kubernetes experience: cluster operations, networking (Ingress, network policies), storage, autoscaling, and troubleshooting.
- Solid Infrastructure as Code experience with Terraform, Bicep, or equivalent tools.
- Production experience with Azure cloud services: AKS, ACR, Key Vault, Azure Monitor, Virtual Networks, Private Endpoints, and Azure Policy.
- Strong CI/CD experience: GitHub Actions (self-hosted runners, reusable workflows), ArgoCD, or similar GitOps tooling.
- Proficiency in Python for automation, scripting, and tooling.
- Experience with container security: image scanning, runtime security, network policies, and least-privilege patterns.
- Experience with an observability stack: Prometheus, Grafana, centralized logging, and alerting configuration.
- Familiarity with async task processing: Celery, Redis, or equivalent message queue patterns.
- Strong Linux systems administration and networking fundamentals.
- Operational mindset with strong troubleshooting skills across infrastructure layers.
- Comfortable in agile, iterative delivery environments with ownership and accountability.
- Clear communicator and collaborator across global, cross-functional stakeholders.
- Strong focus on reliability and automation: you measure success by system uptime and reduced manual toil.
- Proactive learner with pragmatic adoption of AI-assisted developer tools (GitHub Copilot, Claude Code) to improve automation and delivery.
- Experience supporting AI/ML infrastructure: GPU scheduling, model serving platforms, or ML pipeline orchestration.
- Service mesh experience (Istio, Linkerd) for traffic management and security.
- Experience with Databricks or similar data platform infrastructure.
- Familiarity with workflow orchestration (Temporal, Airflow) for complex AI pipelines.
- Experience with cost optimization: FinOps practices, resource right-sizing, and reserved capacity planning.
- Experience in regulated environments where auditability and secure-by-default infrastructure are essential.
- Certifications: CKA/CKAD, Azure Administrator, or Terraform Associate.
What we offer
Our employees play an integral part in our success as a business. We appreciate that each of our employees is unique, with their own needs and ambitions, and we enjoy being a part of their journey. We empower and encourage your personal and professional development, and help you take control of it by offering a large variety of courses and targeted development programs.
All that in a global environment where international mobility and career progression are encouraged. Caring for your health and wellbeing is a key priority for us. This is why we build Work Well programs that provide you with peace of mind and give you flexibility in planning and arranging a better work-life balance.
90206 | Data & AI | Professional | Allianz Partners | Full-Time | Permanent
Allianz Group is one of the most trusted insurance and asset management companies in the world. Caring for our employees, their ambitions, dreams and challenges, is what makes us a unique employer. Together we can build an environment where everyone feels empowered and has the confidence to explore, to grow and to shape a better future for our customers and the world around us.
We at Allianz believe in a diverse and inclusive workforce and are proud to be an equal opportunity employer. We encourage you to bring your whole self to work, no matter where you are from, what you look like, who you love or what you believe in. We therefore welcome applications regardless of ethnicity or cultural background, age, gender, nationality, religion, disability or sexual orientation. Great to have you on board. Let's care for tomorrow.
Note: Diversity of minds is an integral part of Allianz's company culture. One means to achieve diverse teams is a regular rotation of Allianz Executive employees across functions, Allianz entities and geographies. Therefore, the company encourages its employees to be motivated to gain varied skills in different positions and to collect experience from across Allianz Group.

