
Senior Linux Platform Engineer
at Susquehanna (Proprietary Trading)

Tech Lead · No visa sponsorship · Python

Posted 3 days ago

Compensation: Not specified
Location: Not specified

We are seeking a highly technical Senior Platform Engineer with deep expertise in Linux engineering, OpenStack development, Kubernetes, and GPU-enabled infrastructure to design, build, and operate SIG’s next-generation infrastructure platforms supporting trading and core technology environments. This is a hands-on engineering role focused on building and tuning scalable, resilient, high-performance infrastructure across CPU and GPU workloads; it requires strong Linux internals knowledge and cloud-native platform experience. You will develop automation, build and maintain infrastructure tooling, and implement highly available, multi-tenant OpenStack and Kubernetes platforms with disaster recovery, integrated with CI/CD pipelines and GitOps workflows.

JOB DESCRIPTION
Overview

We are seeking a highly technical Senior Platform Engineer with deep expertise in Linux Engineering, OpenStack development, Kubernetes, and GPU-enabled infrastructure to design, build, and operate SIG’s next-generation infrastructure platforms supporting trading and core technology environments.

This is a hands-on engineering role focused on building and tuning scalable, resilient, and high-performance infrastructure systems across CPU and GPU workloads. The ideal candidate will have strong Linux internals knowledge, experience developing and operating cloud-native platforms, and a deep understanding of distributed systems architecture, including the efficient provisioning, isolation, and performance tuning of accelerator-based compute resources.

What we're looking for

Linux Systems Engineering

  • Troubleshoot deep issues across the kernel, networking stack, storage, and performance layers.
  • Tune performance for low-latency systems (CPU pinning, NUMA, IRQ balancing, kernel tuning).
  • Develop automation using Python, Go, or similar languages.
  • Build and maintain infrastructure tooling and internal platform services.
  • Implement high-availability solutions and disaster recovery strategies.
  • Perform root cause analysis for production incidents affecting distributed systems.
  • Design, deploy, and operate GPU-enabled infrastructure.
  • Optimize GPU utilization (memory bandwidth, PCIe throughput, Multi-Process Service, MIG partitioning where applicable).
  • Tune workloads to efficiently leverage NVIDIA GPUs (or equivalent accelerators) for compute-intensive applications.
  • Troubleshoot GPU driver, CUDA, kernel module, and firmware issues in production environments.
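
Several of the bullets above (CPU pinning, NUMA awareness) come down to controlling which cores a process may run on. A minimal sketch of the syscall layer, using Python's Linux-only `os.sched_setaffinity`; real low-latency deployments would combine this with isolated CPUs, cgroup cpusets, and IRQ affinity:

```python
import os

def pin_to_cpus(cpus):
    """Pin the calling process to the given CPU set (Linux only).

    This only demonstrates the affinity syscall; production tuning
    layers isolcpus, cgroup cpusets, and IRQ steering on top of it.
    """
    os.sched_setaffinity(0, set(cpus))   # pid 0 = the current process
    return os.sched_getaffinity(0)       # effective mask as the kernel sees it

if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    after = pin_to_cpus({0})             # restrict to CPU 0
    print(f"affinity {before} -> {after}")
```

After the call, the scheduler will only run the process on the requested cores, which is the building block behind the "CPU pinning" item above.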

OpenStack Development & Cloud Infrastructure

  • Develop and extend OpenStack services (Nova, Neutron, Cinder, Keystone, etc.).
  • Build custom integrations and automation around OpenStack APIs.
  • Optimize compute, networking, and storage performance for high-performance workloads.
  • Design multi-tenant OpenStack architectures with strong isolation and security.
  • Contribute to infrastructure-as-code frameworks managing OpenStack environments.
  • Debug and resolve deep issues across hypervisors (KVM), networking layers, and control plane services.
  • Integrate OpenStack environments with Kubernetes platforms (hybrid cloud architectures).
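
Custom automation around OpenStack APIs (the second bullet above) usually has to handle paginated list endpoints such as Nova's server list, which use a marker/limit scheme. A hedged sketch of a generic paginator; `fake_fetch` is a stand-in for a real authenticated API call, not an actual OpenStack client:

```python
def paginate(fetch_page):
    """Yield every item from a marker-style paginated list API.

    fetch_page(marker) must return (items, next_marker), with
    next_marker None once the collection is exhausted -- the same
    shape as Nova/Neutron marker+limit pagination.
    """
    marker = None
    while True:
        items, marker = fetch_page(marker)
        yield from items
        if marker is None:
            break

# Stubbed "API": fake server names standing in for a Nova response.
_SERVERS = ["vm-a", "vm-b", "vm-c", "vm-d", "vm-e"]

def fake_fetch(marker, limit=2):
    start = 0 if marker is None else _SERVERS.index(marker) + 1
    page = _SERVERS[start:start + limit]
    next_marker = page[-1] if start + limit < len(_SERVERS) else None
    return page, next_marker

if __name__ == "__main__":
    print(list(paginate(fake_fetch)))  # walks all three pages in order
```

The same loop shape works against the real APIs once `fetch_page` is backed by an authenticated session.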

Kubernetes Platform Engineering

  • Design, build, and operate highly available, production-grade Kubernetes clusters.
  • Develop and maintain Kubernetes operators, controllers, and custom resource definitions (CRDs).
  • Implement advanced scheduling, multi-tenancy, and workload isolation strategies.
  • Optimize cluster performance for low-latency and high-throughput workloads.
  • Integrate Kubernetes with CI/CD pipelines and GitOps workflows.
  • Implement cluster observability using Prometheus, Grafana, OpenTelemetry, etc.
  • Design and enforce networking policies (CNI) and ingress architecture.
  • Implement secure cluster design including RBAC, OPA/Gatekeeper, secrets management, and runtime security.
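
The operator/controller bullets above all follow the same reconcile pattern: compare desired state (e.g. a CRD spec) with observed state and emit the actions that close the gap. A toy, in-memory model of that loop; a real controller would watch API-server events and create or delete resources instead of returning tuples:

```python
def reconcile(desired, observed):
    """Diff desired vs observed replica counts per app and return actions.

    Both states are plain dicts here; in a real operator, `desired`
    comes from CRD specs and `observed` from cluster state.
    """
    actions = []
    for app, want in desired.items():
        have = observed.get(app, 0)
        if want > have:
            actions.append(("scale-up", app, want - have))
        elif want < have:
            actions.append(("scale-down", app, have - want))
    for app in observed:
        if app not in desired:            # exists but no longer wanted
            actions.append(("delete", app, observed[app]))
    return actions

if __name__ == "__main__":
    desired = {"web": 3, "worker": 2}
    observed = {"web": 1, "cache": 1}
    print(reconcile(desired, observed))
```

Because the loop is level-triggered (it compares whole states, not individual events), re-running it is always safe, which is the property that makes operators robust to missed events.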

Automation & Infrastructure as Code

  • Design and maintain infrastructure using Terraform, Ansible, Helm, or similar tools.
  • Build CI/CD pipelines for infrastructure and platform deployments.
  • Implement immutable infrastructure and GitOps methodologies.
  • Create automated validation, testing, and deployment frameworks for platform services.
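
The last bullet, automated validation for platform deployments, can start as simple policy checks run in CI before anything is applied. A minimal sketch over a Deployment-shaped dict; the two rules (explicit replicas, pinned image tags) are chosen purely for illustration:

```python
def validate_deployment(manifest):
    """Return a list of policy violations for a Deployment-like dict."""
    errors = []
    spec = manifest.get("spec", {})
    if "replicas" not in spec:
        errors.append("replicas must be set explicitly")
    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for c in containers:
        image = c.get("image", "")
        # Untagged or :latest images make rollbacks non-reproducible.
        if ":" not in image or image.endswith(":latest"):
            errors.append(f"{c.get('name')}: image tag must be pinned")
    return errors

# Example manifest with one violation (a :latest tag).
EXAMPLE = {
    "kind": "Deployment",
    "spec": {
        "replicas": 2,
        "template": {"spec": {"containers": [
            {"name": "web", "image": "registry.local/web:latest"},
        ]}},
    },
}

if __name__ == "__main__":
    print(validate_deployment(EXAMPLE))
```

In practice this role of gatekeeper is often handed to tools like OPA/Gatekeeper, but the CI-side shape is the same: parse the manifest, evaluate rules, fail the pipeline on violations.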

Required Technical Skills

  • Advanced Linux systems knowledge (kernel, networking, storage)
  • Experience deploying and operating GPU-enabled Linux servers
  • Understanding of CUDA drivers, GPU kernel modules
  • Performance profiling and workload tuning for compute-intensive applications
  • Hands-on OpenStack development and operations experience
  • Strong experience administering and engineering production Kubernetes clusters
  • Strong understanding of distributed systems principles:
    • Consensus
    • Replication
    • Fault tolerance
    • CAP theorem tradeoffs
  • Experience with:
    • Python or similar programming languages
    • Infrastructure as Code (Terraform, Ansible)
    • Container runtimes (containerd, CRI-O)
    • Observability stacks (Prometheus, Grafana, ELK)
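
The distributed-systems items above (replication, fault tolerance, CAP tradeoffs) are often summarized by the quorum rule: with N replicas, choosing a write quorum W and read quorum R such that R + W > N guarantees every read quorum overlaps the latest write quorum, so reads see the newest value. A toy single-register model of that rule:

```python
class QuorumStore:
    """Toy N-replica register with quorum reads and writes.

    R + W > N forces read/write quorum overlap, so a read always
    observes the newest version -- the classic Dynamo-style knob
    trading latency against consistency.
    """
    def __init__(self, n, r, w):
        assert r + w > n, "R + W must exceed N for strong reads"
        self.n, self.r = n, r
        self.w = w
        self.replicas = [(0, None)] * n      # (version, value) per replica

    def write(self, value):
        version = max(v for v, _ in self.replicas) + 1
        for i in range(self.w):              # only W replicas ack the write
            self.replicas[i] = (version, value)
        return version

    def read(self):
        # Read the *last* R replicas (worst case for overlap with the
        # first W) and return the value with the highest version seen.
        return max(self.replicas[self.n - self.r:])[1]
```

Even though each write touches only W replicas, any R-replica read set must intersect it, which is why the read still returns the latest value.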

Desirable Experience

  • Experience in low-latency or high-performance trading environments
  • High-performance networking (DPDK, SR-IOV, CNI tuning)
  • Storage systems (Ceph, distributed storage, NVMe optimization)
  • Contribution to open-source projects (Kubernetes, OpenStack)
  • Experience designing multi-region or hybrid cloud architectures
  • Experience tuning AI/ML, quantitative, or high-performance compute workloads on GPUs
  • Experience with NVIDIA DCGM, MIG (Multi-Instance GPU), or vGPU configurations
  • Familiarity with RDMA, GPUDirect, or high-throughput interconnects
  • Experience optimizing containerized ML or compute pipelines

Key Attributes

  • Strong systems thinking and deep technical curiosity
  • Ability to diagnose complex cross-layer failures
  • Passion for building reliable, scalable distributed systems
  • Comfortable operating in high-availability, high-performance production environments
  • Strong documentation and knowledge-sharing mindset
