
Global Banking & Markets, Senior Low-Level Systems Engineer (Linux Kernel and Low Latency), Vice President, Hong Kong
at Goldman Sachs
Posted 18 hours ago
Compensation: Not specified
City: Hong Kong
Country: China
Currency: Not specified
Senior low-level systems engineer role focused on designing and optimizing high-performance, low-latency systems on Linux across kernel and user space. You will implement and tune networking pipelines (UDP/TCP, multicast, DPDK, XDP), optimize for determinism (CPU isolation, NUMA, cache awareness), and profile and debug using tools like perf, ftrace, and eBPF. The role involves close collaboration with product, infrastructure, and operations to deliver production systems with clear SLOs, benchmarks, and runbooks. Financial markets experience is a plus. This is a Vice President-level position based in Hong Kong.
Overview
We are seeking an experienced developer passionate about low-level coding to design and optimize high-performance, mission-critical systems on Linux. You will work close to the metal across kernel and user space, focusing on performance, reliability, and deterministic latency. Experience in financial markets is a strong plus, but not required.
What You Will Do
- Design, build, and optimize high-performance services in C, C++, and Java.
- Engineer performance-critical components across user space and kernel interfaces, emphasizing memory, scheduling, I/O, and networking paths.
- Develop and tune networking pipelines, including multicast and unicast UDP and TCP, with careful socket, buffer, and NIC configuration.
- Implement kernel-bypass or fast-path networking where appropriate (for example, DPDK, netmap, XDP), including queue, NIC offload, and CPU affinity strategies.
- Contribute to or interact with Linux kernel subsystems: memory management, scheduler, device drivers, and filesystems (VFS).
- Optimize systems for latency and determinism: CPU isolation, thread pinning, NUMA locality, cache awareness, lock contention reduction, and memory allocator tuning.
- Apply distributed systems patterns such as sequencer (virtual synchrony) for total ordering and consistency where required.
- Profile, measure, and debug using tools such as perf, ftrace, eBPF, perf_events, tcpdump, and flame graphs to find and eliminate bottlenecks.
- Collaborate cross-functionally with product, infra, and operations to deliver robust production systems with clear SLOs and runbooks.
- Uphold high code quality through reviews, benchmarks, reproducible performance tests, and documentation.
How You Will Work
- You value measurement over assumption, using benchmarks and profiles to drive decisions.
- You communicate tradeoffs clearly (throughput vs. latency, CPU vs. memory, complexity vs. resilience).
- You write clear documentation, reproducible test harnesses, and actionable runbooks.
- You collaborate closely, give and receive constructive code reviews, and mentor peers.
Minimum Qualifications
- Expert-level proficiency in at least one of: C, C++, or Java.
- Excellent understanding of Linux kernel internals, including:
  - Memory management and allocators.
  - Device driver model and driver interactions.
  - Scheduler behavior and tuning.
  - Filesystems and VFS concepts.
- Depth in at least one of the following tracks (pick one or more):
  - High-performance networking:
    - Strong practical experience with UDP (multicast and unicast) and TCP networking.
    - Socket options, buffer sizing, epoll, busy polling, NAPI, NIC queues, and RSS.
    - Familiarity with kernel-bypass mechanics (for example, DPDK, netmap) and XDP.
  - Low-latency engineering:
    - End-to-end latency optimization, jitter reduction, and deterministic execution.
    - CPU pinning, interrupt affinity, NUMA, cache friendliness, lock-free or wait-free techniques, careful memory management, and lightweight logging.
  - Distributed systems (sequencer and virtual synchrony patterns):
    - Sequencer-based total order broadcast, membership, failure handling, and consistency guarantees.
    - Tradeoffs between latency, throughput, ordering, and availability in practical systems.
- Strong debugging and profiling skills on Linux, including tool-driven investigations.
- Solid understanding of concurrency, synchronization primitives, and memory models.
Preferred Qualifications
- Experience in financial markets and trading infrastructures (for example, market data, order routing, exchange connectivity, FIX or exchange-native protocols).
- Familiarity with time synchronization for low-latency environments (for example, PTP).
- Experience with eBPF and tracing for observability in production.
- Knowledge of RDMA or kernel networking internals beyond the socket API.
- Experience writing or maintaining kernel modules or device drivers.
- Familiarity with NIC offloads and tuning (TSO, LRO, RFS/RPS, interrupt moderation).
- Experience with deterministic GC tuning (if using Java) and low-latency JVM practices.
- Knowledge of benchmarking methodology: workload design, repeatability, variance analysis, and flame graph interpretation.












