Principal Software Engineer

at Microsoft

Tech Lead | No visa sponsorship | Data Engineering

Posted 5 hours ago

Compensation: Not specified
City: Not specified
Country: Not specified

Lead the design and implementation of real-time streaming ETL and feature pipelines (Flink/Spark) to feed online stores, caches, and ML inference serving. Build and operate reliable messaging with Kafka/Pulsar and own data contracts, backfill workflows, and SLOs with strong observability. Optimize end-to-end performance and cost across compute, storage, and serving, and collaborate with applied scientists on feature/embedding definitions and validation. Ship CI/CD, testing, and incident response practices to maintain production quality.

Overview

Modern ads platforms run on always-on, real-time data: streaming events, feature computation, near-real-time aggregations, and low-latency serving to power ML models that operate at massive scale under strict freshness, cost, and reliability requirements.

Microsoft Ads builds and operates large-scale, latency-sensitive systems that serve billions of requests. We are looking for a Principal Software Engineer who is hands-on with production coding and system design to build the real-time data pipelines and feature/embedding materialization systems that feed online stores/caches and integrate tightly with ML inference serving.

This role is ideal for engineers who enjoy:

building robust streaming + ETL systems (correctness, idempotency, backfills, late data),
owning SLOs with strong observability and operational maturity,
and optimizing end-to-end performance and cost across compute, storage, and serving integrations.

Primary success metrics are freshness, correctness, latency, reliability, and cost in production.

Responsibilities

Design and implement real-time streaming ETL / feature pipelines (e.g., Flink or Spark Structured Streaming) that meet strict freshness and correctness constraints.
Build and operate reliable messaging and ingestion with Kafka/Pulsar (partitioning strategy, retries, ordering guarantees, DLQs, backpressure handling).
Own data contracts between producers, pipelines, and consumers: schema evolution, versioning, compatibility, validation, and safe rollout.
Implement production-grade backfill/replay workflows.
Define and meet SLOs using OpenTelemetry/Prometheus/Grafana for metrics, tracing, dashboards, alerting, and incident response readiness.
Integrate pipelines with online stores/caches and ML consumers (feature stores, embedding pipelines, LLM API calls, online/offline consistency patterns).
Partner with applied scientists on feature/embedding definitions, validation, and end-to-end quality measurement.
Optimize end-to-end performance and efficiency: CPU/memory/I/O, serialization, caching, network overhead, concurrency, and pipeline compute cost.
Contribute to serving/inference integrations where needed (e.g., Triton/ONNX Runtime/TensorRT) including batching and latency/cost tradeoffs.
Ship safely with CI/CD, automated testing (unit/integration/data quality), and operational playbooks/runbooks.

Qualifications

Required Qualifications:

Bachelor’s or Master’s degree in Computer Science, Electrical/Computer Engineering, or a related field, with 8+ years of related experience.
Strong programming skills in at least one of C++, C#, or Python.
Hands-on experience in one or more of the following:
- Building and operating streaming data pipelines in production (Flink or Spark Structured Streaming),
- Distributed systems engineering with strong reliability and operational rigor,
- Messaging systems such as Kafka/Pulsar.
Experience operating services with Kubernetes/containers and production readiness practices (deployments, scaling, rollbacks).
Experience with observability stacks such as OpenTelemetry, Prometheus, Grafana.
Ability to debug complex production issues using logs/metrics/traces and performance profiling.
Strong communication and collaboration skills, with experience working across engineering, applied science/ML, and product/business stakeholders.

Preferred Qualifications:

Experience with feature stores, embedding pipelines, and online/offline consistency (freshness guarantees, correctness validation).
Experience with data lakehouse/table formats and optimizations, e.g., partitioning, compaction, and incremental processing.
Experience with GPU inference serving (Triton, ONNX Runtime/TensorRT) and performance techniques (batching, request shaping, tail-latency reduction).
Understanding of pipeline correctness patterns: idempotency, dedup, watermarking, late data, exactly-once vs at-least-once tradeoffs.
Background in cost/performance modeling, capacity planning, and reliability improvements for high-scale data platforms.
Experience in Ads/search/recommendations or other high-scale systems where freshness, latency, and cost are jointly optimized.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
