Software Engineer - AI Platform Services

at Millennium

Hedge Funds

Tech Lead · No visa sponsorship · Python

Posted a month ago

Compensation: Not specified
Currency: Not specified
City: Not specified
Country: Not specified

Senior engineering role to design, build, and operate the core service layer for an internal AI/agent platform, including an AI gateway, model/provider routing, policy/guardrails, tool-execution interfaces, high-throughput async APIs, and observability. You will implement MCP capabilities to enable secure, governed connectivity between agent runtimes and enterprise tools/data, and partner closely with AI Engineers to support agent workflows. Responsibilities include production-grade Kubernetes operations, autoscaling, SLO/SLI and incident management, CI/CD and IaC-driven delivery, and influencing platform roadmap and technical strategy. The role emphasizes reliability, security, scalability, and developer experience for firm-wide internal AI capabilities.

Software Engineer - AI Platform Services

We’re a high-impact platform team building the firm’s internal AI platform that bridges traditional enterprise platforms (identity, data, workflow, governance) with GenAI tools (agents, copilots, model providers).

This is a senior engineering role focused on designing and owning the core service layer that agentic tools run on: an AI gateway, model/provider routing, policy/guardrails, tool-execution interfaces, high-throughput async APIs, and production-grade observability. MCP (Model Context Protocol) services are part of the platform portfolio, enabling secure, governed connectivity between agent runtimes and enterprise tools/data. You'll partner closely with AI Engineers building agent workflows; your focus is to make the underlying platform fast, reliable, secure, and easy to build on.

Key Responsibilities

  • Design, build, and operate core platform services (Python; REST + async; streaming where appropriate) powering firm-wide internal AI/agentic capabilities.

  • Own gateway/platform concerns end-to-end: routing, timeouts/retries, streaming, request shaping, rate limits/quotas, multi-tenancy, policy enforcement, provider abstraction, safe degradation, and robust client experience.

  • Build and operate MCP capabilities as part of the platform.

  • Build for scale and availability on Kubernetes: autoscaling, rollout strategies, capacity planning, performance tuning, and production debugging.

  • Raise reliability practices: define and manage SLOs/SLIs, instrumentation standards, incident response/runbooks, post-incident follow-ups, load/resilience testing, and operational excellence.

  • Improve delivery safety: CI/CD, environment promotion, IaC-driven repeatability, and secure SDLC practices.

  • Influence roadmap and technical strategy: prioritize foundational investments and reduce platform risk for a business-critical internal platform.
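To make the gateway concerns above concrete, here is a minimal, hypothetical sketch in Python (the posting's primary language) of two of them: per-tenant rate limiting via a token bucket, and timeouts with retry and exponential backoff around a downstream provider call. All names are illustrative; none come from the actual platform.

```python
import asyncio
import time

class TokenBucket:
    """Toy per-tenant rate limiter: refill `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill based on elapsed time, then spend one token if available.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

async def call_provider_with_retries(send, *, attempts=3, timeout=2.0, base_delay=0.1):
    """Wrap a downstream async call with a timeout, retries, and backoff.

    `send` is any zero-arg coroutine function (e.g. a model-provider request).
    """
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(send(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)  # exponential backoff
```

A production gateway would layer much more on top (quotas per tenant, circuit breaking, provider abstraction, safe degradation), but the core loop of timeout-then-retry-with-backoff looks roughly like this.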

Required Qualifications

  • 7+ years of professional software engineering experience (or equivalent practical experience).

  • Strong expertise in Python, Java, or Go, including async patterns, concurrency, and building high-throughput services (FastAPI or similar).

  • Solid distributed systems fundamentals: idempotency, backpressure, failure isolation, consistency tradeoffs, rate limiting, retries/timeouts.

  • Production experience operating services on Kubernetes (deployments, autoscaling, debugging, observability, performance).

  • Basic familiarity with LLM integration patterns (streaming responses, tool/function calling).

  • Demonstrated design leadership (RFCs, architecture reviews, leading cross-team initiatives).

  • Excellent communication skills—able to translate technical tradeoffs to stakeholders and partner teams.
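As a rough illustration of the tool/function-calling pattern named in the qualifications: the model emits a structured "tool call", the service dispatches it to a registered function, and the result is returned as a message for the model. The registry, message shapes, and `get_price` helper below are assumptions for the sketch, not any specific provider's API.

```python
import json

# Registry of functions an agent runtime is allowed to invoke by name.
TOOLS = {}

def tool(fn):
    """Decorator: register `fn` so a model-emitted tool call can reach it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_price(symbol: str) -> float:
    # Hypothetical stand-in for a real enterprise data lookup.
    return {"ABC": 101.5}.get(symbol, 0.0)

def execute_tool_call(call_json: str) -> str:
    """Dispatch one model-emitted tool call; return a JSON result message."""
    call = json.loads(call_json)
    fn = TOOLS[call["name"]]  # a real gateway would reject unknown names via policy
    result = fn(**call["arguments"])
    return json.dumps({"role": "tool", "name": call["name"], "content": result})
```

In a governed platform, the dispatch step is where policy enforcement and audit logging would sit, since it is the single choke point between agent runtimes and enterprise tools.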

Preferred Qualifications

  • Experience with service-to-service authentication patterns (API keys, OAuth/JWT, mTLS concepts).

  • Familiarity with observability tooling (structured logs, metrics, tracing; Datadog or OpenTelemetry a plus).

  • Strong fundamentals in AWS (or GCP/Azure) relevant to secure platforms (IAM, networking basics, compute, logging/monitoring patterns).

  • Working proficiency with Terraform and automation-first operations (repeatable environments, policy checks, safe rollouts).

  • Comfort using AI dev tools (Claude Code, Cursor, Gemini CLI) responsibly (tests, validation, secure coding).
