Senior Full Stack Data Platform Engineer

at Millennium

Hedge Funds

Mid Level · No visa sponsorship · Data Engineering

Posted 4 hours ago

Compensation: Not specified
Currency: Not specified
City: Not specified
Country: Not specified

Senior engineer responsible for architecting and building high-throughput, low-latency data platforms and pipelines to ingest and normalize unstructured text, audio, and video while enabling real-time ML/AI inference. Work across Python, Java, and C++ to develop scalable backend microservices, storage strategies (vector DBs and distributed file systems), and consumer-facing APIs that connect ML research to trading decisions.

Senior Full Stack Data Platform Engineer

Founded in 1989, Millennium is a global alternative investment management firm. Millennium pursues a diverse array of investment strategies across industry sectors, asset classes, and geographies. The firm’s primary investment areas are Fundamental Equity, Equity Arbitrage, Fixed Income, Commodities, and Quantitative Strategies. We solve hard and interesting problems at the intersection of computer science, finance, and mathematics. We are focused on innovating and rapidly applying those innovations to real-world scenarios, which lets engineers work on interesting problems, learn quickly, and have a deep impact on the firm and the business.

At Millennium, we are redefining how investment decisions are made. We don't just look at balance sheets; we harness the chaos of the real world. By analyzing vast amounts of unstructured data—from news briefings and earnings call audio to regulatory documents—we provide our Portfolio Managers (PMs) with the "informational edge" (Alpha) they need to outperform the market.

The Role

We are seeking a Senior Full Stack Software Engineer with deep expertise in building high-throughput data platforms.

In this role, you will architect scalable data platforms in Python, Java, and C++, build robust APIs, and enable data processing with generative AI (genAI) techniques. You will build and optimize a config-driven, plugin-enabled data platform that allows the construction of DAGs for data processing, then apply that platform to build reusable components and pipelines that ingest gigabytes of unstructured text, audio, and video. You will enable a variety of rich data-consumption use cases by building the right abstractions and APIs for data consumers.
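
To make the "config-driven, plugin-enabled" idea concrete, here is a minimal, generic sketch of such a platform. It is not Millennium's actual system; the plugin names, config layout, and helper functions are invented for illustration. Processing steps register themselves under config-addressable names, and a small executor walks the configured DAG in dependency order.

```python
# Hypothetical sketch of a config-driven, plugin-enabled DAG executor.
# All names are illustrative; a production platform would add streaming I/O,
# retries, parallelism, and externalized (e.g., YAML/JSON) configuration.
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict

PLUGINS: Dict[str, Callable[..., Any]] = {}

def plugin(name: str):
    """Register a processing step under a name the config can reference."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        PLUGINS[name] = fn
        return fn
    return register

@plugin("fetch_text")
def fetch_text() -> str:
    # Stand-in for a source connector (news feed, transcript store, etc.).
    return "Q3 revenue rose 12% year over year."

@plugin("normalize")
def normalize(doc: str) -> str:
    return doc.strip().lower()

@plugin("tokenize")
def tokenize(doc: str) -> list[str]:
    return doc.split()

# The DAG itself is pure configuration: each node names a plugin and its upstream nodes.
CONFIG = {
    "raw":    {"plugin": "fetch_text", "deps": []},
    "clean":  {"plugin": "normalize",  "deps": ["raw"]},
    "tokens": {"plugin": "tokenize",   "deps": ["clean"]},
}

def run(config: dict) -> dict:
    """Execute nodes in topological order, feeding each step its upstream outputs."""
    graph = {node: set(spec["deps"]) for node, spec in config.items()}
    results: dict = {}
    for node in TopologicalSorter(graph).static_order():
        spec = config[node]
        results[node] = PLUGINS[spec["plugin"]](*(results[d] for d in spec["deps"]))
    return results

if __name__ == "__main__":
    print(run(CONFIG)["tokens"])
```

The design point is that adding a new pipeline becomes a config change plus, at most, a new plugin, rather than new orchestration code.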

You will be the bridge between complex ML research and real-time trading decisions, working in a polyglot environment (Python, Java, C++) where performance is paramount.

Key Responsibilities

  • High-Performance Data Pipelines: Architect a low-latency, high-throughput platform that enables rapid development of pipelines to ingest and normalize unstructured data (PDFs, news feeds, audio streams).

  • AI & ML Integration: Build the infrastructure that wraps and serves NLP and ML models. You will ensure that model inference happens in real-time within the data stream.

  • Backend Microservices: Develop robust backend services to handle metadata management, search, and retrieval of processed alternative data.

  • System Optimization: Tune the platform for speed. In financial markets, milliseconds matter; you will optimize database queries, serialization, and network calls to ensure data reaches the PMs instantly.

  • Data Strategy: Implement storage strategies for unstructured data, utilizing Vector Databases for semantic search and Distributed File Systems for raw storage (see the sketch after this list).
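
As a toy illustration of the semantic-search pattern mentioned in the Data Strategy bullet, the sketch below builds an in-memory "vector index" with a hashing stand-in for embeddings. It is purely hypothetical: a real platform would use learned embeddings and a production vector database (e.g., Pinecone, Milvus, Weaviate), and every name below is invented for the example.

```python
# Toy in-memory vector index: normalized bag-of-hashed-tokens vectors plus
# cosine-similarity (dot product) search. Illustrative only.
import hashlib
from typing import List, Tuple

import numpy as np

DIM = 64  # embedding dimensionality for the toy example

def embed(text: str) -> np.ndarray:
    """Deterministic stand-in embedding: hash each token into a fixed-size vector."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

class ToyVectorIndex:
    def __init__(self) -> None:
        self.ids: List[str] = []
        self.vectors: List[np.ndarray] = []

    def upsert(self, doc_id: str, text: str) -> None:
        self.ids.append(doc_id)
        self.vectors.append(embed(text))

    def search(self, query: str, k: int = 3) -> List[Tuple[str, float]]:
        """Cosine similarity over unit vectors reduces to a dot product."""
        q = embed(query)
        scores = [(doc_id, float(q @ v)) for doc_id, v in zip(self.ids, self.vectors)]
        return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

if __name__ == "__main__":
    index = ToyVectorIndex()
    index.upsert("earnings_call", "revenue guidance raised on strong cloud demand")
    index.upsert("fed_minutes", "committee discussed the path of policy rates")
    index.upsert("news_wire", "chipmaker announces new datacenter accelerator")
    print(index.search("which company raised its revenue guidance"))
```

The upsert/search shape roughly mirrors what real vector databases expose; raw documents would live alongside it in a distributed file system, with the index holding only embeddings and document references.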

Required Qualifications

  • Data Platform Experience: At least 5 years of software engineering experience, preferably building data platforms.

  • Core Languages: Strong proficiency in both Python and Java/C++ is required. You should be comfortable switching between these languages for different use cases (e.g., Python for data processing, Java or C++ for high-concurrency, scalable services).

  • Data Engineering: Proven experience building data pipelines, ETL processes, or working with big data frameworks (e.g., Kafka, Airflow, Apache Parquet, Arrow, Iceberg, KDB).

  • Unstructured Data Expertise: Proven experience working with unstructured data types (text, audio, documents). Familiarity with techniques such as OCR, transcription normalization, and text extraction.

  • Database Knowledge: Proficiency in SQL and significant experience with search/NoSQL engines (Elasticsearch, Redis, Solr, MongoDB or equivalent).

  • Cloud Native: Experience building serverless data lakes or processing pipelines on AWS, GCP, or similar cloud platforms.

Preferred Qualifications

  • AI/NLP Exposure: Experience with Large Language Models (LLMs), vector databases (Pinecone, Milvus, Weaviate), NLP libraries (Hugging Face, spaCy), or similar tooling.

  • Frontend Competence: Solid experience with modern frontend frameworks (React, Vue, or Angular) and data visualization libraries (e.g., D3.js, Highcharts, or AG Grid).

  • Financial Knowledge: Understanding of financial instruments (Equities, Fixed Income) or the investment lifecycle.

  • Document Processing: Familiarity with parsing complex document structures (earnings call transcripts, 10-K/10-Q filings, broker research, sector and industry reports, central bank documents, news wires, social media, etc.).
