Tech Job Finder - Find Software, Technology Sales and Product Manager Jobs.

Senior System Software Engineer - Dynamo

at Nvidia


Mid Level · No visa sponsorship · Rust

Posted 11 hours ago


Compensation: $152,000 – $287,500 USD
City: Not specified
Country: United States

In this role, you will develop open source software to serve inference of trained AI models running on GPUs. You will contribute to the development of disaggregated serving for Dynamo-supported inference engines (vLLM, SGLang, TRT-LLM) and expand to support multi-modal models for embedding disaggregation. You will innovate in managing large KV caches across heterogeneous memory and storage hierarchies using the NVIDIA Optimized Transfer Library (NIXL) for low-latency data movement. You will design and optimize distributed inference components in Rust and Python within the Dynamo Rust Runtime Core Library while balancing performance, scalability, and integration of open source technologies.

We are now looking for a Senior System Software Engineer to work on Dynamo. NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in AI, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building a Generative AI inference platform to make the design and deployment of new AI models easier and accessible to all users.

What you'll be doing:

In this role, you will develop open source software to serve inference of trained AI models running on GPUs. You will

  • Contribute to the development of disaggregated serving for Dynamo-supported inference engines (vLLM, SGLang, TRT-LLM) and expand to support multi-modal models for embedding disaggregation.

  • Innovate in the management and transfer of large KV caches across heterogeneous memory and storage hierarchies, using the NVIDIA Optimized Transfer Library (NIXL) for low-latency, cost-effective data movement.

  • Add new features to the Dynamo Rust Runtime Core Library, and design, implement, and optimize distributed inference components in Rust and Python.

  • Balance a variety of objectives: build robust, scalable, high performance software components to support our distributed inference workloads; work with team leads to prioritize features and capabilities; load-balance asynchronous requests across available resources; optimize prediction throughput under latency constraints; and integrate the latest open source technology.

What we need to see:

  • Master's or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience

  • 3+ years of relevant experience

  • Ability to work in a fast-paced, agile team environment

  • Excellent Rust/Python/C++ programming and software design skills, including debugging, performance analysis, and test design.

  • Experience with high scale distributed systems and ML systems

Ways to stand out from the crowd:

  • Prior contributions to open-source AI inference frameworks (e.g., vLLM, TensorRT-LLM, SGLang).

  • Experience with GPU memory management, cache management, or high-performance networking.

  • Understanding of LLM-specific inference challenges, such as context window scaling and multi-model agentic and reasoning workflows.

  • Prior experience with disaggregated serving and multi-modal models (Vision-Language Models, Audio-Language Models, Video-Language Models)

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most expert and passionate people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you. Come help us build the real-time, efficient computing platform driving our success in the multifaceted and quickly growing field of Deep Learning and Artificial Intelligence.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until February 14, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
