Tech Job Finder - Find Software, Technology Sales and Product Manager Jobs.

Senior Software Research Architect, AI Networking

at Nvidia

Mid Level | No visa sponsorship | Data Engineering

Posted 7 hours ago

Compensation: Not specified (USD)
City: Tel Aviv
Country: Israel

NVIDIA seeks a creative and practical Senior Software Architect to advance end-to-end AI networking for large-scale distributed training and inference. You will design, optimize, and deploy systems on the NVIDIA Spectrum-X Networking Platform to manage inter-node communication, compute scheduling, and system-level optimization for generative AI workloads. The role involves prototyping, evaluating architectural improvements, collaborating across hardware, firmware, and software teams, and contributing to research through patents and conference publications.

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. Being an NVIDIAN means being part of a diverse and supportive environment that encourages everyone to perform at their peak. Come join the team and discover how you can make a lasting impact on the world.

NVIDIA is in search of a Senior Software Architect: a creative, forward-thinking, and practical researcher to improve the framework for large-scale LLM training and inference. As part of our dynamic E2E Architecture group, you will design and optimize systems driving generative AI workloads, working at the intersection of software and hardware on some of the most advanced GPU clusters worldwide. You will define how AI models are deployed and scaled in production using the NVIDIA Spectrum-X Networking Platform, influencing decisions from inter-node communication and compute scheduling to system-level optimization. This is an opportunity to collaborate with best-in-class engineers and researchers and shape the future of generative AI. Your work will make a lasting impact by enabling generative AI technologies to reach real-world applications and improve global computing capabilities.

What You’ll Be Doing:

  • Lead research and development of end-to-end networking solutions for distributed AI training and inference at scale, with a focus on job completion time, failure resiliency, telemetry, scheduling, and placement.

  • Analyze current deployments, develop prototypes, and recommend architectural improvements.

  • Stay abreast of the latest research; become the team’s authority in emerging networking techniques and technologies.

  • Design, simulate, and validate new systems using NSX, a novel, scalable network simulator.

  • Develop and test prototypes on large-scale GPU clusters (e.g., Israel-1).

  • Collaborate across hardware, firmware, and software teams to translate ideas into real networking product features.

  • File patents and present research at leading conferences.

What We Need to See:

  • M.Sc. or PhD (preferred) in Computer Science, Electrical/Computer Engineering, or a related field; alternatively, a B.Sc. with research experience and publications.

  • 5+ years of relevant experience.

  • Deep expertise in networking and communication internals (NCCL, RDMA, congestion control, routing).

  • Strong software engineering skills in C++ and/or Python.

  • Excellent system-level design and problem-solving abilities.

  • Outstanding communication and collaboration skills across technical domains.

Ways to Stand Out from the Crowd:

  • Proven passion for solving sophisticated technical problems and delivering impactful solutions.

  • Record of publications in top-tier conferences.

  • Experience in designing and building large-scale AI training clusters.

  • Post-PhD research experience.

  • Practical understanding of deep learning systems, GPU acceleration, and AI model execution flows.
