
Senior Deep Learning Inference Performance Architect

at NVIDIA

Mid Level | No visa sponsorship | Data Science/AI/ML

Posted 6 hours ago

Compensation
$184,000 – $356,500 USD

Currency: $ (USD)

City
Not specified
Country
United States

Senior Deep Learning Inference Performance Architect at NVIDIA. The role involves writing performance-optimized low-level code for GPUs, CUDA kernels, and AI inference software; evaluating and improving performance techniques in production LLM deployments; and guiding future GPU architecture decisions. You will collaborate across software, research and product teams to push AI inference performance and efficiency.

We are now looking for a Senior Deep Learning Inference Performance Architect!

NVIDIA is seeking a Senior Performance Architect - a creative engineer who loves to squeeze out every cycle of performance from deep learning software. The Inference Architecture team does groundbreaking hardware-software co-design work that focuses on accelerating AI Inference workloads. In this role, you will write performance-optimized low-level code on today’s GPUs, evaluate and improve state-of-the-art performance techniques in production Large Language Model deployments, and help guide our future GPU architecture decisions. If you are someone who enjoys digging deep into GPU architecture details, is passionate about AI, and knows where every cycle goes when you write highly tuned software, this role may be a great fit for you.

What you’ll be doing:

  • Develop innovative GPU and system architectures to extend the state of the art in AI Inference performance and efficiency

  • Model, analyze and prototype key deep learning algorithms and applications

  • Understand and analyze the interplay of hardware and software architectures on future algorithms and applications

  • Write efficient software for AI Inference, including CUDA kernels, framework-level code, and application-level code

  • Collaborate across the company to guide the direction of AI, working with software, research and product teams

What we need to see:

  • An MS or PhD in a relevant discipline (CS, EE, Math) or equivalent experience, with 5+ years of relevant experience

  • Strong mathematical foundation in machine learning and deep learning

  • Expert programming skills in C, C++, and Python

  • Familiarity with GPU computing (CUDA or similar) and HPC (MPI, OpenMP)

  • Strong knowledge and coursework in computer architecture

Ways to stand out from the crowd:

  • Background with systems-level performance modeling, profiling, and analysis

  • Experience in characterizing and modeling system-level performance, executing comparison studies, and documenting and publishing results

  • Experience in optimizing AI Inference workloads with CUDA kernel development

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous, and love a challenge, consider joining our Inference Performance Architecture team and help us build the real-time, cost-effective computing platform driving our success in this exciting and quickly growing field.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until January 13, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
