Tech Job Finder - Find Software, Technology Sales and Product Manager Jobs.
Devtech Compute Engineer

at Nvidia

Industry not specified

Junior | No visa sponsorship | C/C++/C#

Posted 12 hours ago

Compensation: Not specified
Currency: Not specified
City: Beijing
Country: China

Join NVIDIA's Devtech team to develop performance-critical CUDA/C++ code for deep learning applications, optimizing model training and inference on GPUs. You will investigate performance bottlenecks, implement optimizations, and integrate results into open-source libraries. Collaborate with GPU, CPU, and networking teams to define next-generation hardware and software solutions, covering domains like LLM, Recsys, Robotics, and Assisted Driving.

We’re working on the next generation of recommendation tools and pushing the boundaries of accelerating model training and inference on GPUs. You’ll join a team of ML, HPC and Software Engineers and Applied Researchers developing a framework designed to make the productization of GPU-based recommender systems as simple and fast as possible.

What you'll be doing:

  • In your role as Devtech Compute Engineer or CUDA Performance Engineer, you will be primarily responsible for developing the performance-critical code of our deep learning applications, with the goal of establishing world-class performance for our customers.

  • This includes investigating current performance and exploring optimization opportunities together with developers worldwide.

  • An important part of the work is that, once optimal performance has been demonstrated, these solutions are integrated into our open-source software libraries such as ACCV-Lab and Recsys-Example.

  • With knowledge of customer requirements and performance bottlenecks, you will also work with our GPU, CPU, and networking teams to define next-generation hardware and software solutions.

  • Our coverage is wide, including LLM, Recsys, Robotics, and Assisted Driving.

What we need to see:

  • 2+ years of experience in C++ code development in collaborative software development projects

  • Skilled at writing CUDA kernels and optimizing code

  • Basic knowledge of ML algorithms and deep learning

  • Basic knowledge and understanding of mathematical topics including linear algebra, calculus and statistics

  • Experience with algorithms and optimization

  • Python and Jupyter Notebook for analysis, algorithm exploration and processing

  • High standard for code quality and rigorous testing practices

  • Conversational level English proficiency

  • Some experience with Linux, OpenMP and MPI

Ways to stand out from the crowd:

  • Experience in C++ HPC code development / PhD in related fields

  • Able to perform in-depth performance analysis and to model performance using mathematical and statistical considerations

  • Linear algebra, calculus and statistics are second nature to you, as reflected in your background in mathematics, physics, applied science or an HPC-related field

  • Demonstrated ability to write CUDA kernels that utilize the hardware to its full potential

  • Write unit tests and validate the correctness of optimizations; strive for and propose optimal solutions and ambitious goals, and convince and help others to do the same

#deeplearning
