Tech Job Finder - Find Software, Technology Sales and Product Manager Jobs.

AI Computing Software Development Intern - 2026

at NVIDIA

Internship · No visa sponsorship · Data Science / AI / ML

Posted 8 hours ago

Compensation: Not specified
Currency: Not specified
City: Not specified
Country: Not specified

Join NVIDIA's AI Compute team in Taiwan as an AI Computing Software Development Intern for 2026. Work on one of two tracks—TensorRT-LLM (Python/PyTorch) or TensorRT Compiler (C++)—to build high-performance AI inference pipelines and optimize model execution, memory usage, and scalability. Collaborate with framework, research, CUDA, and hardware teams to deliver efficient multi-GPU model serving and graph transformations. This internship targets students pursuing advanced degrees in computer science, engineering, or related fields.

We are now looking for an AI Computing Software Development Intern!

NVIDIA invites interns skilled in artificial-intelligence computing to join our AI Compute team in Taiwan. This is your chance to work on some of the world's most advanced AI systems. You will help develop technologies for Large Language Models, Recommender Systems, and Generative AI, and push the limits of GPU performance for AI inference.

What you'll be doing:

As an intern, you’ll focus on one of two specialized tracks: TensorRT-LLM – Inference Optimization (Python / PyTorch) or TensorRT Compiler – Graph Optimization (C++).

For TensorRT-LLM:

  • Build and enhance high‑performance LLM inference pipelines.

  • Analyze and optimize model execution, scalability, and memory use.

  • Collaborate across framework and research teams to deliver efficient multi‑GPU model serving.
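As a rough illustration of the memory-use analysis mentioned in the bullets above, the sketch below estimates KV-cache size for a decoder-only LLM. The model dimensions are hypothetical and the formula is the standard 2 × layers × KV-heads × head-dim × seq-len × batch × bytes-per-element accounting, not NVIDIA's or TensorRT-LLM's actual implementation:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, dtype_bytes: int = 2) -> int:
    """Estimate KV-cache size: one key and one value tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Hypothetical 7B-class model: 32 layers, 32 KV heads, head_dim 128,
# 4096-token context, batch 1, fp16 (2 bytes/element) -> exactly 2 GiB.
print(kv_cache_bytes(32, 32, 128, 4096, 1) / 2**30)  # -> 2.0
```

Back-of-envelope estimates like this are often the first step in deciding where multi-GPU serving or cache-compression work will pay off.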

For TensorRT Compiler:

  • Work on the TensorRT compiler backend to improve graph transformations and code generation for NVIDIA GPUs.

  • Develop compiler optimization passes, refine operator fusion, and optimize memory usage.

  • Collaborate with CUDA and hardware architecture teams to accelerate Deep Learning inference computations.
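To make "operator fusion" concrete, here is a toy fusion pass over a flat list of op names. A real compiler such as TensorRT operates on a graph IR with dataflow and cost models; the op names and the greedy chaining strategy here are illustrative only:

```python
ELEMENTWISE = {"add", "mul", "relu"}  # ops that can share one kernel launch

def fuse_elementwise(ops: list[str]) -> list[str]:
    """Greedily merge runs of consecutive elementwise ops into one fused op."""
    fused, i = [], 0
    while i < len(ops):
        if ops[i] in ELEMENTWISE:
            j = i
            while j < len(ops) and ops[j] in ELEMENTWISE:
                j += 1                      # extend the fusible run
            if j - i > 1:
                fused.append("fused(" + "+".join(ops[i:j]) + ")")
            else:
                fused.append(ops[i])        # a lone elementwise op stays as-is
            i = j
        else:
            fused.append(ops[i])            # non-fusible op passes through
            i += 1
    return fused

print(fuse_elementwise(["matmul", "add", "relu", "matmul"]))
# -> ['matmul', 'fused(add+relu)', 'matmul']
```

Fusing the `add` and `relu` into one kernel avoids a round trip through GPU memory between them, which is the basic payoff a fusion pass is after.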

What we need to see:

  • Pursuing an M.S. or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or related fields.

  • Excellent problem‑solving ability, curiosity for cutting‑edge AI systems, and passion for GPU computing and deep learning software performance.

  • TensorRT‑LLM: Strong Python programming and experience with PyTorch; solid understanding of LLM inference and GPU acceleration.

  • TensorRT Compiler: Proficient in C++, with experience in compiler or performance optimization.

Join us and play a part in building the AI computing platforms that drive innovation across industries worldwide.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
