Reinforcement Learning and World Model for Autonomous Driving Intern - 2026
at Nvidia
Posted 7 hours ago
- Compensation
- Not specified
- City
- Shanghai
- Country
- China
Join Nvidia as an intern focusing on reinforcement learning and multi-modal world simulation for autonomous driving. Develop and refine world models, train self-supervised dynamics and sensor generation models, and prototype architectures combining world models, diffusion or flow-based models, and policy gradients for realistic simulation. Collaborate with End-to-End Driving Model teams to deploy world-model-based policies in simulated RL environments and support Sim2Real transfer. This role offers the opportunity to shape AI-powered autonomous driving with state-of-the-art simulation tech.
We are looking for a hardworking intern with expertise in reinforcement learning and multi-modal world simulation models to advance ML-centric autonomous driving and Physical AI solutions. The role focuses on model-centric RL, learning world simulation models, and translating state-of-the-art (SOTA) algorithms into real-world applications that allow vehicles to interpret, anticipate, and respond intelligently in challenging, dynamic environments. This is a rare opportunity to shape the next frontier of intelligent driving, where imagination meets real-world impact. If you’re excited by the idea of building SOTA simulation technologies and systems that learn, adapt, and truly “think,” we’d love to have you on board. Join a team where your input plays a crucial role in accelerating the development of autonomous vehicles with state-of-the-art solutions.
What you'll be doing:
Develop and refine multi-modal world models and integrate them into our simulation system.
Train and evaluate self-supervised latent dynamics and sensor generation models for the joint tasks of trajectory prediction, goal-conditioned ego control, and sensor data synthesis.
Explore and prototype hybrid architectures combining world models, generative (e.g., diffusion, flow matching) models, and policy gradients for realistic and robust simulation.
Collaborate with End-to-End Driving Model teams to deploy world-model-based policies to simulated RL environments and accelerate the training of the driving systems.
Contribute to system development for continuous learning and simulation adaptation (Sim2Real transfer).
What we need to see:
Pursuing a PhD in Computer Science, Machine Learning, or a related field, with a background in neural rendering, robotics, or simulation.
Strong understanding of reinforcement learning (policy gradients, actor-critic, offline RL).
Familiarity with visual representation learning and 4D scene representation for world simulation (e.g., NeRF, Gaussian Splatting, occupancy networks, contrastive learning, masked modeling, or generative world simulation).
Experience building large-scale training pipelines with temporal consistency and simulation data replay.
Publications or open-source contributions in RL, model-based control, or autonomous systems.
Passion for developing learning systems that can “imagine” and plan in the real world.

