Vincent Sitzmann

Associate Professor, MIT EECS

Hi, I'm Vincent!

I lead the Scene Representation Group at MIT CSAIL, where we build machines that learn to understand and interact with our world autonomously.

Highlighted Work

See our group website for more publications.

MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation

Ishaan Preetam Chandratreya*, David Charatan*, Basile Van Hoorick, Sergey Zakharov, Vitor Guizilini, Phillip Isola, Vincent Sitzmann

arXiv 2026

Website Code arXiv

VERA: Turning Video Models into Generalist Robot Policies

Sizhe Lester Li*, Evan Kim*, Xingjian Bai*, Tong Zhao, Tao Pang, Max Simchowitz, Vincent Sitzmann

arXiv 2026

Website Code arXiv

Large Video Planner

Boyuan Chen, Tianyuan Zhang, Haoran Geng, Kiwhan Song, William T. Freeman, Jitendra Malik, Russ Tedrake, Vincent Sitzmann, Yilun Du

arXiv 2025

Website arXiv

Generative View Stitching

Chonghyuk Song, Michal Stary, Boyuan Chen, George Kopanas, Vincent Sitzmann

ICLR 2025

Website Code Twitter arXiv

True Self-Supervised Novel View Synthesis is Transferable

Thomas W. Mitchel*, Hyunwoo Ryu*, Vincent Sitzmann

Oral • ICLR 2025

Website Code Twitter arXiv

Selective Underfitting in Diffusion Models

Kiwhan Song, Jaeyeon Kim, Sitan Chen, Yilun Du, Sham Kakade, Vincent Sitzmann

arXiv 2025

Website Twitter arXiv

Locality in Image Diffusion Models Emerges from Data Statistics

Artem Lukoianov, Chenyang Yuan, Justin Solomon, Vincent Sitzmann

Spotlight • NeurIPS 2025

Website Code arXiv

Meschers: Geometry Processing of Impossible Objects

Ana Dodik, Isabella Yu, Kartik Chandra, Jonathan Ragan-Kelley, Joshua Tenenbaum, Vincent Sitzmann, Justin Solomon

SIGGRAPH 2025

Paper Website

Controlling diverse robots by inferring Jacobian fields with deep networks

Sizhe Li, Annan Zhang, Boyuan Chen, Hanna Matusik, Chao Liu, Daniela Rus, Vincent Sitzmann

Nature Journal Article • Nature 2025

Paper Website Code Twitter arXiv

History-Guided Video Diffusion

Kiwhan Song*, Boyuan Chen*, Max Simchowitz, Yilun Du, Russ Tedrake, Vincent Sitzmann

ICML 2025

Website Code Twitter arXiv

Blog

Thoughts on research, teaching, and AI.

Feb 1, 2026

The flavor of the bitter lesson for computer vision

Computer vision conventionally maps images to intermediate representations (class, segmentation, 3D reconstruction, etc.), but the "real" role of vision has always been as part of a perception-action loop. The LLM moment for vision won't mean SOTA on intermediate tasks, but instead intelligent, embodied agents. World models are the first glimpse. 3D in particular will become obsolete for training embodied intelligence models.

Sep 24, 2025

Make It Work, Then Prove It Works: A Framework for Research

There are two disparate skillsets required for good research: moving fast to find what works, then slowing down to prove it. Knowing when to switch between them is crucial.

Teaching

Courses I have taught at MIT.

Advances in Computer Vision

Spring 2026 • 6.8300

Advances in Computer Vision

Spring 2025 • 6.8300

Recent Talks

Selected talks and presentations.

Modeling the world (and yourself) from vision

Toronto's Vision Group Lecture Series

Short Biography

Vincent Sitzmann is an Associate Professor at MIT, where he leads the Scene Representation Group. He received his PhD from Stanford University with Gordon Wetzstein and his Bachelor's degree from the Technical University of Munich. Vincent's research goal is to build embodied artificial general intelligence: essentially, robots that can inhabit our world alongside humans, that explore it of their own volition, and that, just like humans, can expand their understanding of our world by interacting with it and learning from their interactions. In the past, Vincent's research has made contributions to generative modeling, computer graphics, and robotics, pioneering topics such as differentiable rendering for 3D reconstruction, neural implicit representations, and video generative models for world models and robotics. Vincent has received numerous awards, such as the MIT Junior Bose Award for excellence in teaching, the NSF CAREER award, and the TC-PAMI Young Researcher award.