
Shape-for-Motion: Precise and Consistent Video Editing With 3D Proxy

Yuhao Liu, Tengfei Wang*, Fang Liu, Zhenwei Wang, Rynson W. H. Lau*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-reviewed

Abstract

Recent advances in deep generative modeling have unlocked unprecedented opportunities for video synthesis. In real-world applications, however, users often seek tools to faithfully realize their creative editing intentions with precise and consistent control. Despite the progress achieved by existing methods, ensuring fine-grained alignment with user intentions remains an open and challenging problem. In this work, we present Shape-for-Motion, a novel framework that incorporates a 3D proxy for precise and consistent video editing. Shape-for-Motion achieves this by converting the target object in the input video to a time-consistent mesh, i.e., a 3D proxy, allowing edits to be performed directly on the proxy and then inferred back to the video frames. To simplify the editing process, we design a novel Dual-Propagation Strategy that allows users to perform edits on the 3D mesh of a single frame, and the edits are then automatically propagated to the 3D meshes of the other frames. The 3D meshes for different frames are further projected onto the 2D space to produce the edited geometry and texture renderings, which serve as inputs to a decoupled video diffusion model for generating edited results. Our framework supports various precise and physically-consistent manipulations across the video frames, including pose editing, rotation, scaling, translation, texture modification, and object composition. Our approach marks a key step toward high-quality, controllable video editing workflows. Extensive experiments demonstrate the superiority and effectiveness of our approach. © 2025 Copyright held by the owner/author(s).
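The abstract's pipeline (reconstruct a time-consistent 3D proxy, edit a single frame's mesh, propagate the edit to all frames, then render 2D conditions for a video diffusion model) can be illustrated with a minimal sketch. All function names, data shapes, and the per-vertex-offset edit representation below are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

# Hypothetical sketch of the Shape-for-Motion workflow described in the
# abstract. Meshes are modeled as per-frame vertex arrays with shared
# connectivity; the diffusion-based generator is omitted.

def reconstruct_proxy(num_frames, num_vertices=8):
    """Stand-in for reconstructing a time-consistent mesh: one (V, 3)
    vertex array per frame, with vertex correspondence across frames."""
    rng = np.random.default_rng(0)
    base = rng.standard_normal((num_vertices, 3))
    # Per-frame rigid drift stands in for the object's motion.
    return [base + 0.1 * t * np.array([1.0, 0.0, 0.0]) for t in range(num_frames)]

def propagate_edit(meshes, edit_delta):
    """Simplified stand-in for the Dual-Propagation Strategy: an edit
    expressed as a per-vertex offset on one frame is applied to every
    frame, which is valid here because correspondence is shared."""
    return [m + edit_delta for m in meshes]

def render_2d(mesh):
    """Orthographic projection (drop z) as a stand-in for the geometry
    and texture renderings that condition the video diffusion model."""
    return mesh[:, :2]

frames = reconstruct_proxy(num_frames=4)
delta = np.zeros_like(frames[0])
delta[0] = [0.0, 0.5, 0.0]          # edit: move one vertex on frame 0
edited = propagate_edit(frames, delta)
renders = [render_2d(m) for m in edited]
# The edit appears consistently in every frame:
assert all(np.allclose(e - f, delta) for e, f in zip(edited, frames))
```

In the actual method, the propagated meshes would be rendered with geometry and texture and passed to the decoupled video diffusion model to synthesize the final edited frames; this sketch only demonstrates the edit-once, propagate-everywhere structure.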
Original language: English
Title of host publication: SA Conference Papers '25
Subtitle of host publication: Proceedings of the SIGGRAPH Asia 2025 Conference Papers
Publisher: Association for Computing Machinery
Number of pages: 12
ISBN (Electronic): 979-8-4007-2137-3
Publication status: Published - 14 Dec 2025
Event: 18th ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH ASIA 2025) - Hong Kong Convention and Exhibition Centre (HKCEC), Hong Kong, China
Duration: 15 Dec 2025 - 18 Dec 2025
https://asia.siggraph.org/2025/

Conference

Conference: 18th ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH ASIA 2025)
Abbreviated title: SA '25
Place: Hong Kong, China
Period: 15/12/25 - 18/12/25
Internet address: https://asia.siggraph.org/2025/

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Research Keywords

  • 3D-Aware Video Editing
  • Generative Model
