• Authors: Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman

Motion segmentation, the task of discovering and segmenting moving objects in a video, has been widely studied with a variety of approaches and training schemes. This paper investigates how the Segment Anything Model (SAM) can contribute to the task.

The authors propose two models that combine SAM with optical flow, pairing SAM's segmentation capabilities with flow's ability to identify and group moving objects. The first model adapts SAM to accept optical flow as input in place of RGB; the second keeps RGB as input and uses flow as a segmentation prompt. These straightforward methods, without further modification, surpass all previous approaches by a significant margin on both single- and multi-object benchmarks. The paper also extends these frame-level segmentations to sequence-level segmentations that maintain object identity across frames.
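To make the two configurations concrete, below is a minimal sketch using the public `segment_anything` package and OpenCV's Farneback flow. This is an illustration under stated assumptions, not the paper's pipeline: the authors adapt and fine-tune SAM and use learned flow estimators, whereas the checkpoint path, frame filenames, and the single-point prompting heuristic here are placeholders chosen for brevity.

```python
# Illustrative sketch only: off-the-shelf SamPredictor + Farneback flow,
# standing in for the paper's fine-tuned SAM and learned flow estimator.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def flow_to_rgb(flow: np.ndarray) -> np.ndarray:
    """Encode a dense (H, W, 2) flow field as RGB via the standard HSV mapping."""
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = (ang * 180 / np.pi / 2).astype(np.uint8)   # hue <- direction
    hsv[..., 1] = 255
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
predictor = SamPredictor(sam)

# Two consecutive frames (placeholder filenames), converted to RGB.
frame_t = cv2.cvtColor(cv2.imread("frame_000.png"), cv2.COLOR_BGR2RGB)
frame_t1 = cv2.cvtColor(cv2.imread("frame_001.png"), cv2.COLOR_BGR2RGB)
gray_t = cv2.cvtColor(frame_t, cv2.COLOR_RGB2GRAY)
gray_t1 = cv2.cvtColor(frame_t1, cv2.COLOR_RGB2GRAY)
flow = cv2.calcOpticalFlowFarneback(gray_t, gray_t1, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# A crude motion prompt: the pixel with the largest flow magnitude.
mag = np.linalg.norm(flow, axis=-1)
y, x = np.unravel_index(np.argmax(mag), mag.shape)
point = np.array([[x, y]])
label = np.array([1])  # foreground point

# Variant 1 (flow as input): segment the flow visualisation directly.
predictor.set_image(flow_to_rgb(flow))
masks_flow, scores_flow, _ = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True)

# Variant 2 (flow as prompt): segment the RGB frame, prompted by the
# same motion-derived point.
predictor.set_image(frame_t)
masks_rgb, scores_rgb, _ = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True)
best_mask = masks_rgb[np.argmax(scores_rgb)]  # highest-scoring candidate
```

Note that the paper's flow-prompted variant derives richer prompts from the flow field than a single argmax point; the point prompt above is simply the smallest stand-in that shows where flow enters each configuration.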

Despite their simplicity, the proposed models outperform previous methods on multiple video object segmentation benchmarks, demonstrating the effectiveness of integrating SAM with optical flow for motion segmentation. The findings highlight SAM's potential to advance the field and open up new directions for future research and applications.