• Author(s): Xi Chen, Yutong Feng, Mengting Chen, Yiyang Wang, Shilong Zhang, Yu Liu, Yujun Shen, Hengshuang Zhao

The paper “Zero-shot Image Editing with Reference Imitation” introduces an approach to image editing that simplifies the process for users by letting them draw inspiration from reference images found online. The method, termed imitative editing, addresses the difficulty of precisely describing the desired outcome of an edit: instead of requiring users to manually crop or align the reference with the source image, the system automatically identifies the relevant content in the reference and uses it to perform the edit.

The core of this approach is a generative training framework called MimicBrush. MimicBrush operates by randomly selecting two frames from a video clip, masking certain regions of one frame, and then learning to recover the masked regions using information from the other frame. This self-supervised training process enables the model to capture the semantic correspondence between different images effectively. The model is built on a diffusion prior, which helps it transfer visual concepts from the reference image into the source image seamlessly.
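To make the data construction concrete, the sketch below shows one way such a training pair could be assembled: two frames are drawn from the same clip, one is patch-masked to serve as the source, and the other is kept intact as the reference. This is a minimal illustration assuming frames are available as PyTorch tensors; the function name, grid size, and masking ratio are hypothetical choices, not the authors' implementation.

```python
import random
import torch

def sample_training_pair(frames, mask_ratio=0.5, grid=16):
    """Illustrative sketch of MimicBrush-style data construction:
    pick two frames from one video clip, mask random patches of the
    source frame, and keep the other frame intact as the reference.

    `frames` is assumed to be a list of (C, H, W) tensors from a single
    clip, with H and W divisible by `grid`.
    """
    # Randomly choose two distinct frames from the clip.
    src_idx, ref_idx = random.sample(range(len(frames)), 2)
    source, reference = frames[src_idx].clone(), frames[ref_idx]

    # Build a coarse patch-level mask over the source frame.
    _, h, w = source.shape
    mask = (torch.rand(h // grid, w // grid) < mask_ratio).float()
    mask = mask.repeat_interleave(grid, 0).repeat_interleave(grid, 1)

    # Zero out the masked regions; during training the model must
    # recover them using semantic cues from the reference frame.
    masked_source = source * (1.0 - mask)

    return masked_source, mask, reference, source  # `source` is the target
```

Because consecutive frames of a video naturally vary in pose, lighting, and viewpoint while depicting the same content, recovering the masked regions forces the model to locate and imitate the corresponding regions of the reference rather than simply copy pixels.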

MimicBrush employs dual diffusion U-Nets to handle the source and reference images. The masked source image is processed by an imitative U-Net, while the reference image is processed by a reference U-Net. The attention keys and values from the reference U-Net are then injected into the imitative U-Net, assisting it in completing the masked regions; a sketch of this injection follows below. This design allows the model to handle variations in pose, lighting, and even object category between the source and reference images, so the generated region preserves the reference's details while blending harmoniously with the surrounding background.
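The sketch below illustrates one plausible form of this key/value injection: an attention layer in the imitative U-Net concatenates the reference U-Net's keys and values with its own, so the masked tokens can attend directly to reference features. The class name, shapes, and layer structure are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F
from torch import nn

class ReferenceInjectedAttention(nn.Module):
    """Illustrative attention layer that mixes in keys/values produced
    by the matching layer of a reference U-Net, letting masked regions
    borrow appearance from the reference image."""

    def __init__(self, dim, heads=8):
        super().__init__()
        self.heads = heads
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x, ref_k=None, ref_v=None):
        # x: (batch, tokens, dim) features inside the imitative U-Net;
        # ref_k / ref_v: projected features from the reference U-Net.
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)

        # Inject the reference keys/values by concatenating along the
        # token axis, so attention can "copy" content from the reference.
        if ref_k is not None and ref_v is not None:
            k = torch.cat([k, ref_k], dim=1)
            v = torch.cat([v, ref_v], dim=1)

        def split(t):  # (B, T, D) -> (B, heads, T, D // heads)
            b, t_len, d = t.shape
            return t.view(b, t_len, self.heads, d // self.heads).transpose(1, 2)

        out = F.scaled_dot_product_attention(split(q), split(k), split(v))
        out = out.transpose(1, 2).reshape(x.shape)
        return self.to_out(out)
```

Concatenating the reference tokens (rather than replacing the source tokens) keeps the imitative U-Net's own context intact, which is one way the completed region can stay consistent with the unmasked background while importing details from the reference.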

To evaluate the effectiveness of MimicBrush, the authors constructed a high-quality benchmark for imitative editing, which includes tasks such as part composition and texture transfer. The benchmark covers various practical applications, including fashion and product design, and provides a systematic way to assess the performance of the proposed method.

In summary, “Zero-shot Image Editing with Reference Imitation” presents a significant advancement in image editing technology. By leveraging the MimicBrush framework, the method allows for more intuitive and effective image editing, enabling users to achieve their creative goals with greater ease and precision. This research opens up new possibilities for practical applications in various fields that require sophisticated image editing capabilities.