• Author(s): Yu-Ju Tsai, Jin-Cheng Jhang, Wei Wang, Albert Y. C. Chen, Min Sun, Cheng-Hao Kuo, Ming-Hsuan Yang

Estimating room layout from 360° images is difficult because the layout annotations themselves are inherently ambiguous. To address this, the authors propose Bi-Layout, a model that predicts two distinct layout types for each image instead of a single one.

The first layout type stops at ambiguous regions, giving a conservative estimate; the second extends to encompass all visible areas. Bi-Layout produces both with two global context embeddings, one per layout type, each capturing the contextual information relevant to that type.

A key component of Bi-Layout is its feature guidance module, which lets the image feature retrieve relevant context from the global context embeddings and thereby produces layout-aware features. As a result, Bi-Layout can make precise predictions while still handling ambiguous regions.
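
The retrieval step described above resembles cross-attention, so a toy version may help make it concrete. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `guide_features`, the residual addition, the unprojected dot-product attention, and the tiny 2-dimensional vectors are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def guide_features(image_feats, context_emb):
    """Let each image feature (query) retrieve context from the tokens of a
    global context embedding (keys and values), then add the retrieved
    context back to the feature. Hypothetical helper, assumed design."""
    dim = len(image_feats[0])
    scale = 1.0 / math.sqrt(dim)
    guided = []
    for query in image_feats:
        # Attention weights of this image feature over the context tokens.
        weights = softmax([dot(query, token) * scale for token in context_emb])
        # Weighted sum of context tokens: the "retrieved" context.
        retrieved = [
            sum(w * token[d] for w, token in zip(weights, context_emb))
            for d in range(dim)
        ]
        # Residual connection (an assumption) yields the layout-aware feature.
        guided.append([q + r for q, r in zip(query, retrieved)])
    return guided


# Shared image features, one row per image column (toy values).
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Two context embeddings, one per layout type (toy values, assumed names).
conservative_ctx = [[1.0, 0.0], [0.0, 1.0]]
extended_ctx = [[2.0, 1.0], [1.0, 2.0]]

feats_conservative = guide_features(image_feats, conservative_ctx)
feats_extended = guide_features(image_feats, extended_ctx)
```

The same image features yield different layout-aware features depending on which embedding they attend to, which is the property that lets a single backbone serve both layout types in this sketch.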

Because it predicts two layouts, Bi-Layout can also detect ambiguous regions directly: they are the places where the two predictions differ. For evaluation, the authors introduce a new metric that disambiguates ground truth layouts, eliminating the need to manually correct annotations during testing.
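
Comparing the two predictions can be illustrated with a small example. This is a sketch under stated assumptions rather than the paper's procedure: each layout is represented here as a list of per-column wall distances in meters, and the names `ambiguous_columns` and `group_runs` as well as the tolerance `tol` are hypothetical.

```python
def ambiguous_columns(conservative, extended, tol=0.1):
    """Return the indices of image columns where the two predicted layouts
    disagree by more than `tol` (assumed unit: meters)."""
    return [
        i
        for i, (c, e) in enumerate(zip(conservative, extended))
        if abs(e - c) > tol
    ]


def group_runs(indices):
    """Merge consecutive column indices into inclusive (start, end) runs.
    Wrap-around at the panorama seam is not handled in this sketch."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs


# Toy per-column wall distances for the two layout types.
conservative = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
extended = [2.0, 2.0, 3.5, 3.6, 2.0, 2.05]

columns = ambiguous_columns(conservative, extended)
regions = group_runs(columns)
print(regions)  # prints [(2, 3)]
```

Columns 2 and 3 are flagged because the extended layout reaches well past the conservative one there, while the small difference in the last column stays under the tolerance.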

On benchmark datasets, Bi-Layout outperforms leading approaches. On the Matterport Layout dataset, it raises 3DIoU from 81.70% to 82.57% on the full test set, a gain of 0.87 percentage points. The gain is larger on subsets with significant ambiguity, where 3DIoU rises from 54.80% to 59.97%, or 5.17 percentage points.

The paper shows that Bi-Layout can overcome the challenges posed by ambiguous annotations, and it marks a step toward accurate and robust layout estimation in 360° environments.