Matterport can transform any room in your home into a renovated space without moving any furniture.
Imagine completely redecorating your living room without moving any furniture – that's what Matterport is all about.
Matterport is applying artificial intelligence to explore how advances in 3D semantic understanding and inpainting can bring a range of exciting new applications to digital twins.
Matterport initially focused on creating realistic yet static reconstructions of real-world spaces, laying a solid foundation for virtual tours and various consumer applications. However, static reconstructions alone are not enough to truly transform these spaces, assess their potential uses, or manage their day-to-day maintenance and operations. The company has therefore been developing advanced property intelligence tools that use semantic understanding to provide deeper insights and valuable information about properties.
Now, with the latest breakthroughs in generative AI, the company is expanding its focus to creating new content and experiences within Matterport spaces, enriching how users interact with and perceive these digital environments.
Combining Matterport's decade of experience in machine learning and artificial intelligence with the power of new generative AI tools, Project Genesis aims to bring new design and furnishing ideas to life at the click of a button, starting with the ability to instantly defurnish any space.
What is defurnishing?
Defurnishing is a key technology in digital image processing and 3D modeling. It involves removing furniture and movable objects from spatial images to make the space appear empty.
This approach is crucial for applications that need to visualize unfurnished spaces, including interior design, real estate, and virtual staging, because it clearly shows the potential of the space.
Defurnishing is a feature being developed for all Matterport digital twins, and it involves three steps:
1. Reconstruction: First, the space is captured and reconstructed to create a digital twin.
2. Understanding: Next, semantic understanding is performed on the reconstructed space, specifically identifying the pixels (in the imagery) and mesh faces (in the dollhouse view) that belong to the furniture items to be removed.
3. Compositing: Because the areas hidden by furniture were never captured, removing the furniture leaves blank pixels in the imagery and holes in the mesh. The blank regions in the imagery need to be inpainted, while the holes in the mesh need to be filled and textured.
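To make the understanding and compositing steps concrete, here is a minimal sketch in Python. It assumes a per-pixel label map is already available. The function name `remove_furniture`, the 2x2 toy image, and the label names are illustrative assumptions, not part of Matterport's pipeline.

```python
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int, int]  # RGB triple


def remove_furniture(
    image: List[List[Pixel]],
    labels: List[List[str]],
    removable: Set[str],
) -> Tuple[List[List[Optional[Pixel]]], List[List[bool]]]:
    """Blank out every pixel whose label is in `removable`.

    Returns the image with holes (None) and a boolean mask marking the
    holes that a later inpainting step would have to fill.
    """
    mask = [[label in removable for label in row] for row in labels]
    holed = [
        [None if is_hole else pixel for pixel, is_hole in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
    return holed, mask


# Toy 2x2 example (hypothetical data): the right column is a sofa.
image = [[(200, 200, 200), (120, 80, 40)],
         [(90, 60, 30), (120, 80, 40)]]
labels = [["wall", "sofa"],
          ["floor", "sofa"]]
holed, mask = remove_furniture(image, labels, {"sofa"})
```

The `None` entries in `holed` are the "blank pixels" described in step 3. Nothing was ever captured behind the sofa, so that content has to be synthesized.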
3DMart's article announcing Matterport's Winter 2024 Release gives a preview of the company's defurnishing capabilities. This installment of the blog series focuses on semantic segmentation, a crucial first step in automatic defurnishing.
The following is the Chinese version of the video released by Matterport this winter:
Understanding Semantic Segmentation
Semantic segmentation is an important computer vision task that involves dividing an image into different regions and assigning a specific category to each region. The goal is to label each pixel with a category (such as "floor," "wall," "window," or "table"), thereby facilitating a comprehensive understanding of the scene by accurately locating objects and defining their boundaries.
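As a rough illustration of per-pixel labeling, the sketch below picks the highest-scoring class for each pixel. The class list and the score values are made up for the example. In practice the scores would come from a trained network.

```python
from typing import List, Sequence

# Hypothetical class list, taken from the examples in the text.
CLASSES = ["floor", "wall", "window", "table"]


def label_pixels(scores: Sequence[Sequence[Sequence[float]]]) -> List[List[str]]:
    """Assign each pixel the class with the highest score.

    `scores[y][x]` holds one score per entry in CLASSES.
    """
    return [
        [CLASSES[max(range(len(CLASSES)), key=pixel.__getitem__)] for pixel in row]
        for row in scores
    ]


# A 2x2 "image" of made-up class scores.
scores = [
    [[0.1, 0.7, 0.1, 0.1], [0.1, 0.2, 0.6, 0.1]],
    [[0.8, 0.1, 0.0, 0.1], [0.3, 0.1, 0.0, 0.6]],
]
label_map = label_pixels(scores)
```

The result is a label map with the same height and width as the input, which is what distinguishes segmentation from a single image-level label.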
Object detection localizes objects with bounding boxes, and image classification applies a single label to the entire image. Semantic segmentation differs from both: it works at the pixel level, enabling fine-grained scene analysis and a deeper interpretation of the scene. It is a fundamental technology in computer vision, with applications in autonomous vehicles, medical imaging, and robotics.
Recently, it has become a key element of virtual interior design. The primary data available from the initial capture of a space outlines its overall structure and aesthetics. Semantic segmentation enriches the understanding of the content of a Matterport space, enabling precise operations such as moving, editing, indexing, or deleting elements.
To effectively alter any aspect of the Matterport space, detailed semantic segmentation is necessary to distinguish the key components of the space.
The role of segmentation in defurnishing
To remove furniture from the imagery and 3D structure of a digital twin, the individual pixels/mesh faces belonging to the furniture item must first be identified. Removing these pixels/faces often results in missing information. This is because the area behind/under the furniture is not visible when capturing a digital twin.
Therefore, after the furniture is removed, plausible image and 3D content needs to be generated to fill these gaps. This process is called "inpainting".
Inpainting is an advanced technique used in image editing and restoration to fill in missing or damaged portions of an image so that it looks complete and natural. Its main purpose is to reconstruct these areas seamlessly, so that they blend with the surrounding image and preserve its structural integrity and visual continuity.
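The idea of filling a hole from its surroundings can be shown with a deliberately simple stand-in: repeatedly averaging known neighboring pixels. This is a classical diffusion-style fill, not the generative method the article refers to. The function name `fill_holes` and the grayscale toy image are assumptions for illustration.

```python
from typing import List, Optional


def fill_holes(
    gray: List[List[Optional[float]]], max_iters: int = 100
) -> List[List[Optional[float]]]:
    """Fill None pixels by averaging their known 4-neighbours.

    Holes are filled from the boundary inwards, one ring per iteration.
    """
    h, w = len(gray), len(gray[0])
    img = [row[:] for row in gray]  # work on a copy
    for _ in range(max_iters):
        holes = [(y, x) for y in range(h) for x in range(w) if img[y][x] is None]
        if not holes:
            break
        updates = {}
        for y, x in holes:
            known = [
                img[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w and img[ny][nx] is not None
            ]
            if known:
                updates[(y, x)] = sum(known) / len(known)
        if not updates:
            break  # no hole touches a known pixel, so nothing more can be filled
        for (y, x), value in updates.items():
            img[y][x] = value
    return img


# A uniform grey patch with one missing pixel in the middle.
gray = [[10.0, 10.0, 10.0],
        [10.0, None, 10.0],
        [10.0, 10.0, 10.0]]
filled = fill_holes(gray)
```

A fill like this only smears nearby colors into the hole. It cannot continue a floor pattern or a wall edge plausibly, which is why learned inpainting models are used for large furniture-sized regions.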
Many inpainting methods rely on a precise segmentation mask that specifies the area to remove and then inpaint. Any inaccuracies or artifacts in the furniture segmentation mask can significantly affect the inpainting result; for example:
• Removing parts of the building's structure instead of furniture can lead to severe structural hallucinations (for example, the model might create a doorway to a non-existent room instead of inpainting some floor and wall content).
• Incomplete furniture segmentation, i.e., failing to mask parts of an object, can lead to hallucinated objects being inpainted instead of the desired "empty space" (which, depending on the perspective, usually means walls and floors).
• False negatives occur when actual furniture is not segmented at all, leaving residual parts of the furniture in the final result.
Therefore, ensuring accurate semantic segmentation is crucial for achieving high-quality de-furnishing results.
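The failure modes above map onto familiar mask metrics. The sketch below compares a predicted furniture mask against a reference mask and counts false positives (non-furniture pixels wrongly marked for removal), false negatives (furniture pixels left in), and intersection over union. The function name and the toy masks are assumptions for illustration.

```python
from typing import Dict, List


def mask_errors(pred: List[List[bool]], truth: List[List[bool]]) -> Dict[str, float]:
    """Compare a predicted furniture mask with a reference mask.

    True means "this pixel is furniture and should be removed".
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1  # structure wrongly marked for removal
            elif t and not p:
                fn += 1  # furniture that would be left behind
    union = tp + fp + fn
    return {
        "false_positives": fp,
        "false_negatives": fn,
        "iou": tp / union if union else 1.0,
    }


# Toy masks: one wall pixel is wrongly removed, one sofa pixel is missed.
truth = [[False, True, True],
         [False, True, True]]
pred = [[True, True, True],
        [False, True, False]]
result = mask_errors(pred, truth)
```

Tracking the two error types separately is useful because they fail differently downstream: false positives invite structural hallucinations, while false negatives leave furniture fragments in the result.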
Matterport's semantic segmentation method
1. Data
Matterport performs semantic segmentation on 360-degree panoramic imagery in equirectangular projection in order to capture the widest possible visual context within a single frame. Context plays a crucial role in computer vision tasks, especially with modern neural network architectures such as Vision Transformers.
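For readers unfamiliar with equirectangular panoramas, the sketch below maps a pixel position to a viewing direction on the unit sphere. The axis convention (y up, z forward, longitude 0 at the image center) is an assumption chosen for the example; other conventions are equally common.

```python
import math
from typing import Tuple


def pixel_to_direction(
    u: float, v: float, width: int, height: int
) -> Tuple[float, float, float]:
    """Map an equirectangular pixel (u, v) to a unit direction vector.

    Assumed convention: columns span longitude -pi..pi, rows span
    latitude +pi/2 (top) to -pi/2 (bottom); y is up and z is forward.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


# The exact image center of a 4x2 panorama looks straight ahead.
center = pixel_to_direction(1.5, 0.5, 4, 2)
```

Every column of the image corresponds to a longitude and every row to a latitude, so a single frame covers the full sphere of directions around the camera.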
2. Custom ontology
Initially, the company used a subset of the ADE20K ontology, which includes 150 categories commonly found in the built environment. However, this approach did not fully meet its specific needs.
The Matterport approach aims to eliminate all removable furniture while retaining built-in furniture. Public data sets typically categorize these different types of furniture into general categories (e.g., simply classifying freestanding wardrobes and built-in wardrobes as "wardrobes").
Therefore, to meet these needs, several task-specific factors had to be considered, and a custom data set with finer-grained furniture annotations had to be compiled.
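One way to picture the gap between a public ontology and the defurnishing task is an extra "built-in" attribute on top of the generic category. The sketch below is a hypothetical illustration. The category sets, the `Annotation` type, and the `should_remove` rule are assumptions, not Matterport's actual ontology.

```python
from typing import NamedTuple, Set


class Annotation(NamedTuple):
    category: str   # generic class, as a public ontology would provide
    built_in: bool  # extra attribute a custom annotation pass would add


# Hypothetical category sets for the example.
STRUCTURE: Set[str] = {"wall", "floor", "ceiling", "window", "door"}
FURNITURE: Set[str] = {"wardrobe", "sofa", "table", "chair", "cabinet"}


def should_remove(ann: Annotation) -> bool:
    """Remove freestanding furniture; keep structure and built-in furniture."""
    if ann.category in STRUCTURE:
        return False
    return ann.category in FURNITURE and not ann.built_in


freestanding = Annotation("wardrobe", built_in=False)
fitted = Annotation("wardrobe", built_in=True)
```

Both annotations carry the same generic label, "wardrobe", yet only the freestanding one should be removed. That distinction cannot be recovered from the category alone, which is why custom annotations are needed.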
3. Network Architecture
Matterport decided to build on the Vision Transformer (ViT) architecture, which it has already used successfully in various AI applications across its projects, and chose the ViT-Adapter as the basis for its segmentation experiments. This model adapts the Vision Transformer, which was originally designed to produce a single feature vector from an input image, so that it can handle image-to-image tasks that require feature maps rather than a single vector.
Although the ViT-Adapter was not specifically trained on 360-degree equirectangular images, it has shown impressive performance on this data type. It was also not originally designed with the ontology differences described above in mind.
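The difference between a single feature vector and a feature map can be sketched as follows. A Vision Transformer splits the image into patches and produces one token per patch. Arranging those tokens back into the patch grid yields a coarse spatial feature map. This sketch shows only that rearrangement. It is not the ViT-Adapter itself, which adds further modules, and the function name and sizes are assumptions.

```python
from typing import List, Sequence

Token = Sequence[float]  # one feature vector per image patch


def tokens_to_feature_map(
    tokens: Sequence[Token], image_h: int, image_w: int, patch: int
) -> List[List[Token]]:
    """Arrange a flat, row-major sequence of patch tokens into a 2D grid."""
    rows, cols = image_h // patch, image_w // patch
    if len(tokens) != rows * cols:
        raise ValueError(f"expected {rows * cols} tokens, got {len(tokens)}")
    return [list(tokens[r * cols:(r + 1) * cols]) for r in range(rows)]


# A 32x64 image with 16-pixel patches gives a 2x4 grid of tokens.
tokens = [[float(i)] for i in range(8)]
feature_map = tokens_to_feature_map(tokens, image_h=32, image_w=64, patch=16)
```

A dense prediction head can then upsample such a grid to per-pixel class scores, whereas a classification head would collapse all tokens into one vector.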
4. Deployment
Recently, Matterport has elevated semantic segmentation to a central position in its pipeline, alongside depth estimation, so semantic segmentation is now performed for every image captured. As a result, model inference runs in the cloud, making it more resilient to sudden traffic fluctuations, simplifying maintenance, and enabling smoother updates.
5. 3D Semantic Understanding
Matterport has a unique advantage in 3D spatial semantic understanding. By incorporating 3D context into semantic segmentation, it can gain a deeper understanding of the spatial and semantic relationships within any captured space. The company uses the 3D dollhouse view to combine predictions from multiple viewpoints, which significantly improves prediction accuracy and enables more accurate and meaningful modifications.
A typical example is defurnishing a scene, which requires a detailed and accurate understanding of both the 2D and 3D features of the environment.
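One simple way to combine per-view predictions on a mesh is a weighted vote per face. The sketch below assumes each observation is a (face, label, weight) triple, where the weight is a hypothetical heuristic such as how directly the face is seen from that viewpoint. The article does not specify the actual weighting scheme.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (face_id, predicted_label, weight); the weight is a hypothetical
# per-view confidence, e.g. based on viewing angle or distance.
Observation = Tuple[int, str, float]


def aggregate_face_labels(observations: Iterable[Observation]) -> Dict[int, str]:
    """Pick, for each mesh face, the label with the largest total weight."""
    votes: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for face_id, label, weight in observations:
        votes[face_id][label] += weight
    return {
        face_id: max(label_votes, key=label_votes.get)
        for face_id, label_votes in votes.items()
    }


observations = [
    (0, "sofa", 0.9),   # face 0 seen head-on from one viewpoint
    (0, "floor", 0.4),  # and at grazing angles from two others
    (0, "floor", 0.3),
    (1, "wall", 0.5),
    (1, "sofa", 0.2),
]
face_labels = aggregate_face_labels(observations)
```

Here face 0 ends up as "sofa" even though two of its three views said "floor", because the single confident view outweighs them. A plain majority vote would have picked "floor".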
Technical challenges and limitations of Defurnishing
Even the most advanced semantic segmentation models are not perfect and struggle to generalize to new, unseen data. This reality requires Matterport to develop strategies for correcting errors or working around them.
While supervised semantic segmentation methods typically yield the best results, defining and managing ontologies presents significant challenges. Ontologies tend to shift with each application, and major adjustments require data to be re-annotated. Therefore, the more self-supervised the model training, the less time, effort, and resources are needed to adapt the segmentation model to a new ontology. Designing the ontologies themselves is also difficult. For example, Matterport aims to remove "freestanding" furniture while retaining "built-in" components.
Determining when a piece of furniture qualifies as "built-in" is a complex task that typically requires a comprehensive set of rules to keep decisions consistent and repeatable. Without clear guidelines, data annotation is likely to produce low-quality results, which in turn hurts the performance of segmentation models.
Looking to the future
Self-supervised learning
Matterport has been exploring self-supervised learning for some time, and with the successful rollout of various image-based models, now is the perfect time to deepen its investment in this area.
Self-supervised learning has significant advantages, such as minimizing the need for annotated data, accelerating the training process, and improving performance on specific tasks.
Integrating 3D Context
Exploring the integration of 3D context into workflows offers a promising avenue for advancement. Currently, Matterport's data aggregation method is passive, relying on a heuristic approach to weight features projected from multiple views. By researching methods to integrate 3D context during the training phase, it is possible to develop viewpoint-independent features, thereby enhancing the model's understanding capabilities.
Furthermore, the company is exploring end-to-end 3D approaches to see whether performing semantic understanding directly on 3D representations can improve results. This includes re-evaluating reconstruction methods. Adopting cutting-edge techniques such as Neural Radiance Fields (NeRFs) or other innovative strategies could fundamentally change current practice and significantly improve the model's understanding and performance.
Multi-task models
Multi-task models, which perform several tasks at once, have attracted a lot of attention. However, such a model has to be maintained as one cohesive system, which makes the alternative strategy of sharing a backbone across multiple models more attractive.
As the company moves forward, finding the right balance between the advantages and the complexity of a multi-task model will be key to improving workflows and results.
Open-vocabulary models
Another exciting area of development is open-vocabulary models. Traditional models are limited by fixed ontologies and can fall short of the breadth of customer needs.
Open-vocabulary models break free from these limitations: they can recognize a wider range of objects and concepts and are not restricted to predefined categories.
This adaptability is invaluable to Matterport, enabling broader semantic understanding across different spaces and applications. Adopting an open-vocabulary approach promises to significantly enhance the company's ability to meet diverse customer needs and to improve the interoperability of its assets with other tools.
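A common way to realize an open vocabulary is to compare an image-region embedding with text embeddings of arbitrary label names and pick the closest match. The sketch below shows only that matching step, using hand-made three-dimensional vectors as stand-ins. In a real system both embeddings would come from a jointly trained vision-language model, and nothing here reflects Matterport's implementation.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_open_vocab(
    region_embedding: Sequence[float],
    text_embeddings: Dict[str, Sequence[float]],
) -> str:
    """Return the label whose text embedding is most similar to the region."""
    return max(
        text_embeddings,
        key=lambda name: cosine(region_embedding, text_embeddings[name]),
    )


# Toy embeddings (made up): the vocabulary is just a dictionary, so new
# labels can be added at query time without retraining anything.
vocabulary = {
    "sofa": [1.0, 0.0, 0.0],
    "bookshelf": [0.0, 1.0, 0.0],
    "rug": [0.0, 0.0, 1.0],
}
region = [0.9, 0.1, 0.0]
best = classify_open_vocab(region, vocabulary)
```

Because the label set is supplied as data rather than baked into the model's output layer, extending it to a new customer's categories is a matter of adding entries to the dictionary.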
Conclusion
Expanding the semantic understanding of spaces will unlock a range of applications across multiple industries. Recognizing that a single ontology cannot meet the needs of all customers, Matterport sees the value in open-vocabulary techniques and other methods that are not constrained by a strict ontology.
Another goal is to improve the compatibility of its assets with various tools. To this end, the company is developing multiple integrations to ensure that the final rendering of empty spaces is accurate and visually consistent.
Matterport PRO3 is a professional 3D panoramic/spatial scanner with a high-quality 134-megapixel resolution. Paired with Matterport Capture indoor environment 3D scanning software, it can quickly 3D scan various spaces of different sizes with just one click, instantly generating high-precision 2D floor plans and 3D virtual spaces!