Segment 3D point clouds using foundation models for 2D vision, by Dong et al.
Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation
arXiv paper abstract: https://arxiv.org/abs/2311.01989
arXiv paper PDF: https://arxiv.org/pdf/2311.01989.pdf
… Segment-Anything Model (SAM) and Contrastive Language-Image Pre-training (CLIP) … foundation vision models … capture knowledge from … broad data … enabling … zero-shot segmentation … their potential for … 3D … understanding … relatively unexplored.
… present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
… making initial predictions of 2D semantic masks using different large vision models.
… then project these mask predictions from various frames of RGB-D video sequences into 3D space.
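The projection step above is a standard pinhole back-projection: each pixel with a valid depth is lifted into camera coordinates via the intrinsics, then transformed into world space via the extrinsics, carrying its predicted 2D label along. The paper does not publish this code; the sketch below is an illustrative NumPy version, where the function name, argument layout, and the convention that depth 0 marks invalid pixels are my assumptions.

```python
import numpy as np

def backproject_masks(depth, mask_labels, K, cam_to_world):
    """Lift per-pixel 2D semantic labels into 3D world coordinates.

    depth:        (H, W) depth map in meters (0 = invalid pixel)
    mask_labels:  (H, W) integer labels predicted by a 2D model (e.g. SAM+CLIP)
    K:            (3, 3) camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsics for this frame
    Returns (N, 3) world-space points and (N,) labels for valid pixels.
    """
    v, u = np.nonzero(depth > 0)          # valid-depth pixel coordinates
    z = depth[v, u]
    # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous
    pts_world = (cam_to_world @ pts_cam.T).T[:, :3]
    return pts_world, mask_labels[v, u]
```

Repeating this over every frame of the RGB-D sequence yields many label "votes" per 3D location, which the fusion step below aggregates.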
To generate robust 3D semantic pseudo labels, … introduce a semantic label fusion strategy that effectively combines all the results via voting.
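The voting fusion can be sketched as a per-point majority vote over all projected predictions. This is a minimal illustration, not the paper's implementation: the function name, the flat `point_ids`/`labels` input layout, and the `ignore_label` sentinel for unobserved points are assumptions.

```python
import numpy as np
from collections import Counter

def fuse_labels_by_voting(point_ids, labels, num_points, ignore_label=-1):
    """Fuse multi-view semantic predictions into per-point pseudo labels.

    point_ids: (M,) index of the 3D point each projected prediction lands on
    labels:    (M,) semantic label predicted for that hit by some 2D model
    Returns (num_points,) fused labels; unobserved points get ignore_label.
    """
    votes = [Counter() for _ in range(num_points)]
    for pid, lab in zip(point_ids, labels):
        votes[pid][lab] += 1
    fused = np.full(num_points, ignore_label, dtype=int)
    for pid, counts in enumerate(votes):
        if counts:  # majority vote among all views that saw this point
            fused[pid] = counts.most_common(1)[0][0]
    return fused
```

Because each 2D model can be wrong in different frames, this simple majority scheme suppresses inconsistent single-view predictions and yields the more robust 3D pseudo labels the abstract describes.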
… demonstrate the effectiveness of adopting general 2D foundation models for solving 3D point cloud segmentation tasks.
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b