Segment scenes with unknown objects by enhancing the localization capabilities of CLIP with NACLIP

--


Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation
arXiv paper abstract https://arxiv.org/abs/2404.08181
arXiv PDF paper https://arxiv.org/pdf/2404.08181.pdf
GitHub https://github.com/sinahmr/NACLIP

… vision-language … models, such as CLIP, have … effectiveness in … zero-shot image-level tasks … work has investigated … these models in open-vocabulary semantic segmentation (OVSS).

However, existing approaches often rely on impractical supervised pre-training or access to additional pre-trained networks.

… propose a strong baseline for training-free OVSS, termed Neighbour-Aware CLIP (NACLIP), representing a straightforward adaptation of CLIP tailored for this scenario.

… enforces localization of patches in the self-attention of CLIP’s vision transformer, which, despite being crucial for dense prediction tasks, has been overlooked in the OVSS literature.
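The core idea of biasing CLIP's self-attention toward spatial neighbours can be sketched as below. This is an illustrative reconstruction, not the authors' code: the key-key similarity and the additive Gaussian neighbourhood prior follow the paper's description, but the function names, the `std` value, and the grid size are assumptions for the example.

```python
import numpy as np

def gaussian_neighbourhood_bias(h, w, std=5.0):
    """Additive attention bias favouring spatially nearby patches.

    For patches i, j on an h x w grid, the bias is -dist(i, j)^2 / (2 * std^2),
    i.e. the log of an (unnormalised) 2D Gaussian centred on patch i.
    """
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (h*w, 2)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)      # (h*w, h*w)
    return -d2 / (2.0 * std ** 2)

def neighbour_aware_attention(k, h, w, std=5.0):
    """Key-key self-attention with a Gaussian neighbourhood prior.

    k: (h*w, d) patch key vectors. Uses k @ k.T similarity (instead of
    q @ k.T) plus the spatial bias, followed by a row-wise softmax.
    """
    logits = k @ k.T / np.sqrt(k.shape[1]) + gaussian_neighbourhood_bias(h, w, std)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)
```

With identical keys everywhere, the similarity term is constant, so each patch's attention decays with distance; that is the "pay attention to your neighbours" effect the title refers to.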

By … choices favouring segmentation, … improves performance without … additional data, auxiliary pre-trained networks, or extensive hyperparameter tuning.

… Experiments are performed on 8 popular semantic segmentation benchmarks, yielding state-of-the-art performance in most scenarios …

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

Photo by Miguel Joya on Unsplash

--

AI News Clips by Morris Lee: News to help your R&D


A computer vision consultant in artificial intelligence and related high-tech fields for 37+ years. An innovator with 66+ patents, ready to help a firm's R&D.
